vllm/cacheflow
2023-05-09 16:03:44 -07:00
core                Enhance model loader (#83)                                             2023-05-09 15:46:42 -07:00
frontend            Use slow tokenizer for LLaMA (#84)                                     2023-05-09 16:03:44 -07:00
model_executor      Enhance model loader (#83)                                             2023-05-09 15:46:42 -07:00
worker              Refactor system architecture (#82)                                     2023-05-09 15:30:12 -07:00
block.py            Support beam search & parallel generation (#7)                         2023-03-10 09:58:21 -08:00
logger.py           Add a system logger (#85)                                              2023-05-08 23:03:35 -07:00
sampling_params.py  FastAPI-based working frontend (#10)                                   2023-03-29 14:48:56 +08:00
sequence.py         Collect system stats in scheduler & Add scripts for experiments (#30)  2023-04-12 15:03:49 -07:00
utils.py            Refactor system architecture (#82)                                     2023-05-09 15:30:12 -07:00