vllm/vllm/core

Latest commit: e64fde4b01 by leiwen83 — [Core][Bugfix]: fix prefix caching for blockv2 (#4764)
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
2024-05-24 10:07:09 -07:00
Name                               Last commit message                                                                    Last commit date
block/                             [Core][Bugfix]: fix prefix caching for blockv2 (#4764)                                2024-05-24 10:07:09 -07:00
__init__.py                        Change the name to vLLM (#150)                                                        2023-06-17 03:07:40 -07:00
block_manager_v1.py                [Core][Optimization] change python dict to pytorch tensor for blocks to swap (#4659)  2024-05-08 12:07:05 -07:00
block_manager_v2.py                [Core][Optimization] change python dict to pytorch tensor for blocks to swap (#4659)  2024-05-08 12:07:05 -07:00
embedding_model_block_manager.py   [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)                     2024-05-11 11:30:37 -07:00
evictor_v1.py                      [Core] Enable prefix caching with block manager v2 enabled (#4142)                     2024-05-01 11:20:32 -07:00
evictor_v2.py                      [mypy][6/N] Fix all the core subdirectory typing (#4450)                               2024-05-02 03:01:00 +00:00
interfaces.py                      [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)                     2024-05-11 11:30:37 -07:00
policy.py                          [Chunked Prefill][4/n] Chunked prefill scheduler. (#3853)                              2024-04-05 10:17:58 -07:00
scheduler.py                       [Core] Fix scheduler considering "no LoRA" as "LoRA" (#4897)                           2024-05-20 17:48:32 -07:00