vllm/vllm/core

Latest commit: e64fde4b01 by leiwen83 — [Core][Bugfix]: fix prefix caching for blockv2 (#4764)
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
2024-05-24 10:07:09 -07:00
Name                               Last commit message                                                                    Last commit date
block/                             [Core][Bugfix]: fix prefix caching for blockv2 (#4764)                                2024-05-24 10:07:09 -07:00
__init__.py                        Change the name to vLLM (#150)                                                        2023-06-17 03:07:40 -07:00
block_manager_v1.py                [Core][Optimization] change python dict to pytorch tensor for blocks to swap (#4659)  2024-05-08 12:07:05 -07:00
block_manager_v2.py                [Core][Optimization] change python dict to pytorch tensor for blocks to swap (#4659)  2024-05-08 12:07:05 -07:00
embedding_model_block_manager.py   [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)                     2024-05-11 11:30:37 -07:00
evictor_v1.py                      [Core] Enable prefix caching with block manager v2 enabled (#4142)                     2024-05-01 11:20:32 -07:00
evictor_v2.py                      [mypy][6/N] Fix all the core subdirectory typing (#4450)                               2024-05-02 03:01:00 +00:00
interfaces.py                      [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)                     2024-05-11 11:30:37 -07:00
policy.py                          [Chunked Prefill][4/n] Chunked prefill scheduler. (#3853)                              2024-04-05 10:17:58 -07:00
scheduler.py                       [Core] Fix scheduler considering "no LoRA" as "LoRA" (#4897)                           2024-05-20 17:48:32 -07:00