vllm/vllm/core
leiwen83 24750f4cad
[Core] Enable prefix caching with block manager v2 enabled (#4142)
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Sage Moore <sagemoore@utexas.edu>
2024-05-01 11:20:32 -07:00
Name                | Last commit | Date
block               | [Core] Enable prefix caching with block manager v2 enabled (#4142) | 2024-05-01 11:20:32 -07:00
__init__.py         | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00
block_manager_v1.py | [Core] Enable prefix caching with block manager v2 enabled (#4142) | 2024-05-01 11:20:32 -07:00
block_manager_v2.py | [Core] Enable prefix caching with block manager v2 enabled (#4142) | 2024-05-01 11:20:32 -07:00
evictor_v1.py       | [Core] Enable prefix caching with block manager v2 enabled (#4142) | 2024-05-01 11:20:32 -07:00
evictor_v2.py       | [Core] Enable prefix caching with block manager v2 enabled (#4142) | 2024-05-01 11:20:32 -07:00
interfaces.py       | [Typing] Fix Sequence type GenericAlias only available after Python 3.9. (#4092) | 2024-04-15 14:47:31 -07:00
policy.py           | [Chunked Prefill][4/n] Chunked prefill scheduler. (#3853) | 2024-04-05 10:17:58 -07:00
scheduler.py        | Add more Prometheus metrics (#2764) | 2024-04-28 15:59:33 -07:00