vllm/tests/core
leiwen83 24750f4cad
[Core] Enable prefix caching with block manager v2 enabled (#4142)
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Sage Moore <sagemoore@utexas.edu>
2024-05-01 11:20:32 -07:00
block [Core] Enable prefix caching with block manager v2 enabled (#4142) 2024-05-01 11:20:32 -07:00
__init__.py [Tests] Add block manager and scheduler tests (#3108) 2024-03-05 18:23:34 -08:00
test_block_manager.py [Speculative decoding 4/9] Lookahead scheduling for speculative decoding (#3250) 2024-04-01 22:55:24 +00:00
test_chunked_prefill_scheduler.py [Core][5/N] Fully working chunked prefill e2e (#3884) 2024-04-10 17:56:48 -07:00
test_scheduler.py [Core] Scheduling optimization 2 (#4280) 2024-04-23 08:02:11 +00:00
utils.py [Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894) 2024-04-16 13:09:21 -07:00