vllm/core at 0d62fe58dbb58cfe4132005ce7ff37319d66981d - vllm

leiwen83 24750f4cad [Core] Enable prefix caching with block manager v2 enabled (#4142) Co-authored-by: Lei Wen <wenlei03@qiyi.com> Co-authored-by: Sage Moore <sagemoore@utexas.edu>		2024-05-01 11:20:32 -07:00
block	[Core] Enable prefix caching with block manager v2 enabled (#4142)	2024-05-01 11:20:32 -07:00
__init__.py	[Tests] Add block manager and scheduler tests (#3108)	2024-03-05 18:23:34 -08:00
test_block_manager.py	[Speculative decoding 4/9] Lookahead scheduling for speculative decoding (#3250)	2024-04-01 22:55:24 +00:00
test_chunked_prefill_scheduler.py	[Core][5/N] Fully working chunked prefill e2e (#3884)	2024-04-10 17:56:48 -07:00
test_scheduler.py	[Core] Scheduling optimization 2 (#4280)	2024-04-23 08:02:11 +00:00
utils.py	[Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894)	2024-04-16 13:09:21 -07:00