vllm/core at 7dbe738d653b563c646883c1ae6f6df927436d01 - vllm

Kuntai Du 81ede99ca4 [Core] Deprecating block manager v1 and make block manager v2 default (#8704) Removing the block manager v1. This is the initial piece of prefix-caching-centric design. In order to achieve prefix-caching-centric design, we need to simplify the code path so that we only use v2 block manager (which has much higher performance on prefix caching).	2024-10-17 11:38:15 -05:00
block	[Core] Deprecating block manager v1 and make block manager v2 default (#8704)	2024-10-17 11:38:15 -05:00
__init__.py	[Tests] Add block manager and scheduler tests (#3108)	2024-03-05 18:23:34 -08:00
test_chunked_prefill_scheduler.py	[Core] Deprecating block manager v1 and make block manager v2 default (#8704)	2024-10-17 11:38:15 -05:00
test_num_computed_tokens_update.py	[Core] Deprecating block manager v1 and make block manager v2 default (#8704)	2024-10-17 11:38:15 -05:00
test_scheduler_encoder_decoder.py	[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)	2024-08-06 16:51:47 -04:00
test_scheduler.py	[Core] Deprecating block manager v1 and make block manager v2 default (#8704)	2024-10-17 11:38:15 -05:00
test_serialization.py	[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)	2024-08-18 17:57:20 -07:00
utils.py	[core] remove beam search from the core (#9105)	2024-10-07 05:47:04 +00:00