vllm/core at 2eedede87502be64f60962147513f3df6cb1bd01 - vllm

Megha Agarwal 2eedede875 [Core] Asynchronous Output Processor (#7049) Co-authored-by: Alexander Matveev <alexm@neuralmagic.com>		2024-08-26 20:53:20 -07:00
block	[Performance][BlockManagerV2] Mark prefix cache block as computed after schedule (#7822)	2024-08-26 11:24:53 -07:00
__init__.py	[Tests] Add block manager and scheduler tests (#3108)	2024-03-05 18:23:34 -08:00
test_block_manager.py	[Core] Avoid the need to pass `None` values to `Sequence.inputs` (#5099)	2024-05-29 16:05:01 -07:00
test_chunked_prefill_scheduler.py	[Core] Asynchronous Output Processor (#7049)	2024-08-26 20:53:20 -07:00
test_scheduler_encoder_decoder.py	[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)	2024-08-06 16:51:47 -04:00
test_scheduler.py	[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)	2024-08-06 16:51:47 -04:00
test_serialization.py	[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)	2024-08-18 17:57:20 -07:00
utils.py	[Core] Asynchronous Output Processor (#7049)	2024-08-26 20:53:20 -07:00