vllm/vllm/v1
Latest commit: 803f37eaaa — [6/N] torch.compile rollout to users (#10437)
Author: youkaichao (Signed-off-by: youkaichao <youkaichao@gmail.com>)
Date: 2024-11-19 10:09:03 -08:00
attention [V1] Make v1 more testable (#9888) 2024-11-06 11:57:35 -08:00
core [V1] Support VLMs with fine-grained scheduling (#9871) 2024-11-13 04:53:13 +00:00
engine [Misc] Fix import error in tensorizer tests and cleanup some code (#10349) 2024-11-15 09:34:17 +00:00
executor [V1] Fix Configs (#9971) 2024-11-04 00:24:40 +00:00
sample [V1] Support per-request seed (#9945) 2024-11-03 09:14:17 -08:00
worker [6/N] torch.compile rollout to users (#10437) 2024-11-19 10:09:03 -08:00
__init__.py [V1] AsyncLLM Implementation (#9826) 2024-11-11 23:05:38 +00:00
outputs.py [V1] Implement vLLM V1 [1/N] (#9289) 2024-10-22 01:24:07 -07:00
request.py [1/N] Initial prototype for multi-modal processor (#10044) 2024-11-13 12:39:03 +00:00
serial_utils.py [V1] Use pickle for serializing EngineCoreRequest & Add multimodal inputs to EngineCoreRequest (#10245) 2024-11-12 08:57:14 -08:00
utils.py [V1] Add all_token_ids attribute to Request (#10135) 2024-11-07 17:08:24 -08:00