vllm/engine at 8438e0569eaf8496aa3d41deb808f2c831b64ecf - vllm

youkaichao 8438e0569e [Core] RayWorkerVllm --> WorkerWrapper to reduce duplication (#4024)	2024-04-17 08:34:33 +00:00
output_processor	[Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894)	2024-04-16 13:09:21 -07:00
__init__.py	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00
arg_utils.py	[Core] Refactor model loading code (#4097)	2024-04-16 11:34:39 -07:00
async_llm_engine.py	[Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894)	2024-04-16 13:09:21 -07:00
llm_engine.py	[Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894)	2024-04-16 13:09:21 -07:00
metrics.py	[Bugfix] fix_log_time_in_metrics (#4050)	2024-04-13 07:52:36 -07:00
ray_utils.py	[Core] RayWorkerVllm --> WorkerWrapper to reduce duplication (#4024)	2024-04-17 08:34:33 +00:00