vllm/vllm/engine
youkaichao 8438e0569e
[Core] Replace the narrow-usage RayWorkerVllm with the general WorkerWrapper to reduce code duplication (#4024)
2024-04-17 08:34:33 +00:00
output_processor [Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894) 2024-04-16 13:09:21 -07:00
__init__.py Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
arg_utils.py [Core] Refactor model loading code (#4097) 2024-04-16 11:34:39 -07:00
async_llm_engine.py [Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894) 2024-04-16 13:09:21 -07:00
llm_engine.py [Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894) 2024-04-16 13:09:21 -07:00
metrics.py [Bugfix] fix_log_time_in_metrics (#4050) 2024-04-13 07:52:36 -07:00
ray_utils.py [Core] RayWorkerVllm --> WorkerWrapper to reduce duplication (#4024) 2024-04-17 08:34:33 +00:00