vllm/vllm/engine
Mahesh Keralapura 93478b63d2
[Core] Fix tracking of model forward time to the span traces in case of PP>1 (#7440)
2024-08-16 13:46:01 -07:00
output_processor [core][misc] simply output processing with shortcut code path (#7117) 2024-08-04 00:22:19 -07:00
__init__.py Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
arg_utils.py [Core] Fix tracking of model forward time in case of PP>1 (#7440) 2024-08-16 13:46:01 -07:00
async_llm_engine.py [Misc] Deprecation Warning when setting --engine-use-ray (#7424) 2024-08-14 09:44:27 -07:00
async_timeout.py [Bugfix] AsyncLLMEngine hangs with asyncio.run (#5654) 2024-06-19 13:57:12 -07:00
llm_engine.py [VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126) 2024-08-14 17:55:42 +00:00
metrics.py [Bugfix] StatLoggers: cache spec decode metrics when they get collected. (#6645) 2024-07-23 23:05:05 +00:00
protocol.py [BugFix] Overhaul async request cancellation (#7111) 2024-08-07 13:21:41 +08:00