vllm/vllm/engine
Latest commit: 26aa325f4f by Roger Wang — [Core][VLM] Test registration for OOT multimodal models (#8717), co-authored by DarkLight1337 <tlleungac@connect.ust.hk>, 2024-10-04 10:38:25 -07:00
Name                  Last commit                                                                          Date
multiprocessing       [Core] [Frontend] Priority scheduling for embeddings and in the OpenAI-API (#8965)   2024-10-01 09:58:06 +00:00
output_processor      [Spec Decode] (1/2) Remove batch expansion (#8839)                                   2024-10-01 16:04:42 -07:00
__init__.py           Change the name to vLLM (#150)                                                       2023-06-17 03:07:40 -07:00
arg_utils.py          [Core][VLM] Test registration for OOT multimodal models (#8717)                      2024-10-04 10:38:25 -07:00
async_llm_engine.py   [Core] [Frontend] Priority scheduling for embeddings and in the OpenAI-API (#8965)   2024-10-01 09:58:06 +00:00
async_timeout.py      [Bugfix] AsyncLLMEngine hangs with asyncio.run (#5654)                               2024-06-19 13:57:12 -07:00
llm_engine.py         [Core][VLM] Test registration for OOT multimodal models (#8717)                      2024-10-04 10:38:25 -07:00
metrics_types.py      [MISC] Add prefix cache hit rate to metrics (#7606)                                  2024-08-19 11:52:07 -07:00
metrics.py            [MISC] Add prefix cache hit rate to metrics (#7606)                                  2024-08-19 11:52:07 -07:00
protocol.py           [Core] [Frontend] Priority scheduling for embeddings and in the OpenAI-API (#8965)   2024-10-01 09:58:06 +00:00