vllm/vllm/engine (latest commit: 2024-07-26 20:54:27 -07:00)
output_processor/      Pipeline Parallel: Guard for KeyErrors at request abort (#6587)   2024-07-19 19:18:19 -07:00
__init__.py            Change the name to vLLM (#150)   2023-06-17 03:07:40 -07:00
arg_utils.py           [Bugfix][Model] Jamba assertions and no chunked prefill by default for Jamba (#6784)   2024-07-26 20:45:31 -07:00
async_llm_engine.py    [Hardware][Intel] Enable Multiprocessing and tensor parallel in CPU backend and update documentation (#6125)   2024-07-26 13:50:10 -07:00
async_timeout.py       [Bugfix] AsyncLLMEngine hangs with asyncio.run (#5654)   2024-06-19 13:57:12 -07:00
llm_engine.py          [Hardware][TPU] Implement tensor parallelism with Ray (#5871)   2024-07-26 20:54:27 -07:00
metrics.py             [Bugfix] StatLoggers: cache spec decode metrics when they get collected. (#6645)   2024-07-23 23:05:05 +00:00