squall / vllm
Commit a57d75821c · vllm / vllm / engine
Latest commit: 52f07e3dec by Woosuk Kwon, "[Hardware][TPU] Implement tensor parallelism with Ray (#5871)", 2024-07-26 20:54:27 -07:00

| File | Last commit message | Date |
|------|---------------------|------|
| .. | | |
| output_processor | Pipeline Parallel: Guard for KeyErrors at request abort (#6587) | 2024-07-19 19:18:19 -07:00 |
| __init__.py | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00 |
| arg_utils.py | [Bugfix][Model] Jamba assertions and no chunked prefill by default for Jamba (#6784) | 2024-07-26 20:45:31 -07:00 |
| async_llm_engine.py | [Hardware] [Intel] Enable Multiprocessing and tensor parallel in CPU backend and update documentation (#6125) | 2024-07-26 13:50:10 -07:00 |
| async_timeout.py | [Bugfix] AsyncLLMEngine hangs with asyncio.run (#5654) | 2024-06-19 13:57:12 -07:00 |
| llm_engine.py | [Hardware][TPU] Implement tensor parallelism with Ray (#5871) | 2024-07-26 20:54:27 -07:00 |
| metrics.py | [Bugfix] StatLoggers: cache spec decode metrics when they get collected. (#6645) | 2024-07-23 23:05:05 +00:00 |