squall / vllm
Commit a57d75821c · vllm / vllm / engine
Latest commit: 52f07e3dec by Woosuk Kwon, "[Hardware][TPU] Implement tensor parallelism with Ray (#5871)", 2024-07-26 20:54:27 -07:00

| File | Last commit message | Date |
|------|---------------------|------|
| .. | | |
| output_processor | Pipeline Parallel: Guard for KeyErrors at request abort (#6587) | 2024-07-19 19:18:19 -07:00 |
| __init__.py | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00 |
| arg_utils.py | [Bugfix][Model] Jamba assertions and no chunked prefill by default for Jamba (#6784) | 2024-07-26 20:45:31 -07:00 |
| async_llm_engine.py | [Hardware] [Intel] Enable Multiprocessing and tensor parallel in CPU backend and update documentation (#6125) | 2024-07-26 13:50:10 -07:00 |
| async_timeout.py | [Bugfix] AsyncLLMEngine hangs with asyncio.run (#5654) | 2024-06-19 13:57:12 -07:00 |
| llm_engine.py | [Hardware][TPU] Implement tensor parallelism with Ray (#5871) | 2024-07-26 20:54:27 -07:00 |
| metrics.py | [Bugfix] StatLoggers: cache spec decode metrics when they get collected. (#6645) | 2024-07-23 23:05:05 +00:00 |