vllm/source at 48a8f4a7fd18d516ffc0a304219ef722613ea792 - vllm

张大成 48a8f4a7fd Support Orion model (#2539) Co-authored-by: zhangdacheng <zhangdacheng@ainirobot.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-02-26 19:17:06 -08:00
assets/logos	Update README.md (#1292)	2023-10-08 23:15:50 -07:00
dev/engine	[DOC] Add additional comments for LLMEngine and AsyncLLMEngine (#1011)	2024-01-11 19:26:49 -08:00
getting_started	[ROCm] support Radeon™ 7900 series (gfx1100) without using flash-attention (#2768)	2024-02-10 23:14:37 -08:00
models	Support Orion model (#2539)	2024-02-26 19:17:06 -08:00
quantization	[CI] Ensure documentation build is checked in CI (#2842)	2024-02-12 22:53:07 -08:00
serving	docs: fix langchain (#2736)	2024-02-03 18:17:55 -08:00
conf.py	Port metrics from `aioprometheus` to `prometheus_client` (#2730)	2024-02-25 11:54:00 -08:00
index.rst	[CI] Ensure documentation build is checked in CI (#2842)	2024-02-12 22:53:07 -08:00