vllm/source at a19e8d372651abad75dc6a3939c18f23a1ae8d40 - vllm

Hongxia Yang 10383887e0 [ROCm] Cleanup Dockerfile and remove outdated patch (#6482)	2024-07-16 22:47:02 -07:00
_templates/sections	[Doc] Guide for adding multi-modal plugins (#6205)	2024-07-10 14:55:34 +08:00
assets	[Doc] add visualization for multi-stage dockerfile (#4456)	2024-04-30 17:41:59 +00:00
automatic_prefix_caching	[Doc] Add an automatic prefix caching section in vllm documentation (#5324)	2024-06-11 10:24:59 -07:00
community	[Docs] Add Google Cloud to sponsor list (#6450)	2024-07-15 11:58:10 -07:00
dev	[Doc] Guide for adding multi-modal plugins (#6205)	2024-07-10 14:55:34 +08:00
getting_started	[ROCm] Cleanup Dockerfile and remove outdated patch (#6482)	2024-07-16 22:47:02 -07:00
models	[Doc] Fix the lora adapter path in server startup script (#6230)	2024-07-16 10:11:04 -07:00
quantization	[Kernel] Expand FP8 support to Ampere GPUs using FP8 Marlin (#5975)	2024-07-03 17:38:00 +00:00
serving	[doc][distributed] add suggestion for distributed inference (#6418)	2024-07-15 09:45:51 -07:00
conf.py	[Docs] Fix readthedocs for tag build (#6158)	2024-07-05 12:44:40 -07:00
generate_examples.py	Add example scripts to documentation (#4225)	2024-04-22 16:36:54 +00:00
index.rst	[Doc] Fix Typo in Doc (#6392)	2024-07-13 00:48:23 +00:00