vllm/docs/source

Latest commit 51d4094fda by Simon Mo — chunked-prefill-doc-syntax (#4603)
Fix the docs: https://docs.vllm.ai/en/latest/models/performance.html
Co-authored-by: sang <rkooo567@gmail.com>
2024-05-10 14:13:23 +09:00
assets/                Doc: add visualization for multi-stage dockerfile (#4456)                      2024-04-30 17:41:59 +00:00
dev/                   Doc: add visualization for multi-stage dockerfile (#4456)                      2024-04-30 17:41:59 +00:00
getting_started/       Unable to find Punica extension issue during source code installation (#4494)  2024-05-01 00:42:09 +00:00
models/                chunked-prefill-doc-syntax (#4603)                                             2024-05-10 14:13:23 +09:00
quantization/          Enable scaled FP8 (e4m3fn) KV cache on ROCm (AMD GPU) (#3290)                  2024-04-03 14:15:55 -07:00
serving/               Bugfix: Fix CLI arguments in OpenAI server docs (#4709)                        2024-05-09 09:53:14 -07:00
conf.py                CI: Disable non-lazy string operation on logging (#4326)                       2024-04-26 00:16:58 -07:00
generate_examples.py   Add example scripts to documentation (#4225)                                   2024-04-22 16:36:54 +00:00
index.rst              Doc: Chunked Prefill Documentation (#4580)                                     2024-05-04 00:18:00 -07:00