vllm/docs/source
Latest commit: ce4f5a29fb "Add Automatic Prefix Caching (#2762)" by Sage Moore, co-authored by ElizaWszola <eliza@neuralmagic.com> and Michael Goin <michael@neuralmagic.com>, 2024-03-02 00:50:01 -08:00
Name | Last commit | Date
assets/logos | Update README.md (#1292) | 2023-10-08 23:15:50 -07:00
dev/engine | [DOC] Add additional comments for LLMEngine and AsyncLLMEngine (#1011) | 2024-01-11 19:26:49 -08:00
getting_started | [ROCm] support Radeon™ 7900 series (gfx1100) without using flash-attention (#2768) | 2024-02-10 23:14:37 -08:00
models | Add Automatic Prefix Caching (#2762) | 2024-03-02 00:50:01 -08:00
quantization | [CI] Ensure documentation build is checked in CI (#2842) | 2024-02-12 22:53:07 -08:00
serving | docs: Add tutorial on deploying vLLM model with KServe (#2586) | 2024-03-01 11:04:14 -08:00
conf.py | Port metrics from aioprometheus to prometheus_client (#2730) | 2024-02-25 11:54:00 -08:00
index.rst | docs: Add tutorial on deploying vLLM model with KServe (#2586) | 2024-03-01 11:04:14 -08:00
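The headline commit above, Add Automatic Prefix Caching (#2762), introduced an engine option that lets requests sharing a common prompt prefix reuse cached KV blocks. Below is a minimal sketch of turning it on through the offline LLM entry point; it assumes the enable_prefix_caching engine argument added around that change, and the model name and prompts are placeholders for illustration only.

    # Minimal sketch: enabling automatic prefix caching (see #2762).
    # Assumes the enable_prefix_caching engine argument; model name and
    # prompts are placeholders, not taken from the listing above.
    from vllm import LLM, SamplingParams

    llm = LLM(model="facebook/opt-125m", enable_prefix_caching=True)

    # Requests that share a long common prefix can reuse its cached KV blocks.
    shared_prefix = "You are a helpful assistant. Answer concisely.\n\n"
    prompts = [
        shared_prefix + "Q: What is vLLM?\nA:",
        shared_prefix + "Q: What is paged attention?\nA:",
    ]

    outputs = llm.generate(prompts, SamplingParams(temperature=0.0, max_tokens=32))
    for out in outputs:
        print(out.outputs[0].text)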