| File | Last commit message | Last commit date |
|------|---------------------|------------------|
| compatibility_matrix.rst | [Misc] Consolidate ModelConfig code related to HF config (#10104) | 2024-11-07 06:00:21 +00:00 |
| deploying_with_bentoml.rst | docs: Add BentoML deployment doc (#3336) | 2024-03-12 10:34:30 -07:00 |
| deploying_with_cerebrium.rst | [DOC] - Add docker image to Cerebrium Integration (#6510) | 2024-07-17 10:22:53 -07:00 |
| deploying_with_docker.rst | [Doc] Update docker references (#5614) | 2024-06-19 15:01:45 -07:00 |
| deploying_with_dstack.rst | [Doc][CI/Build] Update docs and tests to use vllm serve (#6431) | 2024-07-17 07:43:21 +00:00 |
| deploying_with_k8s.rst | [Doc]: Add deploying_with_k8s guide (#8451) | 2024-10-07 13:31:45 -07:00 |
| deploying_with_kserve.rst | Update link to KServe deployment guide (#9173) | 2024-10-09 03:58:49 +00:00 |
| deploying_with_lws.rst | Support to serve vLLM on Kubernetes with LWS (#4829) | 2024-05-16 16:37:29 -07:00 |
| deploying_with_nginx.rst | [Hardware][Intel CPU][DOC] Update docs for CPU backend (#6212) | 2024-10-22 10:38:04 -07:00 |
| deploying_with_triton.rst | Add documentation to Triton server tutorial (#983) | 2023-09-20 10:32:40 -07:00 |
| distributed_serving.rst | [doc] update pp support (#9853) | 2024-10-30 13:36:51 -07:00 |
| env_vars.rst | [doc][misc] add note for Kubernetes users (#5916) | 2024-06-27 10:07:07 -07:00 |
| faq.rst | [Documentation][Spec Decode] Add documentation about lossless guarantees in Speculative Decoding in vLLM (#7962) | 2024-09-05 16:25:29 -04:00 |
| integrations.rst | llama_index serving integration documentation (#6973) | 2024-08-14 15:38:37 -07:00 |
| metrics.rst | Add Production Metrics in Prometheus format (#1890) | 2023-12-02 16:37:44 -08:00 |
| openai_compatible_server.md | [Frontend] Tool calling parser for Granite 3.0 models (#9027) | 2024-11-07 07:09:02 -08:00 |
| run_on_sky.rst | [Misc][OpenAI] deprecate max_tokens in favor of new max_completion_tokens field for chat completion endpoint (#9837) | 2024-10-30 18:15:56 -07:00 |
| serving_with_langchain.rst | docs: fix langchain (#2736) | 2024-02-03 18:17:55 -08:00 |
| serving_with_llamaindex.rst | llama_index serving integration documentation (#6973) | 2024-08-14 15:38:37 -07:00 |
| tensorizer.rst | [Doc]: Update tensorizer docs to include vllm[tensorizer] (#7889) | 2024-10-22 15:43:25 -07:00 |
| usage_stats.md | Usage Stats Collection (#2852) | 2024-03-28 22:16:12 -07:00 |