vllm/docs/source/serving
Latest commit: 99dac099ab [Core][Doc] Default to multiprocessing for single-node distributed case (#5230), by Nick Hill (co-authored by Antoni Baum), 2024-06-11
File                         Date        Last commit
deploying_with_bentoml.rst   2024-03-12  docs: Add BentoML deployment doc (#3336)
deploying_with_docker.rst    2024-05-29  [Doc][Build] update after removing vllm-nccl (#5103)
deploying_with_dstack.rst    2024-05-30  add doc about serving option on dstack (#3074)
deploying_with_kserve.rst    2024-03-01  docs: Add tutorial on deploying vLLM model with KServe (#2586)
deploying_with_lws.rst       2024-05-16  Support to serve vLLM on Kubernetes with LWS (#4829)
deploying_with_triton.rst    2023-09-20  Add documentation to Triton server tutorial (#983)
distributed_serving.rst      2024-06-11  [Core][Doc] Default to multiprocessing for single-node distributed case (#5230)
env_vars.rst                 2024-05-03  [Doc] add env vars to the doc (#4572)
integrations.rst             2024-05-30  add doc about serving option on dstack (#3074)
metrics.rst                  2023-12-02  Add Production Metrics in Prometheus format (#1890)
openai_compatible_server.md  2024-06-07  [Frontend] Add OpenAI Vision API Support (#5237)
run_on_sky.rst               2024-04-22  [Doc] Update the SkyPilot doc with serving and Llama-3 (#4276)
serving_with_langchain.rst   2024-02-03  docs: fix langchain (#2736)
usage_stats.md               2024-03-28  Usage Stats Collection (#2852)