From 4050d646e5221a516c93384b047e10b22d7167e7 Mon Sep 17 00:00:00 2001
From: youkaichao
Date: Mon, 1 Jul 2024 09:52:43 -0700
Subject: [PATCH] [doc][misc] remove deprecated api server in doc (#6037)

---
 docs/source/serving/distributed_serving.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/serving/distributed_serving.rst b/docs/source/serving/distributed_serving.rst
index 2a7937a9..91f64ad2 100644
--- a/docs/source/serving/distributed_serving.rst
+++ b/docs/source/serving/distributed_serving.rst
@@ -19,7 +19,7 @@ To run multi-GPU serving, pass in the :code:`--tensor-parallel-size` argument wh
 
 .. code-block:: console
 
-    $ python -m vllm.entrypoints.api_server \
+    $ python -m vllm.entrypoints.openai.api_server \
     $     --model facebook/opt-13b \
     $     --tensor-parallel-size 4
 