Use --ipc=host in docker run for distributed inference (#1125)
parent f98b745a81 · commit 7d7e3b78a3
@@ -46,4 +46,5 @@ You can also build and install vLLM from source:

 .. code-block:: console

     $ # Pull the Docker image with CUDA 11.8.
-    $ docker run --gpus all -it --rm --shm-size=8g nvcr.io/nvidia/pytorch:22.12-py3
+    $ # Use `--ipc=host` to make sure the shared memory is large enough.
+    $ docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3
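The change matters because PyTorch's inter-process communication for multi-GPU inference goes through shared memory (`/dev/shm`), and Docker's default of 64 MB is too small; `--ipc=host` gives the container the host's full shared-memory segment instead of a fixed `--shm-size` cap. A minimal sketch for checking the effective shared-memory size inside the container (the `df` invocation is illustrative, not part of the commit):

```shell
# Compare /dev/shm with and without --ipc=host.
# Default IPC namespace: typically reports 64M.
docker run --rm nvcr.io/nvidia/pytorch:22.12-py3 df -h /dev/shm

# Host IPC namespace: reports the host's shared-memory size,
# which is what distributed inference workers need.
docker run --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3 df -h /dev/shm
```

If sharing the host IPC namespace is not acceptable, passing a large enough `--shm-size` (as the removed line did) is the usual alternative.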