[Docs] Add information about using shared memory in docker (#1845)
parent a9e4574261
commit 0f621c2c7d
@@ -18,7 +18,7 @@ This document provides a high-level guide on integrating a `HuggingFace Transfor
 0. Fork the vLLM repository
 --------------------------------
 
-Start by forking our `GitHub <https://github.com/vllm-project/vllm/>`_ repository and then :ref:`build it from source <build_from_source>`.
+Start by forking our `GitHub`_ repository and then :ref:`build it from source <build_from_source>`.
 This gives you the ability to modify the codebase and test your model.
 
 
@@ -11,12 +11,20 @@ The image is available on Docker Hub as `vllm/vllm-openai <https://hub.docker.co
 
     $ docker run --runtime nvidia --gpus all \
         -v ~/.cache/huggingface:/root/.cache/huggingface \
-        -p 8000:8000 \
         --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
+        -p 8000:8000 \
+        --ipc=host \
         vllm/vllm-openai:latest \
         --model mistralai/Mistral-7B-v0.1
 
 
+.. note::
+
+    You can either use the ``ipc=host`` flag or ``--shm-size`` flag to allow the
+    container to access the host's shared memory. vLLM uses PyTorch, which uses shared
+    memory to share data between processes under the hood, particularly for tensor parallel inference.
+
+
 You can build and run vLLM from source via the provided dockerfile. To build vLLM:
 
 .. code-block:: console
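The added note names two options, but the command in the diff only demonstrates ``--ipc=host``. The sketch below prints (it does not run) the same command with either flag, so the two variants can be compared side by side. The helper name ``build_vllm_cmd``, its arguments, and the ``8g`` default are assumptions made for illustration; they are not part of vLLM or Docker, and the right shared memory size depends on the workload.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a docker command for the vLLM image.
# It only echoes the command; nothing is executed.
build_vllm_cmd() {
    local shm_mode="${1:-host}"   # "host" -> --ipc=host, anything else -> --shm-size
    local shm_size="${2:-8g}"     # assumed default, not a vLLM recommendation
    local shm_flag
    if [ "$shm_mode" = "host" ]; then
        # Share the host's IPC namespace, including its /dev/shm.
        shm_flag="--ipc=host"
    else
        # Keep the container's IPC namespace private but enlarge its /dev/shm.
        shm_flag="--shm-size=${shm_size}"
    fi
    echo "docker run --runtime nvidia --gpus all" \
         "-v ~/.cache/huggingface:/root/.cache/huggingface" \
         "--env HUGGING_FACE_HUB_TOKEN=<secret>" \
         "-p 8000:8000 ${shm_flag}" \
         "vllm/vllm-openai:latest --model mistralai/Mistral-7B-v0.1"
}

build_vllm_cmd host
build_vllm_cmd size 8g
```

``--ipc=host`` is the simpler choice; ``--shm-size`` may be preferable where sharing the host IPC namespace with the container is not acceptable.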