vllm/vllm/worker
Latest commit: 4bb6b67188 by boydfd (2023-11-20 19:02:42 -08:00)
fix RAM OOM when load large models in tensor parallel mode. (#1395)
Co-authored-by: ran_lin <rlin@thoughtworks.com>
__init__.py Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
cache_engine.py Fix config for Falcon (#1164) 2023-09-23 17:38:43 -07:00
worker.py fix RAM OOM when load large models in tensor parallel mode. (#1395) 2023-11-20 19:02:42 -08:00
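The worker.py commit above targets host-RAM exhaustion that can occur when every tensor-parallel worker materializes the full checkpoint in CPU memory at the same time. The sketch below illustrates one common mitigation, staggering checkpoint loads across ranks so only a few full copies are resident at once; it is an illustrative assumption, not the actual change made in #1395, and the names `load_checkpoint_shard` and `NUM_CONCURRENT_LOADERS` are hypothetical, not part of vLLM.

```python
# Illustrative sketch only: stagger checkpoint loading across tensor-parallel
# ranks so that at most NUM_CONCURRENT_LOADERS full state dicts sit in host RAM
# at any moment. This is not the change made in vLLM PR #1395.
import torch
import torch.distributed as dist

NUM_CONCURRENT_LOADERS = 2  # hypothetical knob: ranks allowed to load at once


def load_checkpoint_shard(path: str, rank: int) -> dict:
    """Hypothetical helper: load the checkpoint and keep only this rank's shard."""
    full_state = torch.load(path, map_location="cpu")  # full copy briefly in RAM
    # ... slice tensors for this tensor-parallel rank, then drop the full copy ...
    return full_state


def staggered_load(path: str) -> dict:
    """Load weights in rank groups, with a barrier between groups."""
    rank = dist.get_rank()
    world_size = dist.get_world_size()
    state_dict = None
    for group_start in range(0, world_size, NUM_CONCURRENT_LOADERS):
        if group_start <= rank < group_start + NUM_CONCURRENT_LOADERS:
            state_dict = load_checkpoint_shard(path, rank)
        # Every rank hits the barrier, so the next group starts only after
        # the current group has finished loading and freed its full copy.
        dist.barrier()
    return state_dict
```

The grouping trades load-time parallelism for a bounded peak in host memory; with `NUM_CONCURRENT_LOADERS = 1` the ranks load strictly one after another.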