vllm/vllm/worker
Latest commit: 4bb6b67188 by boydfd (2023-11-20 19:02:42 -08:00)
fix RAM OOM when load large models in tensor parallel mode. (#1395)
Co-authored-by: ran_lin <rlin@thoughtworks.com>
__init__.py Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
cache_engine.py Fix config for Falcon (#1164) 2023-09-23 17:38:43 -07:00
worker.py fix RAM OOM when load large models in tensor parallel mode. (#1395) 2023-11-20 19:02:42 -08:00
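The worker.py commit above targets host-RAM exhaustion that can occur when every tensor-parallel worker materializes the full checkpoint in CPU memory at the same time. The sketch below illustrates one common mitigation, staggering checkpoint loads across ranks so only a few full copies are resident at once; it is an illustrative assumption, not the actual change made in #1395, and the names `load_checkpoint_shard` and `NUM_CONCURRENT_LOADERS` are hypothetical, not part of vLLM.

```python
# Illustrative sketch only: stagger checkpoint loading across tensor-parallel
# ranks so that at most NUM_CONCURRENT_LOADERS full state dicts sit in host RAM
# at any moment. This is not the change made in vLLM PR #1395.
import torch
import torch.distributed as dist

NUM_CONCURRENT_LOADERS = 2  # hypothetical knob: ranks allowed to load at once


def load_checkpoint_shard(path: str, rank: int) -> dict:
    """Hypothetical helper: load the checkpoint and keep only this rank's shard."""
    full_state = torch.load(path, map_location="cpu")  # full copy briefly in RAM
    # ... slice tensors for this tensor-parallel rank, then drop the full copy ...
    return full_state


def staggered_load(path: str) -> dict:
    """Load weights in rank groups, with a barrier between groups."""
    rank = dist.get_rank()
    world_size = dist.get_world_size()
    state_dict = None
    for group_start in range(0, world_size, NUM_CONCURRENT_LOADERS):
        if group_start <= rank < group_start + NUM_CONCURRENT_LOADERS:
            state_dict = load_checkpoint_shard(path, rank)
        # Every rank hits the barrier, so the next group starts only after
        # the current group has finished loading and freed its full copy.
        dist.barrier()
    return state_dict
```

The grouping trades load-time parallelism for a bounded peak in host memory; with `NUM_CONCURRENT_LOADERS = 1` the ranks load strictly one after another.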