vllm/parallel_utils at b4543c8f6bf67a7f1a0d6d0fd6cf5697c7eeaabb - vllm

youkaichao c391e4b68e [Core] improve robustness of pynccl (#3860)		2024-04-04 16:52:12 -07:00
__init__.py	TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181)	2023-10-02 15:36:09 -07:00
communication_op.py	[Core] remove cupy dependency (#3625)	2024-03-27 00:33:26 -07:00
custom_all_reduce.py	[CI] Try introducing isort. (#3495)	2024-03-25 07:59:47 -07:00
parallel_state.py	[Core] remove cupy dependency (#3625)	2024-03-27 00:33:26 -07:00
pynccl_utils.py	[BugFix] Use consistent logger everywhere (#3738)	2024-03-29 23:26:44 +00:00
pynccl.py	[Core] improve robustness of pynccl (#3860)	2024-04-04 16:52:12 -07:00
README.md	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00
utils.py	TP/quantization/weight loading refactor part 2 - Refactor quantized linear logic and extend quantization support to all models (#1622)	2023-11-15 22:50:41 -08:00

The files in this folder are ported from Megatron-LM. We keep only the code that is used for inference.