vllm/cacheflow/parallel_utils
| Name | Last commit | Date |
| --- | --- | --- |
| tensor_parallel | Add CUDA graph-based all reduce launcher (#26) | 2023-04-05 11:16:57 -07:00 |
| __init__.py | Support tensor parallel (#2) | 2023-03-21 13:45:42 -07:00 |
| parallel_state.py | Add CUDA graph-based all reduce launcher (#26) | 2023-04-05 11:16:57 -07:00 |
| README.md | Support tensor parallel (#2) | 2023-03-21 13:45:42 -07:00 |
| utils.py | Support tensor parallel (#2) | 2023-03-21 13:45:42 -07:00 |

The files in this folder are ported from Megatron-LM. Only the code used for inference is kept.