18 Commits

Author SHA1 Message Date
youkaichao
2a85f93007
[Core][Distributed] enable multiple tp group (#4512)
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
2024-05-02 04:28:21 +00:00
youkaichao
91f50a6fe2
[Core][Distributed] use cpu/gloo to initialize pynccl (#4248) 2024-04-23 18:32:19 -07:00
youkaichao
6dc1fc9cfe
[Core] nccl integrity check and test (#4155)
[Core] Add integrity check during initialization; add test for it (#4155)
2024-04-17 22:28:52 -07:00
youkaichao
8438e0569e
[Core] RayWorkerVllm --> WorkerWrapper to reduce duplication (#4024)
[Core] replace narrow-usage RayWorkerVllm to general WorkerWrapper to reduce code duplication (#4024)
2024-04-17 08:34:33 +00:00
SangBin Cho
67b4221a61
[Core][5/N] Fully working chunked prefill e2e (#3884) 2024-04-10 17:56:48 -07:00
youkaichao
63e7176f26
[Core][Refactor] move parallel_utils into vllm/distributed (#3950)
[WIP][Core][Refactor] move vllm/model_executor/parallel_utils into vllm/distributed and vllm/device_communicators (#3950)
2024-04-10 15:33:30 -07:00
youkaichao
756b30a5f3
[Core][Test] move local_rank to the last arg with default value(#3711)
[Core][Test] move local_rank to the last arg with default value to keep api compatible (#3711)
2024-03-28 21:19:45 -07:00
SangBin Cho
26422e477b
[Test] Make model tests run again and remove --forked from pytest (#3631)
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-03-28 21:06:40 -07:00
Roy
515386ef3c
[Core] Support multi-node inference(eager and cuda graph) (#3686) 2024-03-28 15:01:55 -07:00
youkaichao
8f44facddd
[Core] remove cupy dependency (#3625) 2024-03-27 00:33:26 -07:00
SangBin Cho
01bfb22b41
[CI] Try introducing isort. (#3495) 2024-03-25 07:59:47 -07:00
Zhuohan Li
a61f0521b8
[Test] Add basic correctness test (#2908) 2024-02-18 16:44:50 -08:00
Hanzhi Zhou
380170038e
Implement custom all reduce kernels (#2192) 2024-01-27 12:46:35 -08:00
Zhuohan Li
ef9b636e2d
Simplify broadcast logic for control messages (#2501) 2024-01-19 11:23:30 -08:00
Simon Mo
6e01e8c1c8
[CI] Add Buildkite (#2355) 2024-01-14 12:37:58 -08:00
Zhuohan Li
358c328d69
[BUGFIX] Fix communication test (#2285) 2023-12-27 17:18:11 -05:00
Zhuohan Li
20d0699d49
[Fix] Fix comm test (#1691) 2023-11-16 16:28:39 -08:00
Zhuohan Li
ba0bfd40e2
TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181) 2023-10-02 15:36:09 -07:00