picotron/src/parallel
2024-10-18 15:51:17 +00:00
data_parallel         use global pgm for ddp                               2024-10-18 14:59:26 +00:00
tensor_parallel       remove merged qkv                                    2024-10-18 14:59:04 +00:00
context_parallel.py   all_reduce loss across pp/dp ranks + base_parallel   2024-10-18 15:51:17 +00:00
pipeline_parallel.py  all_reduce loss across pp/dp ranks + base_parallel   2024-10-18 15:51:17 +00:00