ferdinand.mom | 402aa4ccfc | small change | 2024-10-30 14:58:41 +00:00
zzhhjjj | f1f6915ba1 | 1f1b fix | 2024-10-30 14:58:41 +00:00
zzhhjjj | c7a3fb016a | disable grad sync in afab | 2024-10-30 14:58:40 +00:00
ferdinand.mom | 987a7c5c99 | add todo ring attention | 2024-10-29 14:18:07 +00:00
ferdinand.mom | 46af5b0425 | some fixes | 2024-10-29 14:17:42 +00:00
zzhhjjj | b7f3e253be | add context parallel | 2024-10-29 13:42:38 +00:00
zzhhjjj | 6220892716 | refactor | 2024-10-28 20:44:15 +00:00
zzhhjjj | 928ada77b8 | process group order | 2024-10-27 04:56:54 +00:00
zzhhjjj | e5cfb5240e | match TP loss | 2024-10-27 02:22:05 +00:00
zzhhjjj | 51b5683dd3 | match tp+pp loss | 2024-10-27 02:20:18 +00:00
zzhhjjj | ec1e1e5ccf | support bf16, all reduce loss | 2024-10-22 23:38:44 +00:00
zzhhjjj | a6d79b07b5 | add cuda kernels | 2024-10-22 22:38:29 +00:00
zzhhjjj | 9a7904d5d6 | revert some change | 2024-10-22 19:50:23 +00:00
ferdinand.mom | 9d53e9afa6 | use global pgm for ddp | 2024-10-18 15:51:26 +00:00
ferdinand.mom | 2b2781a374 | made Tensor Parallel API compliant | 2024-10-18 15:51:26 +00:00
ferdinand.mom | abd1edf9f9 | all_reduce loss across pp/dp ranks + base_parallel | 2024-10-18 15:51:17 +00:00
ferdinand.mom | 1ebd3de5be | Merge DDP + TP from @zzhhjjj | 2024-10-18 15:05:01 +00:00
ferdinand.mom | d0d6d8994f | use global pgm for ddp | 2024-10-18 14:59:26 +00:00
ferdinand.mom | 134d48b658 | remove merged qkv | 2024-10-18 14:59:04 +00:00
zzhhjjj | 7377238741 | tensor parallel, will clean later | 2024-10-18 05:13:44 +00:00
zzhhjjj | 54ad77e055 | Merge branch 'main' into ddp-merge | 2024-10-16 19:13:48 +00:00
zzhhjjj | 24ff8d05fd | add DDP | 2024-10-16 16:48:55 +00:00
zzhhjjj | 5139a32211 | repo structure change | 2024-10-16 16:44:39 +00:00