ferdinand.mom
|
00ddbd9d2e
|
raise Exception when not enough layers to distribute across ranks + rename variable
|
2024-12-03 13:17:52 +00:00 |
|
ferdinand.mom
|
32d8daa880
|
can now load big model through safetensors (sharded and single file)
|
2024-12-01 19:39:16 +00:00 |
|
ferdinand.mom
|
41f49bb15f
|
rename to grad_steps
|
2024-11-04 15:06:29 +00:00 |
|
ferdinand.mom
|
0bfc06506a
|
small changes unrelated to dp+pp sync grad fix
|
2024-11-04 15:00:43 +00:00 |
|
ferdinand.mom
|
7bfdf5f7d1
|
add fused Adam
|
2024-11-04 14:35:36 +00:00 |
|
ferdinand.mom
|
519b506b2b
|
add option to switch between pp engine
|
2024-11-04 14:32:44 +00:00 |
|
ferdinand.mom
|
f74bff79e0
|
cleaning
|
2024-10-30 14:58:41 +00:00 |
|