Commit Graph

7 Commits

Author | SHA1 | Message | Date
ferdinand.mom | 00ddbd9d2e | raise Exception when not enough layers to distributed in rank + rename variable | 2024-12-03 13:17:52 +00:00
ferdinand.mom | 32d8daa880 | can now load big model through safetensors (sharded and single file) | 2024-12-01 19:39:16 +00:00
ferdinand.mom | 41f49bb15f | rename to grad_steps | 2024-11-04 15:06:29 +00:00
ferdinand.mom | 0bfc06506a | small changes unrelated to dp+pp sync grad fix | 2024-11-04 15:00:43 +00:00
ferdinand.mom | 7bfdf5f7d1 | add fuse adam | 2024-11-04 14:35:36 +00:00
ferdinand.mom | 519b506b2b | add option to switch between pp engine | 2024-11-04 14:32:44 +00:00
ferdinand.mom | f74bff79e0 | cleaning | 2024-10-30 14:58:41 +00:00