5 Commits

Author   SHA1        Message                                                             Date
Tri Dao  71befc19e1  [Loss] Use flash_attn.losses.cross_entropy.CrossEntropyLoss         2022-12-31 22:43:28 -08:00
Tri Dao  dff68c2b22  Add smoothing for CrossEntropyParallel, rename to CrossEntropyLoss  2022-12-23 14:51:08 -08:00
Tri Dao  c2407dec96  Fix typo in config: train.gpu -> train.gpu_mem                      2022-12-21 13:42:30 -08:00
Tri Dao  4a6eaa9f27  Update configs, add results                                         2022-11-29 04:46:43 -08:00
Tri Dao  0bf5e50038  Release training code                                               2022-11-28 17:34:40 -08:00