flash-attention/training/configs/experiment
2022-12-31 22:43:28 -08:00
..
owt [Loss] Use flash_attn.losses.cross_entropy.CrossEntropyLoss 2022-12-31 22:43:28 -08:00
pile [Loss] Use flash_attn.losses.cross_entropy.CrossEntropyLoss 2022-12-31 22:43:28 -08:00