flash-attention/training/configs/experiment/owt
2023-01-17 18:12:27 -08:00
File               Last commit message                                                Last commit date
base.yaml          [Loss] Use flash_attn.losses.cross_entropy.CrossEntropyLoss        2022-12-31 22:43:28 -08:00
gpt2l-flash.yaml   Update configs, add results                                        2022-11-29 04:46:43 -08:00
gpt2l-hf.yaml      Release training code                                              2022-11-28 17:34:40 -08:00
gpt2l.yaml         Release training code                                              2022-11-28 17:34:40 -08:00
gpt2m-flash.yaml   Update configs, add results                                        2022-11-29 04:46:43 -08:00
gpt2m-hf.yaml      Release training code                                              2022-11-28 17:34:40 -08:00
gpt2m.yaml         Release training code                                              2022-11-28 17:34:40 -08:00
gpt2s-flash.yaml   [FusedDense] Support relu, rename FusedDenseGeluDense -> FusedMLP  2023-01-17 18:12:27 -08:00
gpt2s-hf.yaml      Release training code                                              2022-11-28 17:34:40 -08:00
gpt2s.yaml         Release training code                                              2022-11-28 17:34:40 -08:00
gpt2xl-flash.yaml  Update configs, add results                                        2022-11-29 04:46:43 -08:00
gpt2xl-hf.yaml     Update configs, add results                                        2022-11-29 04:46:43 -08:00
gpt2xl.yaml        Release training code                                              2022-11-28 17:34:40 -08:00