Commit Graph

6 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Tri Dao | 0bf5e50038 | Release training code | 2022-11-28 17:34:40 -08:00 |
| Tri Dao | 39ed597b28 | [LayerNorm] Compile for both sm70 and sm80 | 2022-11-17 11:45:11 -08:00 |
| Tri Dao | 43ab0b5205 | Mention that some CUDA extensions have only been tested on A100s | 2022-11-15 07:10:25 -08:00 |
| Tri Dao | e4d3013e15 | [LayerNorm] Check cuda error after querying ctas_per_sm | 2022-11-15 07:05:13 -08:00 |
| Tri Dao | 2e33fc8e36 | Add GPT and ViT models | 2022-11-13 22:30:23 -08:00 |
| Tri Dao | fa6d1ce44f | Add fused_dense and dropout_add_layernorm CUDA extensions | 2022-11-13 21:59:20 -08:00 |