Commit Graph

7 Commits

Author SHA1 Message Date
Tri Dao
393882bc08 [LayerNorm] Implement LN with parallel residual, support dim 8k 2023-03-31 14:23:45 -07:00
Tri Dao
6738d9477d [LayerNorm] Implement RMS Norm 2023-01-06 17:34:22 -08:00
Tri Dao
8c6609ae1a [LayerNorm] Support all dimensions up to 6k (if divisible by 8) 2022-12-09 02:06:22 -08:00
Tri Dao
0bf5e50038 Release training code 2022-11-28 17:34:40 -08:00
Tri Dao
43ab0b5205 Mention that some CUDA extensions have only been tested on A100s 2022-11-15 07:10:25 -08:00
Tri Dao
2e33fc8e36 Add GPT and ViT models 2022-11-13 22:30:23 -08:00
Tri Dao
fa6d1ce44f Add fused_dense and dropout_add_layernorm CUDA extensions 2022-11-13 21:59:20 -08:00