Commit Graph

10 Commits

Author SHA1 Message Date
Joel Lamy-Poirier
767b71ccf0
Fix random state for dropout_layer_norm (#315) 2023-07-23 15:05:13 -07:00
Ikko Eltociear Ashimine
dfc60f6b7d
[LayerNorm] Fix typo in ln_api.cpp
unintialized -> uninitialized
2023-07-20 01:16:16 +09:00
Tri Dao
393882bc08 [LayerNorm] Implement LN with parallel residual, support dim 8k 2023-03-31 14:23:45 -07:00
Tri Dao
eb33e587e9 [LayerNorm] Rename x1 -> residual 2023-01-19 13:07:27 -08:00
Tri Dao
6738d9477d [LayerNorm] Implement RMS Norm 2023-01-06 17:34:22 -08:00
Tri Dao
a8cfe51551 Implement Tensor Parallel for transformer Block 2022-12-25 14:08:21 -08:00
Tri Dao
5db330519a [LayerNorm] Support taking subset of input or subset of output 2022-12-12 22:16:14 -08:00
Tri Dao
ae137ed17a [LayerNorm] Fuse LayerScale 2022-12-10 23:28:23 -08:00
Tri Dao
8c6609ae1a [LayerNorm] Support all dimensions up to 6k (if divisible by 8) 2022-12-09 02:06:22 -08:00
Tri Dao
fa6d1ce44f Add fused_dense and dropout_add_layernorm CUDA extensions 2022-11-13 21:59:20 -08:00