flash-attention/csrc/layer_norm
Files (added in "Add fused_dense and dropout_add_layernorm CUDA extensions", 2022-11-13):

ln_api.cpp
ln_bwd_kernels.cuh
ln_bwd_semi_cuda_kernel.cu
ln_fwd_cuda_kernel.cu
ln_fwd_kernels.cuh
ln_kernel_traits.h
ln_utils.cuh
ln.h
README.md
setup.py
static_switch.h

This CUDA extension implements fused dropout + residual + LayerNorm, based on Apex's FastLayerNorm. We add dropout and the residual connection, and make it work for both pre-norm and post-norm architectures.
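As a reference for what the fused kernel computes, here is a minimal unfused PyTorch sketch (not the extension's code): dropout is applied to the sublayer input, the residual branch is added, and LayerNorm is applied to the sum. In a pre-norm architecture the un-normalized sum is also needed as the residual stream for the next layer; the `prenorm` flag below models that. The function name and argument layout here are illustrative only.

```python
import torch
import torch.nn.functional as F

def dropout_add_layer_norm_ref(x0, residual, weight, bias,
                               dropout_p, eps, prenorm=False):
    """Unfused reference for fused dropout + residual + LayerNorm.

    x0:       sublayer input, shape (..., hidden)
    residual: skip/residual branch with the same shape, or None
    """
    dropped = F.dropout(x0, p=dropout_p, training=True)
    pre = dropped + residual if residual is not None else dropped
    out = F.layer_norm(pre, (pre.shape[-1],), weight, bias, eps)
    # Pre-norm: also return the un-normalized sum as the next residual.
    return (out, pre) if prenorm else out

# Example: pre-norm usage on random data.
hidden = 1024
x0 = torch.randn(8, 512, hidden)
res = torch.randn(8, 512, hidden)
w, b = torch.ones(hidden), torch.zeros(hidden)
out, new_res = dropout_add_layer_norm_ref(x0, res, w, b, 0.1, 1e-5, prenorm=True)
```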

```sh
cd csrc/layer_norm && pip install .
```
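After installation, the fused op is called from Python through the flash_attn wrappers. The snippet below is a hedged sketch that assumes the `dropout_add_layer_norm` function in `flash_attn.ops.layer_norm` and the argument order shown; check the installed version for the exact signature.

```python
# Sketch only: the wrapper name and argument order are assumptions; verify
# against your installed flash_attn version.
import torch
from flash_attn.ops.layer_norm import dropout_add_layer_norm

hidden = 1024
x0 = torch.randn(8 * 512, hidden, device="cuda", dtype=torch.float16)
residual = torch.randn_like(x0)
weight = torch.ones(hidden, device="cuda", dtype=torch.float16)
bias = torch.zeros(hidden, device="cuda", dtype=torch.float16)

# With prenorm=True the call is assumed to return both the normalized output
# and the updated (un-normalized) residual stream.
out, new_residual = dropout_add_layer_norm(
    x0, residual, weight, bias, 0.1, 1e-5, prenorm=True
)
```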