flash-attention/csrc/flash_attn/src
Jeremy Reizenstein 0658e320f6
Preprocessor switches to control functionality (#788)
For faster and smaller builds in some simple cases,
provide switches to allow disabling:
- backward
- alibi
- uneven k
- dropout
- local attention

Co-authored-by: Jeremy Francis Reizenstein <bottler@users.noreply.github.com>
2024-01-29 20:44:23 -08:00
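
The commit message above describes preprocessor switches that compile a feature out of the build. The listing does not show the implementation, so what follows is only a minimal host-side sketch of one way such a switch could work. The flag name SKETCH_DISABLE_DROPOUT, the macro names, and the helpers launch_kernel and dispatch are all assumptions made for illustration, not the repository's confirmed API.

```cpp
#include <cassert>
#include <string>

// Generic switch: turns a runtime bool into a compile-time constant.
// Both branches are instantiated, so both template variants get compiled.
#define BOOL_SWITCH(COND, CONST_NAME, ...)        \
  [&] {                                           \
    if (COND) {                                   \
      constexpr static bool CONST_NAME = true;    \
      return __VA_ARGS__();                       \
    } else {                                      \
      constexpr static bool CONST_NAME = false;   \
      return __VA_ARGS__();                       \
    }                                             \
  }()

// SKETCH_DISABLE_DROPOUT is a hypothetical build flag for this example.
// When it is defined, the condition is ignored and only the `false`
// variant is instantiated, which is where the build savings would come from.
#ifdef SKETCH_DISABLE_DROPOUT
#define DROPOUT_SWITCH(COND, CONST_NAME, ...)     \
  [&] {                                           \
    constexpr static bool CONST_NAME = false;     \
    return __VA_ARGS__();                         \
  }()
#else
#define DROPOUT_SWITCH BOOL_SWITCH
#endif

// Stand-in for a templated kernel launcher (hypothetical). It only reports
// which variant was selected instead of launching anything.
template <bool Is_dropout>
std::string launch_kernel() {
  return Is_dropout ? "kernel<dropout>" : "kernel<no dropout>";
}

// Picks the kernel variant from a runtime dropout probability.
std::string dispatch(float p_dropout) {
  assert(p_dropout >= 0.0f && p_dropout < 1.0f);
  return DROPOUT_SWITCH(p_dropout > 0.0f, Is_dropout, [&] {
    return launch_kernel<Is_dropout>();
  });
}
```

In this sketch, compiling with -DSKETCH_DISABLE_DROPOUT means launch_kernel<true> is never instantiated, so each disabled feature halves the number of template variants the compiler has to generate along that axis.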
alibi.h Change inline to __forceinline__, use __grid_constant__ param 2024-01-20 17:38:47 -08:00
block_info.h Change inline to __forceinline__, use __grid_constant__ param 2024-01-20 17:38:47 -08:00
dropout.h Refactor masking in fwd pass into 1 object 2024-01-20 17:39:53 -08:00
flash_bwd_hdim32_bf16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim32_fp16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim64_bf16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim64_fp16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim96_bf16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim96_fp16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim128_bf16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim128_fp16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim160_bf16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim160_fp16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim192_bf16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim192_fp16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim224_bf16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim224_fp16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim256_bf16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_hdim256_fp16_sm80.cu Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
flash_bwd_kernel.h Update cutlass to v3.4.0 2024-01-20 22:30:06 -08:00
flash_bwd_launch_template.h Preprocessor switches to control functionality (#788) 2024-01-29 20:44:23 -08:00
flash_bwd_preprocess_kernel.h Move bwd preprocess kernels to a separate file 2024-01-14 16:57:03 -08:00
flash_fwd_hdim32_bf16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim32_fp16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim64_bf16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim64_fp16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim96_bf16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim96_fp16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim128_bf16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim128_fp16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim160_bf16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim160_fp16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim192_bf16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim192_fp16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim224_bf16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim224_fp16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim256_bf16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_hdim256_fp16_sm80.cu Use generate_kernels.py script from Driss Guessous 2023-08-28 13:34:12 -07:00
flash_fwd_kernel.h Implement page KV cache 2024-01-22 22:47:30 -08:00
flash_fwd_launch_template.h Preprocessor switches to control functionality (#788) 2024-01-29 20:44:23 -08:00
flash_fwd_split_hdim32_bf16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim32_fp16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim64_bf16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim64_fp16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim96_bf16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim96_fp16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim128_bf16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim128_fp16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim160_bf16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim160_fp16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim192_bf16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim192_fp16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim224_bf16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim224_fp16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim256_bf16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash_fwd_split_hdim256_fp16_sm80.cu Implement splitKV attention 2023-08-29 00:58:29 -07:00
flash.h Implement page KV cache 2024-01-22 22:47:30 -08:00
generate_kernels.py Remove configure in bwd kernel launch 2024-01-21 15:28:33 -08:00
kernel_traits.h Use int64_t instead of uint32_t in kernel_traits.h 2024-01-22 22:39:29 -08:00
mask.h Refactor masking in fwd pass into 1 object 2024-01-20 17:39:53 -08:00
philox.cuh Change inline to __forceinline__, use __grid_constant__ param 2024-01-20 17:38:47 -08:00
rotary.h Move rotary device functions to a separate file 2024-01-20 18:01:18 -08:00
softmax.h Change inline to __forceinline__, use __grid_constant__ param 2024-01-20 17:38:47 -08:00
static_switch.h Preprocessor switches to control functionality (#788) 2024-01-29 20:44:23 -08:00
utils.h Move rotary device functions to a separate file 2024-01-20 18:01:18 -08:00