flash-attention/csrc/flash_attn/src/fmha
| File | Last commit message | Last commit date |
|---|---|---|
| gemm.h | Refactor gemm_cl to template on either __half or __nv_bfloat16 | 2022-07-09 23:18:26 -07:00 |
| gmem_tile.h | Support all head dims that are multiples of 8, up to 128 | 2022-10-24 16:04:21 -07:00 |
| kernel_traits.h | Split bwd on the seqlen_q dimension | 2022-10-23 11:35:15 -07:00 |
| mask.h | Rework dropout to decouple forward and backward | 2022-10-21 12:04:27 -07:00 |
| smem_tile.h | Rework dropout to decouple forward and backward | 2022-10-21 12:04:27 -07:00 |
| softmax.h | Rework dropout to decouple forward and backward | 2022-10-21 12:04:27 -07:00 |
| utils.h | Refactor to template on __half, implement bf16 util functions | 2022-07-09 23:18:26 -07:00 |
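
The gemm.h and utils.h entries describe templating the GEMM and utility code on the element type (__half vs __nv_bfloat16) instead of hard-coding fp16. Below is a minimal sketch of that pattern, not the repo's actual gemm_cl code: the `Converter` struct and `downcast` kernel are hypothetical names used here for illustration only.

```cpp
// Minimal sketch of dtype templating: one kernel, two instantiations.
// Compile with: nvcc -arch=sm_80 example.cu
#include <cuda_fp16.h>
#include <cuda_bf16.h>

// Per-type conversion from the fp32 accumulator to the storage type.
template <typename Elt> struct Converter;

template <> struct Converter<__half> {
    static __device__ __half to(float x) { return __float2half(x); }
};

template <> struct Converter<__nv_bfloat16> {
    static __device__ __nv_bfloat16 to(float x) { return __float2bfloat16(x); }
};

// Toy "epilogue": cast an fp32 accumulator tile down to Elt.
template <typename Elt>
__global__ void downcast(const float* acc, Elt* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = Converter<Elt>::to(acc[i]);
}

int main() {
    const int n = 128;
    float* d_acc; __half* d_h; __nv_bfloat16* d_b;
    cudaMalloc(&d_acc, n * sizeof(float));
    cudaMalloc(&d_h, n * sizeof(__half));
    cudaMalloc(&d_b, n * sizeof(__nv_bfloat16));
    downcast<__half><<<1, n>>>(d_acc, d_h, n);        // fp16 instantiation
    downcast<__nv_bfloat16><<<1, n>>>(d_acc, d_b, n); // bf16 instantiation
    cudaDeviceSynchronize();
    cudaFree(d_acc); cudaFree(d_h); cudaFree(d_b);
    return 0;
}
```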
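
The "decouple forward and backward" dropout rework touching mask.h, smem_tile.h, and softmax.h suggests the two passes no longer share a stored mask. A standard way to achieve that, sketched below under the assumption that this is the mechanism used, is a counter-based RNG (Philox): forward and backward each re-derive the identical mask from the same (seed, offset), so nothing is materialized. The `fwd`/`bwd` kernels here are illustrative, not the repo's kernels.

```cpp
// Sketch: dropout regenerated independently in forward and backward
// from the same Philox (seed, subsequence, offset) state.
#include <curand_kernel.h>

__device__ float dropout_scale(curandStatePhilox4_32_10_t* st, float p_drop) {
    // Keep with probability (1 - p_drop); scale kept values by 1/(1 - p_drop).
    float r = curand_uniform(st);
    return (r > p_drop) ? 1.0f / (1.0f - p_drop) : 0.0f;
}

__global__ void fwd(const float* x, float* y, int n, float p,
                    unsigned long long seed, unsigned long long offset) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    curandStatePhilox4_32_10_t st;
    curand_init(seed, i, offset, &st);   // per-element subsequence
    y[i] = x[i] * dropout_scale(&st, p);
}

__global__ void bwd(const float* dy, float* dx, int n, float p,
                    unsigned long long seed, unsigned long long offset) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    curandStatePhilox4_32_10_t st;
    curand_init(seed, i, offset, &st);   // same state as forward
    dx[i] = dy[i] * dropout_scale(&st, p); // identical mask, regenerated
}
```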