flash-attention/csrc/flash_attn/src/fmha
| File | Last commit message | Last commit date |
|---|---|---|
| gemm.h | Refactor gemm_cl to template on either __half or __nv_bfloat16 | 2022-07-09 23:18:26 -07:00 |
| gmem_tile.h | Support all head dims that are multiples of 8, up to 128 | 2022-10-24 16:04:21 -07:00 |
| kernel_traits.h | Split bwd on the seqlen_q dimension | 2022-10-23 11:35:15 -07:00 |
| mask.h | Rework dropout to decouple forward and backward | 2022-10-21 12:04:27 -07:00 |
| smem_tile.h | Rework dropout to decouple forward and backward | 2022-10-21 12:04:27 -07:00 |
| softmax.h | Rework dropout to decouple forward and backward | 2022-10-21 12:04:27 -07:00 |
| utils.h | Refactor to template on __half, implement bf16 util functions | 2022-07-09 23:18:26 -07:00 |
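
The gemm.h and utils.h entries describe templating the GEMM and utility code on the element type (__half vs __nv_bfloat16) instead of hard-coding fp16. Below is a minimal sketch of that pattern, not the repo's actual gemm_cl code: the `Converter` struct and `downcast` kernel are hypothetical names used here for illustration only.

```cpp
// Minimal sketch of dtype templating: one kernel, two instantiations.
// Compile with: nvcc -arch=sm_80 example.cu
#include <cuda_fp16.h>
#include <cuda_bf16.h>

// Per-type conversion from the fp32 accumulator to the storage type.
template <typename Elt> struct Converter;

template <> struct Converter<__half> {
    static __device__ __half to(float x) { return __float2half(x); }
};

template <> struct Converter<__nv_bfloat16> {
    static __device__ __nv_bfloat16 to(float x) { return __float2bfloat16(x); }
};

// Toy "epilogue": cast an fp32 accumulator tile down to Elt.
template <typename Elt>
__global__ void downcast(const float* acc, Elt* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = Converter<Elt>::to(acc[i]);
}

int main() {
    const int n = 128;
    float* d_acc; __half* d_h; __nv_bfloat16* d_b;
    cudaMalloc(&d_acc, n * sizeof(float));
    cudaMalloc(&d_h, n * sizeof(__half));
    cudaMalloc(&d_b, n * sizeof(__nv_bfloat16));
    downcast<__half><<<1, n>>>(d_acc, d_h, n);        // fp16 instantiation
    downcast<__nv_bfloat16><<<1, n>>>(d_acc, d_b, n); // bf16 instantiation
    cudaDeviceSynchronize();
    cudaFree(d_acc); cudaFree(d_h); cudaFree(d_b);
    return 0;
}
```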
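
The "decouple forward and backward" dropout rework touching mask.h, smem_tile.h, and softmax.h suggests the two passes no longer share a stored mask. A standard way to achieve that, sketched below under the assumption that this is the mechanism used, is a counter-based RNG (Philox): forward and backward each re-derive the identical mask from the same (seed, offset), so nothing is materialized. The `fwd`/`bwd` kernels here are illustrative, not the repo's kernels.

```cpp
// Sketch: dropout regenerated independently in forward and backward
// from the same Philox (seed, subsequence, offset) state.
#include <curand_kernel.h>

__device__ float dropout_scale(curandStatePhilox4_32_10_t* st, float p_drop) {
    // Keep with probability (1 - p_drop); scale kept values by 1/(1 - p_drop).
    float r = curand_uniform(st);
    return (r > p_drop) ? 1.0f / (1.0f - p_drop) : 0.0f;
}

__global__ void fwd(const float* x, float* y, int n, float p,
                    unsigned long long seed, unsigned long long offset) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    curandStatePhilox4_32_10_t st;
    curand_init(seed, i, offset, &st);   // per-element subsequence
    y[i] = x[i] * dropout_scale(&st, p);
}

__global__ void bwd(const float* dy, float* dx, int n, float p,
                    unsigned long long seed, unsigned long long offset) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    curandStatePhilox4_32_10_t st;
    curand_init(seed, i, offset, &st);   // same state as forward
    dx[i] = dy[i] * dropout_scale(&st, p); // identical mask, regenerated
}
```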