squall/flash-attention
csrc/flash_attn/src/fmha (at commit 215930bce3)
Latest commit 46fd2a20b2 by Tri Dao: "Support all head dims that are multiples of 8, up to 128" (2022-10-24 16:04:21 -07:00)
Name               Last commit message                                                Last updated
gemm.h             Refactor gemm_cl to template on either __half or __nv_bfloat16    2022-07-09 23:18:26 -07:00
gmem_tile.h        Support all head dims that are multiples of 8, up to 128          2022-10-24 16:04:21 -07:00
kernel_traits.h    Split bwd on the seqlen_q dimension                                2022-10-23 11:35:15 -07:00
mask.h             Rework dropout to decouple forward and backward                    2022-10-21 12:04:27 -07:00
smem_tile.h        Rework dropout to decouple forward and backward                    2022-10-21 12:04:27 -07:00
softmax.h          Rework dropout to decouple forward and backward                    2022-10-21 12:04:27 -07:00
utils.h            Refactor to template on __half, implement bf16 util functions     2022-07-09 23:18:26 -07:00