squall/flash-attention
Commit Graph (3 Commits)
| Author | SHA1 | Message | Date |
|---|---|---|---|
| BoxiangW | e07aa036db | Support flash attention 2 with causal masking when KV's seq length is longer than Q's seq length. (#436) | 2023-08-24 16:42:34 -07:00 |
| Tri Dao | a4f148b6ab | Fix masking of bwd when seqlen is not divisible by 128 | 2023-07-31 17:46:34 -07:00 |
| Tri Dao | 4f285b3547 | FlashAttention-2 release | 2023-07-17 06:21:34 -07:00 |
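
The top commit (#436) concerns causal masking when the key/value sequence is longer than the query sequence, as happens when decoding against a KV cache. As a rough illustration only (not the repository's kernel code), here is a minimal PyTorch sketch of a bottom-right-aligned causal mask, assuming that is the convention the commit implements: query `i` may attend to key `j` iff `j <= i + (seqlen_k - seqlen_q)`.

```python
import torch

def causal_mask(seqlen_q: int, seqlen_k: int) -> torch.Tensor:
    # Bottom-right-aligned causal mask: query i may attend to key j
    # iff j <= i + (seqlen_k - seqlen_q). When seqlen_q == seqlen_k
    # this reduces to the usual lower-triangular mask.
    i = torch.arange(seqlen_q).unsqueeze(1)  # shape (seqlen_q, 1)
    j = torch.arange(seqlen_k).unsqueeze(0)  # shape (1, seqlen_k)
    return j <= i + (seqlen_k - seqlen_q)    # True = may attend

# Example: 2 new queries attending over 5 cached keys.
print(causal_mask(2, 5).int())
# tensor([[1, 1, 1, 1, 0],
#         [1, 1, 1, 1, 1]])
```

Under this convention, the last query sees every key, and earlier queries are cut off from the tail of the key sequence; with `seqlen_q == seqlen_k` the offset is zero and the mask is the standard lower-triangular one.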