Ying Zhang | 1c9717d699 | address comments | 2024-09-19 22:50:59 -07:00
Ying Zhang | dff976a84a | fixes | 2024-09-16 15:44:44 -07:00
Ying Zhang | 7b4e68e04f | hopper local attention | 2024-09-16 14:59:22 -07:00
Ying Zhang | db80387343 | Add seqused_q in fwd / bwd and seqused_k in bwd. | 2024-09-16 14:24:11 -07:00
Cameron Shinn | 3cea2fb6ee | Add ArchTag to pre/postprocess bwd kernels (#1180): Add ArchTag to pre/postprocess bwd kernels; Type-dependent CC check for bwd pre/postprocess; Fix CC >= 90 for bwd postprocess. Co-authored-by: Cameron Shinn <cshinn@nvidia.com> | 2024-08-28 00:20:47 -07:00
Tri Dao | bafe253042 | [FA3] Bwd | 2024-08-01 01:57:06 -07:00
Cameron Shinn | cb516f855b | Remove torchlib dependency from cpp files (#1083) | 2024-07-22 16:47:09 -07:00
Tri Dao | 7f67966cc7 | FA3 initial code release | 2024-07-11 09:53:36 -07:00