Commit Graph

11 Commits

Author SHA1 Message Date
Tri Dao
8f4d82cf5e Update cutlass to v3.4.0 2024-01-20 22:30:06 -08:00
Tri Dao
6777336a1c Move masking to a separate file (mask.h) 2024-01-14 12:43:47 -08:00
Tri Dao
9448264ddd Remove seqq_parallel backward kernel that's not used 2024-01-14 12:25:49 -08:00
Tri Dao
a7b66ae25a Simplify writing softmax to gmem 2024-01-13 00:25:04 -08:00
Tri Dao
8d1b169ed1 Simplify SmemLayoutVtransposed in kernel_traits.h 2024-01-12 11:53:29 -08:00
Tri Dao
ccbb14f38e Implement rotary embedding in flash_attn_with_kvcache 2023-09-16 01:20:16 -07:00
Tri Dao
5953c4f58c Remove unused sdPsum in dot_do_o function 2023-09-03 20:44:07 -07:00
Sophia Wisdom
37e32febba Remove commented out code in bwd (#512) 2023-09-01 16:43:58 -07:00
* Remove lots of comments
* Remove unused traits
Tri Dao
b1fbbd8337 Implement splitKV attention 2023-08-29 00:58:29 -07:00
Tri Dao
dbd7923782 Prepare for Cutlass 3.2 2023-08-13 15:24:32 -07:00
Tri Dao
4f285b3547 FlashAttention-2 release 2023-07-17 06:21:34 -07:00