Tri Dao | 36bc29edf7 | Use int64_t instead of uint32_t in kernel_traits.h | 2024-01-22 22:39:29 -08:00
Tri Dao | 8f4d82cf5e | Update cutlass to v3.4.0 | 2024-01-20 22:30:06 -08:00
Tri Dao | 6777336a1c | Move masking to a separate file (mask.h) | 2024-01-14 12:43:47 -08:00
Tri Dao | 9448264ddd | Remove seqq_parallel backward kernel that's not used | 2024-01-14 12:25:49 -08:00
Tri Dao | a7b66ae25a | Simplify writing softmax to gmem | 2024-01-13 00:25:04 -08:00
Tri Dao | 8d1b169ed1 | Simplify SmemLayoutVtransposed in kernel_traits.h | 2024-01-12 11:53:29 -08:00
Tri Dao | ccbb14f38e | Implement rotary embedding in flash_attn_with_kvcache | 2023-09-16 01:20:16 -07:00
Tri Dao | 5953c4f58c | Remove unused sdPsum in dot_do_o function | 2023-09-03 20:44:07 -07:00
Sophia Wisdom | 37e32febba | Remove commented out code in bwd (#512) * Remove lots of comments * Remove unused traits | 2023-09-01 16:43:58 -07:00
Tri Dao | b1fbbd8337 | Implement splitKV attention | 2023-08-29 00:58:29 -07:00
Tri Dao | dbd7923782 | Prepare for Cutlass 3.2 | 2023-08-13 15:24:32 -07:00
Tri Dao | 4f285b3547 | FlashAttention-2 release | 2023-07-17 06:21:34 -07:00