Siyuan (Ryans) Zhuang
|
21b3671bbc
|
Basic attention kernel that supports cached KV + (multi-)prompts (#24)
|
2023-04-04 20:34:46 -07:00 |
|
Woosuk Kwon
|
897cb2ae28
|
Optimize data movement (#20)
|
2023-04-02 00:30:17 -07:00 |
|
Woosuk Kwon
|
a1b3de86cd
|
Refactor the test code for attention kernels (#13)
|
2023-03-29 18:59:27 -07:00 |
|
Woosuk Kwon
|
3e9f991d6a
|
Use FlashAttention for multi_query_kv_attention (#4)
|
2023-03-01 21:13:08 -08:00 |
|
Woosuk Kwon
|
0deacbce6e
|
Implement single_query_cached_kv_attention kernel (#3)
|
2023-03-01 15:02:19 -08:00 |
|