Commit Graph

6 Commits

Author | SHA1 | Message | Date
Woosuk Kwon | 928de46888 | Implement PagedAttention V2 (#1348) | 2023-10-16 00:59:57 -07:00
Zhuohan Li | 96853af5a8 | Optimize MQA Kernel (#452) | 2023-07-14 20:06:40 -04:00
Woosuk Kwon | e41f06702c | Add support for BLOOM (#331) | 2023-07-03 13:12:35 -07:00
Woosuk Kwon | 0f4b32199e | Support various block sizes & Change default block size to 16 (#38) | 2023-04-15 09:03:24 -07:00
Siyuan (Ryans) Zhuang | 21b3671bbc | Basic attention kernel that supports cached KV + (multi-)prompts (#24) | 2023-04-04 20:34:46 -07:00
Woosuk Kwon | 0deacbce6e | Implement single_query_cached_kv_attention kernel (#3) | 2023-03-01 15:02:19 -08:00