| Author | Commit | Message | Date |
|---|---|---|---|
| Woosuk Kwon | c9d5b6d4a8 | Replace FlashAttention with xformers (#70) | 2023-05-05 02:01:08 -07:00 |
| Siyuan (Ryans) Zhuang | e3cec88aa5 | Memcpy kernel for flash attention (#29): optimize; add benchmark; add assert; add test | 2023-04-10 18:22:49 -07:00 |
| Woosuk Kwon | 0f40557af6 | Implement block copy kernel to optimize beam search (#32) | 2023-04-07 17:45:07 -07:00 |
| Woosuk Kwon | 897cb2ae28 | Optimize data movement (#20) | 2023-04-02 00:30:17 -07:00 |
| Woosuk Kwon | 0deacbce6e | Implement single_query_cached_kv_attention kernel (#3) | 2023-03-01 15:02:19 -08:00 |