zhaoyang-star
|
9090bf02e7
|
Support FP8-E5M2 KV Cache (#2279)
Co-authored-by: zhaoyang <zhao.yang16@zte.com.cn>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
|
2024-01-28 16:43:54 -08:00 |
|
wbn
|
dacaf5a400
|
Replace head_mapping params with num_kv_heads to attention kernel. (#1997)
Co-authored-by: wangguoya <wangguoya@baidu.com>
Co-authored-by: Yang Zhao <zhaoyangstar@foxmail.com>
|
2023-12-10 10:12:53 -08:00 |
|
Yanming W
|
e0c6f556e8
|
[Build] Avoid building too many extensions (#1624)
|
2023-11-23 16:31:19 -08:00 |
|
Woosuk Kwon
|
928de46888
|
Implement PagedAttention V2 (#1348)
|
2023-10-16 00:59:57 -07:00 |
|