Tri Dao | 3f7d5786ba | Pass alibi slopes to flash_attn_with_kvcache during generation | 2023-12-24 20:31:59 -08:00
Tri Dao | 2c7d7b7396 | Implement norm head for Baichuan2 | 2023-12-22 16:55:40 -08:00
Tri Dao | c3b2196652 | Add Alibi to MHA, test with Baichuan-13B | 2023-12-21 22:49:55 -08:00
Tri Dao | dfe29f5e2b | [Gen] Don't use ft_attention, use flash_attn_with_kvcache instead | 2023-09-18 15:29:06 -07:00
Tri Dao | 8a733cbd53 | [Gen] Fix calling update_graph_cache in tests | 2023-09-10 17:22:37 -07:00
Tri Dao | a86442f0f3 | [Gen] Use flash_attn_with_kvcache in generation | 2023-09-07 08:24:43 -07:00
Tri Dao | 9795159082 | [Rotary] Set device before launching Triton kernel to avoid error | 2023-09-05 21:29:03 -07:00
Tri Dao | 913922cac5 | [Gen] Refactor decoding function | 2023-09-04 17:01:38 -07:00
Tri Dao | 798858f9f1 | Fix test_baichuan | 2023-09-03 21:01:37 -07:00
GAOXinyu | a8c35b4f57 | FEAT: add codes which supporting for baichuan-inc/Baichuan-7B (#425) | 2023-08-21 11:05:06 -07:00