| Author | Commit | Message | Date |
|---|---|---|---|
| Tri Dao | a157cc8c9b | [FT] Implement MQA/GQA | 2023-07-22 23:47:01 -07:00 |
| Tri Dao | 2800efc71f | [FT] rotary_cos/sin should have batch_size dimension | 2023-07-06 15:33:33 -07:00 |
| Tri Dao | 62e9814466 | [Rotary] Make sure frequency calculation is in fp32 | 2023-07-02 16:39:39 -07:00 |
| Tri Dao | 48bc6eacd6 | [Gen] Add rotary base as an argument to FT attention kernel | 2023-05-30 13:38:34 -07:00 |
| Tri Dao | be1afaa276 | [Gen, FT] Use fp32 accum for FMA | 2023-01-03 22:09:22 -08:00 |
| Tri Dao | f266fc7262 | [Gen, FT] Use tlength instead of params.timestep for rotary | 2023-01-03 17:46:55 -08:00 |
| Tri Dao | a01d1213d7 | [Gen] Add kernel from FasterTransformer for benchmarking | 2023-01-03 17:37:43 -08:00 |