Tri Dao | c4b9015d74 | Add benchmark_gemm.py | 2024-07-27 11:13:18 -07:00
Tri Dao | ffc8682dd5 | Add benchmarking code for Alibi (from Sanghun Cho) | 2024-01-23 19:00:49 -08:00
Tri Dao | b4bf9cc1f3 | Fix performance regression with causal | 2023-11-26 19:07:25 -08:00
Aman Gupta Karmani | b4b6e90334 | add benchmark for xformers fa2 wrapper (#492) | 2023-08-25 14:10:05 -07:00
Tri Dao | 9e5e8bc91e | Change causal mask to be aligned to bottom-right instead of top-left | 2023-08-24 23:41:07 -07:00
Tri Dao | 60499abcfd | [Benchmark] Add script to benchmark FlashAttention | 2023-07-28 00:26:52 -10:00
Tri Dao | 4f285b3547 | FlashAttention-2 release | 2023-07-17 06:21:34 -07:00
Tri Dao | 4360cfc6a8 | [Triton] Fix benchmark_causal.py | 2023-03-22 01:34:38 -07:00
Tri Dao | 5d079fdd7a | [Triton] Fix benchmark_causal, mention Triton version | 2023-03-22 00:51:16 -07:00
Tri Dao | b0c0db81f6 | Implement FlashAttention in Triton | 2022-10-30 18:09:11 -07:00
Tri Dao | ed553e9238 | Add Megatron attention implementation for benchmarking | 2022-10-23 23:04:16 -07:00
Tri Dao | 50ca23488d | Add Triton implementation for benchmarking | 2022-10-23 17:25:56 -07:00
Tri Dao | fb88e5e4b3 | Move benchmark utils, support AMP | 2022-10-23 12:50:00 -07:00
Tri Dao | 6c3a8c65af | Implement cross attention | 2022-07-03 17:48:12 -07:00
Tri Dao | 5a61cb7729 | Rename src -> flash_attn | 2022-06-01 18:50:26 -07:00
Tri Dao | 67c3779598 | Reorganize directories, add banner figure | 2022-05-29 15:34:22 -07:00
Tri Dao | 9dbc491aa5 | Rename, add benchmarking script | 2022-05-26 13:57:38 -07:00