Update roadmap

Tri Dao 2023-02-09 12:21:16 -08:00
parent 06da275bcb
commit 2dc2a19589

@@ -65,14 +65,15 @@ FlashAttention currently supports:
 Our tentative roadmap:
 1. ~~[Jun 2022] Make package pip-installable~~[Done, thanks to lucidrains].
 2. ~~[Jun 2022] Support SM86 GPUs (e.g., RTX 3080, 3090)~~[Done].
-3. [Jun 2022] Refactor to use Cutlass.
-4. ~~[Jun 2022] Support SM75 GPUs (e.g. T4)~~[Done].
-5. ~~[Jun 2022] Support bf16~~[Done].
-6. ~~[Jul 2022] Implement cross-attention~~[Done].
-7. ~~[Jul 2022] Support head dimension 128~~[Done].
-8. [Jul 2022] Support SM70 GPUs (V100).
-9. ~~[Aug 2022] Fuse rotary embedding~~[Done].
-10. [Aug 2022] Support attention bias (e.g. ALiBi, relative positional encoding).
+3. ~~[Jun 2022] Support SM75 GPUs (e.g. T4)~~[Done].
+4. ~~[Jun 2022] Support bf16~~[Done].
+5. ~~[Jul 2022] Implement cross-attention~~[Done].
+6. ~~[Jul 2022] Support head dimension 128~~[Done].
+7. ~~[Aug 2022] Fuse rotary embedding~~[Done].
+8. [Apr 2023] Refactor to use Cutlass 3.x.
+9. [May 2023] Support attention bias (e.g. ALiBi, relative positional encoding).
+10. [Jun 2023] Support SM70 GPUs (V100).
+11. [Jun 2023] Support SM90 GPUs (H100).
 
 ## Speedup and Memory Savings
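
Item 9 of the updated roadmap refers to additive attention biases such as ALiBi. As a point of reference only, here is a minimal, non-fused PyTorch sketch of what an ALiBi-style bias looks like when added to attention scores; the function names and slope schedule are illustrative assumptions, not FlashAttention's API or its planned implementation:

```python
import math
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    # Geometric slope schedule from the ALiBi paper; assumes n_heads is a power of two.
    start = 2 ** (-8 / n_heads)
    return torch.tensor([start ** (i + 1) for i in range(n_heads)])

def attention_with_alibi_bias(q, k, v):
    # q, k, v: (batch, n_heads, seq_len, head_dim). Plain attention plus an
    # additive per-head bias that grows with key-query distance.
    n_heads, seq_len = q.shape[1], q.shape[2]
    scores = (q @ k.transpose(-2, -1)) / math.sqrt(q.shape[-1])
    pos = torch.arange(seq_len, device=q.device)
    distance = pos[None, :] - pos[:, None]  # (seq_len, seq_len), key index minus query index
    slopes = alibi_slopes(n_heads).to(q.device).view(1, n_heads, 1, 1)
    bias = slopes * distance.clamp(max=0)   # more negative the farther back the key is
    return torch.softmax(scores + bias, dim=-1) @ v
```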