Edit README: support Turing (SM75)

Tri Dao 2022-06-04 16:06:48 -07:00
parent d380e87fb6
commit f2d8d4104e

README.md

@@ -9,7 +9,7 @@ Paper: https://arxiv.org/abs/2205.14135
## Alpha release (0.1).
-To compile (requiring CUDA 11, NVCC, and an Ampere GPU):
+To compile (requiring CUDA 11, NVCC, and a Turing or Ampere GPU):
```
python setup.py install
```
@@ -22,7 +22,7 @@ PYTHONPATH=$PWD python benchmarks/benchmark_flash_attention.py
```
FlashAttention currently supports:
-1. Ampere GPUs (e.g., A100, RTX 3090).
+1. Turing or Ampere GPUs (e.g., A100, RTX 3090, T4, RTX 2080).
2. fp16.
3. Head dimensions 16, 32, 64.
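
A minimal sketch (not part of the diff above) of checking that the current GPU meets the Turing/Ampere requirement and shaping inputs to match the supported dtype and head dimensions. The tensor shapes and names are illustrative assumptions; for the actual FlashAttention call, refer to benchmarks/benchmark_flash_attention.py in the repo.
```
# Hedged sketch, not from the repo: confirm the device is SM75 (Turing) or newer
# and build fp16 inputs with a supported head dimension.
import torch

major, minor = torch.cuda.get_device_capability()
assert (major, minor) >= (7, 5), "FlashAttention 0.1 requires a Turing (SM75) or Ampere (SM80/SM86) GPU"

# Illustrative shapes (assumptions): head_dim must be 16, 32, or 64, and inputs must be fp16.
batch, seqlen, n_heads, head_dim = 2, 1024, 16, 64
q = torch.randn(batch, seqlen, n_heads, head_dim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)
# See benchmarks/benchmark_flash_attention.py for the exact attention interface and arguments.
```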
@@ -30,7 +30,7 @@ Our tentative roadmap:
1. [Jun 2022] Make package pip-installable.
2. ~~[Jun 2022] Support SM86 GPUs (e.g., RTX 3080, 3090)~~[Done].
3. [Jun 2022] Refactor to use Cutlass.
-4. [Jun 2022] Support SM75 GPUs (e.g. T4).
+4. ~~[Jun 2022] Support SM75 GPUs (e.g. T4)~~[Done].
5. [Jun 2022] Support bf16.
6. [Jul 2022] Support head dimension 128.
7. [Jul 2022] Support SM70 GPUs (V100).