Commit Graph

62 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Tri Dao | 2406f28805 | Enable headdim 256 backward on consumer GPUs (Ampere, Ada) | 2024-02-21 15:56:19 -08:00 |
| Tao He | 204c3c6d1b | Fixes an error in comment (#785); Signed-off-by: Tao He `sighingnow@gmail.com` | 2024-01-23 12:38:29 -08:00 |
| Tri Dao | 54e80a3829 | Implement paged KV cache; Co-authored-by: ljss `450993438@qq.com` | 2024-01-22 22:47:30 -08:00 |
| Erich Schubert | 99ea4baa1d | Typo in README (#760) | 2024-01-08 09:59:00 -08:00 |
| Tri Dao | 732654583c | Implement deterministic backward (thanks to Meituan) | 2023-12-23 17:57:36 -08:00 |
| Tri Dao | 50d144c906 | Mention Alibi in README | 2023-12-21 23:48:16 -08:00 |
| Tri Dao | 7f31e7c16a | Bump to v2.3.2 | 2023-10-08 17:21:29 -07:00 |
| Tri Dao | 5a83425442 | Change constexpr int to constexpr static int | 2023-10-08 16:26:33 -07:00 |
| Tri Dao | 3a9fe7b0fa | Add change log | 2023-10-05 14:19:08 -07:00 |
| Tri Dao | aa4fd2d166 | Clarify that Windows is not supported right now | 2023-10-05 14:00:45 -07:00 |
| Tri Dao | 0c04943fa2 | Require CUDA 11.6+, clean up setup.py | 2023-09-03 21:24:56 -07:00 |
| Jeffrey Quesnelle | 1d817a8ffc | fix citation in README (#501) | 2023-08-29 11:15:33 -07:00 |
| Tri Dao | 45ba93cd96 | Add newlines to README | 2023-08-24 23:54:13 -07:00 |
| Tri Dao | 9e5e8bc91e | Change causal mask to be aligned to bottom-right instead of top-left | 2023-08-24 23:41:07 -07:00 |
| Tri Dao | d30f2e1cd5 | Bump to v2.0.4 | 2023-08-01 09:01:07 -07:00 |
| Ian Timmis | cbf982afa5 | README syntax highlighting (#365) | 2023-07-23 00:21:30 -07:00 |
| Tri Dao | d1a3b52f17 | Add instruction about limiting number of ninja jobs | 2023-07-17 23:17:47 -07:00 |
| Tri Dao | b4cc152e97 | Make sure dout is contiguous | 2023-07-17 21:54:44 -07:00 |
| Tri Dao | 4f285b3547 | FlashAttention-2 release | 2023-07-17 06:21:34 -07:00 |
| Tri Dao | ce68305c84 | Update installation instruction | 2023-05-25 16:52:52 -07:00 |
| Tri Dao | f0c40b7ddb | Recommend Nvidia's Pytorch container | 2023-05-19 09:41:14 -07:00 |
| Tri Dao | 40a25c8ee7 | Update roadmap | 2023-05-17 08:32:26 -07:00 |
| Anthony Hu | d63cfc3551 | Use pyproject.toml to specify build dependencies | 2023-04-27 11:51:52 +01:00 |
| Tri Dao | 74af023316 | Bump version to 1.0.0 | 2023-04-11 23:32:35 -07:00 |
| Tri Dao | 1b18f1b7a1 | Support H100 | 2023-03-15 14:59:02 -07:00 |
| Tri Dao | f28d61cb2a | Update README on requirements (nvcc and Pytorch) | 2023-03-13 12:48:07 -07:00 |
| Tri Dao | 57ee618170 | Merge pull request #94 from calebthomas259/main (Add a simple tutorial to README.md) | 2023-02-14 19:03:08 -08:00 |
| Tri Dao | 2dc2a19589 | Update roadmap | 2023-02-09 12:21:30 -08:00 |
| Caleb Thomas | c9a649805b | Add a simple tutorial to README.md | 2022-12-27 14:13:59 +08:00 |
| Tri Dao | 4a6eaa9f27 | Update configs, add results | 2022-11-29 04:46:43 -08:00 |
| Tri Dao | 45bcf37b97 | [Docs] Capitalize the bibtex citation | 2022-11-22 02:12:22 -08:00 |
| Tri Dao | 4040256b5e | Update pip install instructions, bump to 0.2 | 2022-11-15 14:10:48 -08:00 |
| Tri Dao | 2e33fc8e36 | Add GPT and ViT models | 2022-11-13 22:30:23 -08:00 |
| Tri Dao | 3dda4f76de | Update README | 2022-11-13 16:52:40 -08:00 |
| Tri Dao | 46fd2a20b2 | Support all head dims that are multiples of 8, up to 128 | 2022-10-24 16:04:21 -07:00 |
| Tri Dao | 2ed471ecc4 | Add tests for numerical error | 2022-07-22 17:54:09 -04:00 |
| Tri Dao | 42f54d8840 | Edit mention of Triton implementation (Phil Tillet suggests calling it "experimental") | 2022-07-11 17:02:29 -07:00 |
| Tri Dao | 4577151ff8 | Link to Triton implementation | 2022-07-11 16:01:43 -07:00 |
| Tri Dao | d1fc80a3bb | Link to IEEE Spectrum article on MLPerf | 2022-07-10 12:11:46 -07:00 |
| Tri Dao | 1bbebccc0a | Edit README to mention bf16 support | 2022-07-09 23:34:29 -07:00 |
| Tri Dao | de19de7ab1 | Implement for bf16 | 2022-07-09 23:31:56 -07:00 |
| Tri Dao | 6c3a8c65af | Implement cross attention | 2022-07-03 17:48:12 -07:00 |
| Tri Dao | 450b64fe44 | Add README section on issues | 2022-06-27 13:50:16 -07:00 |
| Dan Fu | 765741c1ee | More explanation | 2022-06-14 11:55:14 -07:00 |
| Dan Fu | 2d5b2483b8 | Speedup graph for A100, d128 | 2022-06-14 11:54:16 -07:00 |
| Tri Dao | d3e6440958 | Implement bwd for head dim 128 | 2022-06-11 17:52:36 -07:00 |
| Dan Fu | 0a398dfc37 | Broken link | 2022-06-04 17:28:45 -07:00 |
| Dan Fu | bd60750e0b | T4 | 2022-06-04 17:27:51 -07:00 |
| Tri Dao | f2d8d4104e | Edit README: support Turing (SM75) | 2022-06-04 16:06:48 -07:00 |
| Dan Fu | ad6c694bb3 | 3090 speedup | 2022-06-01 20:07:00 -07:00 |
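
One semantic change in the log deserves a note: commit 9e5e8bc91e aligned the causal mask to the bottom-right of the attention matrix instead of the top-left when the query and key sequence lengths differ. The sketch below is illustrative only (the `causal_mask` helper and its `align` parameter are hypothetical, not the library's API); it shows what the two conventions allow.

```python
def causal_mask(seqlen_q, seqlen_k, align="bottom_right"):
    """Boolean mask: entry [i][j] is True when query i may attend to key j.

    Bottom-right alignment (the behavior after commit 9e5e8bc91e) pins the
    diagonal to the *last* query and key, so during incremental decoding a
    single new query sees every cached key. Top-left alignment pins the
    diagonal to the first query/key instead. This helper is a sketch, not
    FlashAttention's actual implementation.
    """
    offset = seqlen_k - seqlen_q if align == "bottom_right" else 0
    return [[j <= i + offset for j in range(seqlen_k)]
            for i in range(seqlen_q)]
```

With seqlen_q=1 and seqlen_k=4 (one new token against a KV cache of four), bottom-right alignment lets the query attend to all four keys, while top-left alignment would restrict it to the first key alone; when seqlen_q equals seqlen_k the two conventions coincide.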