Tri Dao
|
0c04943fa2
|
Require CUDA 11.6+, clean up setup.py
|
2023-09-03 21:24:56 -07:00 |
|
Jeffrey Quesnelle
|
1d817a8ffc
|
fix citation in README (#501)
|
2023-08-29 11:15:33 -07:00 |
|
Tri Dao
|
45ba93cd96
|
Add newlines to README
|
2023-08-24 23:54:13 -07:00 |
|
Tri Dao
|
9e5e8bc91e
|
Change causal mask to be aligned to bottom-right instead of top-left
|
2023-08-24 23:41:07 -07:00 |
|
Tri Dao
|
d30f2e1cd5
|
Bump to v2.0.4
|
2023-08-01 09:01:07 -07:00 |
|
Ian Timmis
|
cbf982afa5
|
README syntax highlighting (#365)
* README syntax highlighting
Adds syntax highlighting to README
* Update README.md
|
2023-07-23 00:21:30 -07:00 |
|
Tri Dao
|
d1a3b52f17
|
Add instruction about limiting number of ninja jobs
|
2023-07-17 23:17:47 -07:00 |
|
Tri Dao
|
b4cc152e97
|
Make sure dout is contiguous
|
2023-07-17 21:54:44 -07:00 |
|
Tri Dao
|
4f285b3547
|
FlashAttention-2 release
|
2023-07-17 06:21:34 -07:00 |
|
Tri Dao
|
ce68305c84
|
Update installation instruction
|
2023-05-25 16:52:52 -07:00 |
|
Tri Dao
|
f0c40b7ddb
|
Recommend Nvidia's Pytorch container
|
2023-05-19 09:41:14 -07:00 |
|
Tri Dao
|
40a25c8ee7
|
Update roadmap
|
2023-05-17 08:32:26 -07:00 |
|
Anthony Hu
|
d63cfc3551
|
Use pyproject.toml to specify build dependencies
|
2023-04-27 11:51:52 +01:00 |
|
Tri Dao
|
74af023316
|
Bump version to 1.0.0
|
2023-04-11 23:32:35 -07:00 |
|
Tri Dao
|
1b18f1b7a1
|
Support H100
|
2023-03-15 14:59:02 -07:00 |
|
Tri Dao
|
f28d61cb2a
|
Update README on requirements (nvcc and Pytorch)
|
2023-03-13 12:48:07 -07:00 |
|
Tri Dao
|
57ee618170
|
Merge pull request #94 from calebthomas259/main
Add a simple tutorial to README.md
|
2023-02-14 19:03:08 -08:00 |
|
Tri Dao
|
2dc2a19589
|
Update roadmap
|
2023-02-09 12:21:30 -08:00 |
|
Caleb Thomas
|
c9a649805b
|
Add a simple tutorial to README.md
|
2022-12-27 14:13:59 +08:00 |
|
Tri Dao
|
4a6eaa9f27
|
Update configs, add results
|
2022-11-29 04:46:43 -08:00 |
|
Tri Dao
|
45bcf37b97
|
[Docs] Capitalize the bibtex citation
|
2022-11-22 02:12:22 -08:00 |
|
Tri Dao
|
4040256b5e
|
Update pip install instructions, bump to 0.2
|
2022-11-15 14:10:48 -08:00 |
|
Tri Dao
|
2e33fc8e36
|
Add GPT and ViT models
|
2022-11-13 22:30:23 -08:00 |
|
Tri Dao
|
3dda4f76de
|
Update README
|
2022-11-13 16:52:40 -08:00 |
|
Tri Dao
|
46fd2a20b2
|
Support all head dims that are multiples of 8, up to 128
|
2022-10-24 16:04:21 -07:00 |
|
Tri Dao
|
2ed471ecc4
|
Add tests for numerical error
|
2022-07-22 17:54:09 -04:00 |
|
Tri Dao
|
42f54d8840
|
Edit mention of Triton implementation
Phil Tillet suggests calling it "experimental".
|
2022-07-11 17:02:29 -07:00 |
|
Tri Dao
|
4577151ff8
|
Link to Triton implementation
|
2022-07-11 16:01:43 -07:00 |
|
Tri Dao
|
d1fc80a3bb
|
Link to IEEE Spectrum article on MLPerf
|
2022-07-10 12:11:46 -07:00 |
|
Tri Dao
|
1bbebccc0a
|
Edit README to mention bf16 support
|
2022-07-09 23:34:29 -07:00 |
|
Tri Dao
|
de19de7ab1
|
Implement for bf16
|
2022-07-09 23:31:56 -07:00 |
|
Tri Dao
|
6c3a8c65af
|
Implement cross attention
|
2022-07-03 17:48:12 -07:00 |
|
Tri Dao
|
450b64fe44
|
Add README section on issues
|
2022-06-27 13:50:16 -07:00 |
|
Dan Fu
|
765741c1ee
|
More explanation
|
2022-06-14 11:55:14 -07:00 |
|
Dan Fu
|
2d5b2483b8
|
Speedup graph for A100, d128
|
2022-06-14 11:54:16 -07:00 |
|
Tri Dao
|
d3e6440958
|
Implement bwd for head dim 128
|
2022-06-11 17:52:36 -07:00 |
|
Dan Fu
|
0a398dfc37
|
Broken link
|
2022-06-04 17:28:45 -07:00 |
|
Dan Fu
|
bd60750e0b
|
T4
|
2022-06-04 17:27:51 -07:00 |
|
Tri Dao
|
f2d8d4104e
|
Edit README: support Turing (SM75)
|
2022-06-04 16:06:48 -07:00 |
|
Dan Fu
|
ad6c694bb3
|
3090 speedup
|
2022-06-01 20:07:00 -07:00 |
|
Tri Dao
|
5a61cb7729
|
Rename src -> flash_attn
|
2022-06-01 18:50:26 -07:00 |
|
Tri Dao
|
c41479d66d
|
Support SM86 GPUs
|
2022-06-01 18:49:47 -07:00 |
|
Dan Fu
|
4b7cfb5f45
|
Citation
|
2022-05-30 13:29:04 -07:00 |
|
Tri Dao
|
a78745189a
|
Add paper arXiv link
|
2022-05-29 18:15:43 -07:00 |
|
Tri Dao
|
d9fff84bd0
|
Edit roadmap
|
2022-05-29 15:44:18 -07:00 |
|
Tri Dao
|
e4ffe5d50e
|
Convert banner figure from pdf to jpg
|
2022-05-29 15:39:17 -07:00 |
|
Tri Dao
|
67c3779598
|
Reorganize directories, add banner figure
|
2022-05-29 15:34:22 -07:00 |
|
Dan Fu
|
7025a092d1
|
Make png images into jpg for dark mode
|
2022-05-28 22:46:49 +01:00 |
|
Dan Fu
|
4decc3c166
|
README typo
|
2022-05-27 22:38:20 +01:00 |
|
Dan Fu
|
dc6d130088
|
Add speedup to README
Update images
Update images
Update description
|
2022-05-27 22:36:56 +01:00 |
|