| Tri Dao | 92dd5703ec | Bump to v2.3.6 | 2023-11-27 16:23:39 -08:00 |
| Tri Dao | 23b77c8148 | Bump to v2.3.5 | 2023-11-26 19:08:28 -08:00 |
| Tri Dao | 2c3baba4a6 | Bump to v2.3.4 | 2023-11-19 23:21:31 -08:00 |
| Tri Dao | 83aef842be | Bump to v2.3.3 | 2023-10-24 00:24:07 -07:00 |
| Tri Dao | 7f31e7c16a | Bump to v2.3.2 | 2023-10-08 17:21:29 -07:00 |
| Tri Dao | 5e525a8dc8 | [CI] Use official Pytorch 2.1, add CUDA 11.8 for Pytorch 2.1 | 2023-10-03 22:20:30 -07:00 |
| Tri Dao | 21c3b0d8f6 | Bump to v2.3.1 | 2023-10-03 19:56:45 -07:00 |
| Tri Dao | 601b4dc48d | Bump to v2.3.0 | 2023-09-26 22:08:29 -07:00 |
| Tri Dao | 0a1d03c7ea | Bump to v2.2.5 | 2023-09-24 00:54:03 -07:00 |
| Tri Dao | bff3147175 | Re-enable compilation for Hopper | 2023-09-21 23:55:25 -07:00 |
| Tri Dao | 229080b9d2 | Bump to v2.2.4 | 2023-09-20 23:39:38 -07:00 |
| Tri Dao | 799f56fa90 | Don't compile for Pytorch 2.1 on CUDA 12.1 due to nvcc segfaults | 2023-09-17 22:15:38 -07:00 |
| Tri Dao | c984208ddb | Set block size to 64 x 64 for kvcache to avoid nvcc segfaults | 2023-09-17 16:14:58 -07:00 |
| Tri Dao | 8c8b4d36e1 | Bump to v2.2.3 | 2023-09-16 01:47:01 -07:00 |
| Tri Dao | 08c295c043 | Bump to v2.2.2 | 2023-09-10 23:48:12 -07:00 |
| Tri Dao | a1576ad1e8 | Bump to v2.2.1 | 2023-09-06 02:19:55 -07:00 |
| Tri Dao | 6d673cd961 | Bump to v2.2.0 | 2023-09-05 11:34:13 -07:00 |
| Tri Dao | 37c6e05406 | Implement flash_attn_with_kvcache | 2023-09-04 00:11:44 -07:00 |
| Tri Dao | 4976650f74 | Set single threaded compilation for CUDA 12.2 so CI doesn't OOM | 2023-09-03 23:42:55 -07:00 |
| Tri Dao | 6a89b2f121 | Remove constexpr in launch template to fix CI compilation | 2023-09-03 22:59:41 -07:00 |
| Tri Dao | 97ba7a62e9 | Try switching back to Cutlass 3.2.0 | 2023-09-03 22:45:35 -07:00 |
| Tri Dao | 1dc1b6c8f2 | Bump to v2.1.2 | 2023-09-03 22:23:05 -07:00 |
| Tri Dao | 757058d4d3 | Update Cutlass to v3.2.0 | 2023-08-27 23:47:28 -07:00 |
| Tri Dao | 9e5e8bc91e | Change causal mask to be aligned to bottom-right instead of top-left | 2023-08-24 23:41:07 -07:00 |
| Tri Dao | 6711b3bc40 | Bump version to 2.0.9 | 2023-08-22 00:21:14 -07:00 |
| Tri Dao | f1a73d0740 | Run isort and black on python files | 2023-08-18 14:22:11 -07:00 |
| Tri Dao | c65b5106ac | Fix Bwd NaN for varlen when seqlen_q >> seqlen_k and causal | 2023-08-16 15:12:36 -07:00 |
| Tri Dao | c60851a825 | Bump to v2.0.7 | 2023-08-14 14:55:35 -07:00 |
| Tri Dao | f8dccfc90a | [CI] Fix MATRIX_CUDA_VERSION check | 2023-08-14 10:27:26 -07:00 |
| Tri Dao | 9c531bdc0a | Use single thread compilation for cuda12.1, torch2.1 to avoid OOM CI | 2023-08-14 10:03:31 -07:00 |
| Tri Dao | 67ae6fd74b | Bump to v2.0.6 | 2023-08-13 16:52:48 -07:00 |
| Tri Dao | c5e87b11e9 | Bump to v2.0.5 | 2023-08-13 13:55:04 -07:00 |
| Tri Dao | d30f2e1cd5 | Bump to v2.0.4 | 2023-08-01 09:01:07 -07:00 |
| Tri Dao | a4e5d1eddd | Bump to v2.0.3 | 2023-07-31 17:49:23 -07:00 |
| Kirthi Shankar Sivamani | 32a953f486 | Request for v2.0.2 (#388): Bump version to 2.0.2; Update version in Dockerfile. Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com> | 2023-07-28 02:46:03 -07:00 |
| Tri Dao | b252072409 | Bump to v2.0.1 | 2023-07-23 12:33:42 -10:00 |
| Tri Dao | 4f285b3547 | FlashAttention-2 release | 2023-07-17 06:21:34 -07:00 |
| Tri Dao | 6d48e14a6c | Bump to v1.0.9 | 2023-07-17 03:16:40 -07:00 |
| Tri Dao | 9610114ce8 | Bump to v1.0.8 | 2023-07-02 17:04:54 -07:00 |
| Tri Dao | 85b51d61ee | Bump version to 1.0.7 | 2023-05-30 14:18:44 -07:00 |
| Kirthi Shankar Sivamani | dd9c3a1fc2 | bump to v1.0.6. Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com> | 2023-05-26 17:44:10 -07:00 |
| Max H. Gerlach | 31f78a9814 | Allow adding an optional local version to the package version | 2023-05-19 17:27:41 +02:00 |
| Gustaf | af4a9ce024 | Add missing __init__.py | 2022-07-03 02:04:55 -04:00 |