Commit Graph

28 Commits

Author SHA1 Message Date
chuanli11
30fd8c17d8 remove checkout v2.0.0.post1 from dockerfile 2023-07-20 16:40:15 +00:00
Tri Dao
4f285b3547 FlashAttention-2 release 2023-07-17 06:21:34 -07:00
Tri Dao
6d48e14a6c Bump to v1.0.9 2023-07-17 03:16:40 -07:00
Tri Dao
9610114ce8 Bump to v1.0.8 2023-07-02 17:04:54 -07:00
Tri Dao
85b51d61ee Bump version to 1.0.7 2023-05-30 14:18:44 -07:00
Kirthi Shankar Sivamani
dd9c3a1fc2 bump to v1.0.6 (Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>) 2023-05-26 17:44:10 -07:00
Tri Dao
eff9fe6b80 Add ninja to pyproject.toml build-system, bump to v1.0.5 2023-05-12 14:20:31 -07:00
Tri Dao
ad113948a6 [Docs] Clearer error message for bwd d > 64, bump to v1.0.4 2023-04-26 09:19:48 -07:00
Tri Dao
fbbb107848 Bump version to v1.0.3.post0 2023-04-21 13:37:23 -07:00
Tri Dao
67ef5d28df Bump version to 1.0.3 2023-04-21 12:04:53 -07:00
Tri Dao
df1344f866 Bump to v1.0.2 2023-04-15 22:19:31 -07:00
Tri Dao
853ff72963 Bump version to v1.0.1, fix Cutlass version 2023-04-12 10:05:01 -07:00
Tri Dao
74af023316 Bump version to 1.0.0 2023-04-11 23:32:35 -07:00
Tri Dao
009a3e71ec [Training] Fix lightning _PATH import 2023-03-29 01:43:39 -07:00
Ikko Eltociear Ashimine
419ea45b64 fix typo in default.yaml (additionaly -> additionally) 2023-01-21 00:47:12 +09:00
Tri Dao
33e0860c9c Bump to v0.2.8 2023-01-19 13:17:19 -08:00
Tri Dao
88173a1aaf [FusedDense] Support relu, rename FusedDenseGeluDense -> FusedMLP 2023-01-17 18:12:27 -08:00
Tri Dao
ce26d3d73d Bump to v0.2.7 2023-01-06 17:37:30 -08:00
Tri Dao
71befc19e1 [Loss] Use flash_attn.losses.cross_entropy.CrossEntropyLoss 2022-12-31 22:43:28 -08:00
Tri Dao
cadfa396b8 [Docker] Set torchmetrics==0.10.3 2022-12-30 02:42:28 -08:00
Tri Dao
43798966cf [Docs] Fix formatting 2022-12-30 00:01:55 -08:00
Tri Dao
3c7cbfc195 [Docs] Mention that dropout_layer_norm supports all dims up to 6k 2022-12-29 23:55:33 -08:00
Tri Dao
984d5204e2 Update training Dockerfile to use flash-attn==0.2.6 2022-12-29 15:12:33 -08:00
Tri Dao
b4018a5028 Implement Tensor Parallel for GPT model 2022-12-26 16:22:43 -08:00
Tri Dao
dff68c2b22 Add smoothing for CrossEntropyParallel, rename to CrossEntropyLoss 2022-12-23 14:51:08 -08:00
Tri Dao
c2407dec96 Fix typo in config: train.gpu -> train.gpu_mem 2022-12-21 13:42:30 -08:00
Tri Dao
4a6eaa9f27 Update configs, add results 2022-11-29 04:46:43 -08:00
Tri Dao
0bf5e50038 Release training code 2022-11-28 17:34:40 -08:00