Commit Graph

40 Commits

Author  SHA1  Message  Date
Tri Dao  2ddeaa406c  Fix wheel building  2023-08-13 16:48:47 -07:00
Tri Dao  3c458cff77  Merge branch 'feature/demo-wheels' of https://github.com/piercefreeman/flash-attention into piercefreeman-feature/demo-wheels  2023-08-13 16:03:51 -07:00
  * 'feature/demo-wheels' of https://github.com/piercefreeman/flash-attention: (25 commits)
    Install standard non-wheel package
    Remove release creation
    Build wheel on each push
    Isolate 2.0.0 & cuda12
    Clean setup.py imports
    Remove builder project
    Bump version
    Add notes to github action workflow
    Add torch dependency to final build
    Exclude cuda erroring builds
    Exclude additional disallowed matrix params
    Full version matrix
    Add CUDA 11.7
    Release is actually unsupported
    echo OS version
    Temp disable deploy
    OS version build numbers
    Restore full build matrix
    Refactor and clean of setup.py
    Strip cuda name from torch version
    ...
Tri Dao  1c41d2b0e5  Fix race condition in bwd (overwriting sK)  2023-08-01 09:00:10 -07:00
Tri Dao  4f285b3547  FlashAttention-2 release  2023-07-17 06:21:34 -07:00
Pierce Freeman  9af165c389  Clean setup.py imports  2023-06-07 17:27:36 -07:00
Pierce Freeman  494b2aa486  Add notes to github action workflow  2023-06-07 17:06:12 -07:00
Pierce Freeman  ea2ed88623  Refactor and clean of setup.py  2023-06-02 18:25:07 -07:00
Pierce Freeman  9fc9820a5b  Strip cuda name from torch version  2023-06-02 18:25:07 -07:00
Pierce Freeman  5e4699782a  Allow fallback install  2023-06-02 18:25:07 -07:00
Pierce Freeman  0e7769c813  Guessing wheel URL  2023-06-02 18:25:07 -07:00
Pierce Freeman  e1faefce9d  Raise cuda error on build  2023-06-02 18:25:07 -07:00
Pierce Freeman  add4f0bc42  Scaffolding for wheel prototype  2023-06-02 18:25:07 -07:00
Max H. Gerlach  31f78a9814  Allow adding an optional local version to the package version  2023-05-19 17:27:41 +02:00
Tri Dao  eff9fe6b80  Add ninja to pyproject.toml build-system, bump to v1.0.5  2023-05-12 14:20:31 -07:00
Tri Dao  ad113948a6  [Docs] Clearer error message for bwd d > 64, bump to v1.0.4  2023-04-26 09:19:48 -07:00
Tri Dao  fbbb107848  Bump version to v1.0.3.post0  2023-04-21 13:37:23 -07:00
Tri Dao  67ef5d28df  Bump version to 1.0.3  2023-04-21 12:04:53 -07:00
Tri Dao  df1344f866  Bump to v1.0.2  2023-04-15 22:19:31 -07:00
Pavel Shvets  72629ac9ba  add missed module  2023-04-14 20:08:24 +03:00
Tri Dao  853ff72963  Bump version to v1.0.1, fix Cutlass version  2023-04-12 10:05:01 -07:00
Tri Dao  74af023316  Bump version to 1.0.0  2023-04-11 23:32:35 -07:00
Tri Dao  dc08ea1c33  Support H100 for other CUDA extensions  2023-03-15 16:59:27 -07:00
Tri Dao  1b18f1b7a1  Support H100  2023-03-15 14:59:02 -07:00
Tri Dao  33e0860c9c  Bump to v0.2.8  2023-01-19 13:17:19 -08:00
Tri Dao  d509832426  [Compilation] Add _NO_HALF2 flags to be consistent with Pytorch  2023-01-12 22:15:41 -08:00
  eb7b89771e/cmake/Dependencies.cmake (L1693)
Tri Dao  ce26d3d73d  Bump to v0.2.7  2023-01-06 17:37:30 -08:00
Tri Dao  a6ec1782dc  Bump to v0.2.6  2022-12-27 22:05:20 -08:00
Tri Dao  1bc6e5b09c  Bump to v0.2.5  2022-12-21 14:33:18 -08:00
Tri Dao  04c4c6106e  Bump to v0.2.4  2022-12-14 14:49:26 -08:00
Tri Dao  a1a5d2ee49  Bump to v0.2.3  2022-12-13 01:37:02 -08:00
Tri Dao  d95ee1a95d  Speed up compilation by splitting into separate .cu files  2022-11-25 16:30:18 -08:00
Tri Dao  054816177e  Bump version to 0.2.1  2022-11-20 22:35:59 -08:00
Tri Dao  d6ef701aa9  Set version to 0.2.0 (instead of 0.2)  2022-11-15 14:15:05 -08:00
Tri Dao  4040256b5e  Update pip install instructions, bump to 0.2  2022-11-15 14:10:48 -08:00
Phil Wang  b0eac3297f  allow for uploading to pypi  2022-11-15 13:26:55 -08:00
Eric Engelhart  9b1b011bf6  Add C++17 arg to compiler, since C++17 features are used, fixes windows build  2022-10-04 21:31:39 -04:00
Gustaf  440e9c49f2  Add einops installation to setup.py  2022-07-03 02:04:24 -04:00
Tri Dao  2712aa4c8d  Support Turing mma instructions  2022-06-03 16:58:44 -07:00
Tri Dao  512c98ee05  Add Cutlass as submodule  2022-06-02 09:54:16 -07:00
Tri Dao  5a61cb7729  Rename src -> flash_attn  2022-06-01 18:50:26 -07:00
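A listing like the one above can be reproduced locally with `git log`. A minimal sketch, assuming you are inside a clone of the repository (the format string is an assumption chosen to mirror the Author / SHA1 / Message / Date columns, not the hosting site's own query):

```shell
# Run inside a checkout of the repository, e.g.:
#   git clone https://github.com/piercefreeman/flash-attention.git
#   cd flash-attention

# One line per commit: abbreviated SHA, author name, subject, and author
# date, matching the columns of the table above.
git log --pretty=format:'%h  %an  %s  %ad' --date=iso -n 40
```

`%h` prints the abbreviated commit hash, `%an` the author name, `%s` the subject line, and `%ad` the author date; `--date=iso` gives timestamps in the `2023-08-13 16:48:47 -0700` style shown above.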