8 Commits

Author SHA1 Message Date
Charlene Yang
bdf733be55
Add q, k, v descales to FA3 interface (#1210)
* add descale_q/k/v for fp8 fwd

Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>

* fix .apply args

Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>

---------

Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
2024-09-09 21:53:52 -07:00
jayhshah
c92ca63268
FA3 FP8 qkv descales + restore max offset for h128 causal + added sync for producer WG (#1173)
2024-08-25 12:18:04 -07:00
Tri Dao
bafe253042
[FA3] Bwd
2024-08-01 01:57:06 -07:00
Tri Dao
3aae9c18c1
Revert "Changes For FP8 (#1075)"
This reverts commit 1899c970c8.
2024-07-25 01:28:44 -07:00
ganeshcolfax
1899c970c8
Changes For FP8 (#1075)
* adding files for fp8 changes.

* removed contiguous check.

* enable all tests except odd-seq-lengths, where it crashes now.

* undid clang formatting.

* change to correct tile size for headdim=128.

* fixed odd-seq-len-k.

* minor formatting.

* minor reformatting.

---------

Co-authored-by: Tri Dao <tridao@users.noreply.github.com>
2024-07-23 13:51:14 -07:00
Ying Zhang
dfe1a59e4b
Add var-seq-len to FA3 fp16 / bf16 fwd (#1072)
* fwd var-seq-len

* fixes

* benchmark

* fixes

---------

Co-authored-by: Tri Dao <tridao@users.noreply.github.com>
2024-07-22 21:32:41 -07:00
youkaichao
ef3e358a25
remove lambda (#1056)
2024-07-21 23:24:38 -07:00
Tri Dao
7f67966cc7
FA3 initial code release
2024-07-11 09:53:36 -07:00