flash-attention/csrc/flash_attn
Grigory Sizov af01244ddd
Add split-kv and M<->H swap to varlen forward decoding attention (#754)
* Add split-k, M<->H to varseq path

* skip M<->H when dropout>0, fix LSE
2024-01-21 15:28:36 -08:00
src Update cutlass to v3.4.0 2024-01-20 22:30:06 -08:00
flash_api.cpp Add split-kv and M<->H swap to varlen forward decoding attention (#754) 2024-01-21 15:28:36 -08:00