flash-attention/csrc/stream_attn
2022-05-20 14:21:58 -07:00
src First release 2022-05-20 14:21:58 -07:00
fmha_api.cpp First release 2022-05-20 14:21:58 -07:00
README.md First release 2022-05-20 14:21:58 -07:00
setup.py First release 2022-05-20 14:21:58 -07:00

Our implementation uses the FMHA (fused multi-head attention) code from NVIDIA's Apex as a starting point.

We thank Young-jun Ko for the in-depth explanation of his FMHA implementation and for his thoughtful answers to our questions about CUDA.