__init__.py
FA3 initial code release
2024-07-11 09:53:36 -07:00
benchmark_attn.py
Add var-seq-len to FA3 fp16 / bf16 fwd (#1072)
2024-07-22 21:32:41 -07:00
benchmark_flash_attention_fp8.py
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
benchmark_flash_attention.py
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
block_info.h
FA3 initial code release
2024-07-11 09:53:36 -07:00
epilogue_fwd_sm90_tma.hpp
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
flash_api.cpp
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
flash_attn_interface.py
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
flash_bwd_hdim64_fp16_sm90.cu
FA3 initial code release
2024-07-11 09:53:36 -07:00
flash_bwd_hdim128_fp16_sm90.cu
FA3 initial code release
2024-07-11 09:53:36 -07:00
flash_bwd_hdim256_fp16_sm90.cu
FA3 initial code release
2024-07-11 09:53:36 -07:00
flash_bwd_kernel.h
FA3 initial code release
2024-07-11 09:53:36 -07:00
flash_bwd_launch_template.h
Remove torchlib dependency from cpp files (#1083)
2024-07-22 16:47:09 -07:00
flash_bwd_preprocess_kernel.h
FA3 initial code release
2024-07-11 09:53:36 -07:00
flash_fwd_hdim64_bf16_sm90.cu
[FA3] BF16 forward
2024-07-14 23:39:46 -07:00
flash_fwd_hdim64_fp8_sm90.cu
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
flash_fwd_hdim64_fp16_sm90.cu
FA3 initial code release
2024-07-11 09:53:36 -07:00
flash_fwd_hdim128_bf16_sm90.cu
[FA3] BF16 forward
2024-07-14 23:39:46 -07:00
flash_fwd_hdim128_fp8_sm90.cu
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
flash_fwd_hdim128_fp16_sm90.cu
FA3 initial code release
2024-07-11 09:53:36 -07:00
flash_fwd_hdim256_bf16_sm90.cu
[FA3] BF16 forward
2024-07-14 23:39:46 -07:00
flash_fwd_hdim256_fp8_sm90.cu
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
flash_fwd_hdim256_fp16_sm90.cu
FA3 initial code release
2024-07-11 09:53:36 -07:00
flash_fwd_kernel.h
Add var-seq-len to FA3 fp16 / bf16 fwd (#1072)
2024-07-22 21:32:41 -07:00
flash_fwd_launch_template.h
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
flash.h
Add var-seq-len to FA3 fp16 / bf16 fwd (#1072)
2024-07-22 21:32:41 -07:00
kernel_traits.h
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
mainloop_fwd_sm90_tma_gmma_ws.hpp
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
named_barrier.hpp
[FA3] BF16 forward
2024-07-14 23:39:46 -07:00
seq_len.h
Add var-seq-len to FA3 fp16 / bf16 fwd (#1072)
2024-07-22 21:32:41 -07:00
setup.py
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
softmax.h
FA3 initial code release
2024-07-11 09:53:36 -07:00
static_switch.h
Add var-seq-len to FA3 fp16 / bf16 fwd (#1072)
2024-07-22 21:32:41 -07:00
test_flash_attn.py
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00
tile_scheduler.hpp
[FA3] BF16 forward
2024-07-14 23:39:46 -07:00
utils.h
Changes For FP8 (#1075)
2024-07-23 13:51:14 -07:00