cutlass/examples/41_fused_multi_head_attention
dan_the_3rd f303889ed9
fMHA: Sync FW with xFormers (#828)
* fMHA: Add support for bias+dropout in FW

* Remove 'getMaximumSharedMemoryPerBlockKb'

* fix comments

---------

Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-02-22 23:25:31 -05:00
epilogue                                      fMHA: Sync FW with xFormers (#828)  2023-02-22 23:25:31 -05:00
gemm                                          fMHA: Sync FW with xFormers (#828)  2023-02-22 23:25:31 -05:00
iterators                                     xFormer updates to fMHA FW (#773)   2023-02-08 23:00:10 -05:00
transform                                     fMHA: Sync FW with xFormers (#828)  2023-02-22 23:25:31 -05:00
CMakeLists.txt                                New updates for 2.11 (#775)         2023-01-20 16:32:57 -05:00
debug_utils.h                                 fMHA: Sync FW with xFormers (#828)  2023-02-22 23:25:31 -05:00
default_fmha_grouped.h                        fMHA: Sync FW with xFormers (#828)  2023-02-22 23:25:31 -05:00
fmha_grouped_problem_visitor.h                New updates for 2.11 (#775)         2023-01-20 16:32:57 -05:00
fmha_grouped.h                                fMHA: Sync FW with xFormers (#828)  2023-02-22 23:25:31 -05:00
fused_multihead_attention_fixed_seqlen.cu     fMHA: Sync FW with xFormers (#828)  2023-02-22 23:25:31 -05:00
fused_multihead_attention_variable_seqlen.cu  xFormer updates to fMHA FW (#773)   2023-02-08 23:00:10 -05:00
gemm_kernel_utils.h                           xFormer updates to fMHA FW (#773)   2023-02-08 23:00:10 -05:00
kernel_forward.h                              fMHA: Sync FW with xFormers (#828)  2023-02-22 23:25:31 -05:00