cutlass/examples/42_fused_multi_head_attention
dan_the_3rd 4db6a6140e
ex42: Fused MHA imported from xFormers (#662)
* ex42: Fused MHA imported from xFormers

* Remove std:: references

* Support K>128 in the example

* Support causal option (masking rule sketched after this message)

* Support different head size for V, and different seqlength for KV

* Update FLOPs counter (counting sketch after this message)

* Remove bit_cast

* Fix build: replace M_LOG2E (portability sketch after this message)

* Add doc

* Revert "Remove bit_cast" (memcpy-based sketch after this message)

This reverts commit 9662fa86bb7c57c1a015ac0bf52cb52940fbbf80.

* Explicit casts to int32_t for windows build

Co-authored-by: danthe3rd <danthe3rd>
2022-10-17 10:49:33 -04:00
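
The causal option restricts each query to keys at or before its own position, so softmax assigns later keys zero weight. Below is a minimal sketch of that masking rule over a row-major score matrix; the names (apply_causal_mask, scores, num_queries, num_keys) are illustrative, not the example's actual API:

    #include <limits>
    #include <vector>

    // Causal mask sketch: entry (q, k) is sent to -infinity whenever key
    // position k lies after query position q, so softmax zeroes it out.
    void apply_causal_mask(std::vector<float>& scores,
                           int num_queries, int num_keys) {
      for (int q = 0; q < num_queries; ++q) {
        for (int k = q + 1; k < num_keys; ++k) {
          scores[q * num_keys + k] = -std::numeric_limits<float>::infinity();
        }
      }
    }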
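Forward attention is two back-to-back GEMMs, so the FLOP count follows from the problem shape; once V may have its own head size and KV its own sequence length, the two GEMMs no longer share dimensions, which is why the counter needed updating. A hedged sketch of such a counter (parameter names are illustrative):

    #include <cstdint>

    // FLOPs for one forward pass: Q*K^T is [M,K] x [K,N] and
    // softmax(QK^T)*V is [M,N] x [N,Kv]; a GEMM of shape m x n x k
    // costs 2*m*n*k floating-point operations.
    int64_t attention_flops(int64_t batch, int64_t heads,
                            int64_t seq_q,   // M: query length
                            int64_t seq_kv,  // N: key/value length
                            int64_t head_k,  // K: Q/K head size
                            int64_t head_v,  // Kv: V head size
                            bool causal) {
      int64_t qk = 2 * seq_q * seq_kv * head_k;
      int64_t pv = 2 * seq_q * seq_kv * head_v;
      int64_t total = batch * heads * (qk + pv);
      // Causal masking skips roughly half of each score matrix.
      return causal ? total / 2 : total;
    }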
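M_LOG2E (log2(e)) is a POSIX math constant, not part of standard C++; MSVC only provides it when _USE_MATH_DEFINES is defined before <cmath>, which is presumably what broke the build. The usual workaround is an explicit constant; kLog2e and exp_via_exp2 below are illustrative names, not the example's:

    #include <cmath>

    // log2(e) spelled out, replacing the non-portable M_LOG2E macro.
    constexpr float kLog2e = 1.4426950408889634f;

    // Typical use in softmax code: exp(x) rewritten as exp2(x * log2(e)),
    // since exp2 maps to a cheap hardware instruction on GPUs.
    inline float exp_via_exp2(float x) { return std::exp2(x * kLog2e); }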
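Since the removal was reverted, the example keeps a bit_cast helper. Before C++20's std::bit_cast, the only well-defined way to reinterpret an object representation is memcpy; a minimal sketch of that pattern, not necessarily the example's implementation:

    #include <cstring>
    #include <type_traits>

    // Reinterpret the bytes of one trivially copyable type as another of
    // the same size, without the undefined behavior of a
    // reinterpret_cast-based type pun.
    template <class To, class From>
    To bit_cast_sketch(const From& src) {
      static_assert(sizeof(To) == sizeof(From), "sizes must match");
      static_assert(std::is_trivially_copyable<From>::value &&
                    std::is_trivially_copyable<To>::value,
                    "types must be trivially copyable");
      To dst;
      std::memcpy(&dst, &src, sizeof(To));
      return dst;
    }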
All entries below were last modified by "ex42: Fused MHA imported from xFormers (#662)" on 2022-10-17 10:49:33 -04:00:

gemm/
iterators/
attention_scaling_coefs_updater.h
CMakeLists.txt
debug_utils.h
epilogue_pipelined.h
epilogue_rescale_output.h
epilogue_thread_apply_logsumexp.h
find_default_mma.h
fused_multihead_attention.cu
gemm_kernel_utils.h
kernel_forward.h
mma_from_smem.h