cutlass/examples/13_two_tensor_op_fusion/kernel
Aleksandr Pivovar 4a68cf748e
added support of b2b bmm (#849)
* added support of b2b bmm

* fixed arguments and params structures

* added batch_count argument

* removed SplitKSerial and added new test case with b2b bmm

* fixed support of Kbatched and added new test case with batch stride

* added batch support for bias and scale

* make test

* small changes

---------

Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-04-14 23:20:02 -04:00
b2b_gemm.h added support of b2b bmm (#849) 2023-04-14 23:20:02 -04:00
b2b_implicit_gemm_convolution.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_b2b_conv2d_fprop_sm75.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_b2b_conv2d_fprop_sm80.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_b2b_conv2d_fprop_smem_accumulator_sm75.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_b2b_conv2d_fprop_smem_accumulator_sm80.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_b2b_conv2d_fprop.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_b2b_gemm_smem_accumulator.h added support of b2b bmm (#849) 2023-04-14 23:20:02 -04:00
default_b2b_gemm.h added support of b2b bmm (#849) 2023-04-14 23:20:02 -04:00