cutlass/test/unit/gemm/device
Ali Hassani d4be5ab5d7
Allow per-column bias in EpilogueTensorBroadcast (#1275)
* Allow per-column bias in EpilogueTensorBroadcast

EpilogueTensorBroadcast supports only per-row vector broadcast, because
the bias stride is hardcoded.

Making the bias stride conditional lets it support both per-row and
per-column broadcast, and defaulting to per-row preserves the original
behavior.

* Add unit test for EpilogueTensorBroadcast with per-col bias

---------

Co-authored-by: Ali Hassani <ahassanijr@gmail.com>
Co-authored-by: Ali Hassani <ali@hippoml.com>
2024-01-04 12:48:31 -05:00
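The commit message above describes making the bias stride conditional, with per-row as the default. The following is a minimal standalone sketch of that idea in plain C++, not the CUTLASS implementation: `RowStride`, `ColStride`, `BiasBroadcastSketch`, `bias_offset`, and `apply` are hypothetical names chosen for illustration, and the real epilogue presumably uses its own stride types rather than these stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for compile-time stride tags (not CUTLASS types).
// RowStride: the bias index advances with the row coordinate m only.
// ColStride: the bias index advances with the column coordinate n only.
struct RowStride { static constexpr std::size_t m = 1; static constexpr std::size_t n = 0; };
struct ColStride { static constexpr std::size_t m = 0; static constexpr std::size_t n = 1; };

// PerColumnBias defaults to false, so code that does not pass the flag
// keeps the original per-row behavior.
template <bool PerColumnBias = false>
struct BiasBroadcastSketch {
  // The stride is selected at compile time instead of being hardcoded.
  using StrideBias = std::conditional_t<PerColumnBias, ColStride, RowStride>;

  // Offset into the bias vector for output coordinate (m, n).
  static constexpr std::size_t bias_offset(std::size_t m, std::size_t n) {
    return m * StrideBias::m + n * StrideBias::n;
  }

  // Adds the broadcast bias in place to a row-major M x N matrix d.
  static void apply(std::vector<float>& d, const std::vector<float>& bias,
                    std::size_t M, std::size_t N) {
    assert(d.size() == M * N);
    assert(bias.size() == (PerColumnBias ? N : M));
    for (std::size_t m = 0; m < M; ++m) {
      for (std::size_t n = 0; n < N; ++n) {
        d[m * N + n] += bias[bias_offset(m, n)];
      }
    }
  }
};

// The default instantiation selects the per-row stride.
static_assert(std::is_same_v<BiasBroadcastSketch<>::StrideBias, RowStride>);
static_assert(std::is_same_v<BiasBroadcastSketch<true>::StrideBias, ColStride>);
```

With this sketch, `BiasBroadcastSketch<>::apply(d, bias, M, N)` expects a bias of length M (one value per row), while `BiasBroadcastSketch<true>::apply(d, bias, M, N)` expects a bias of length N (one value per column).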
CMakeLists.txt CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
default_gemm_configuration.hpp CUTLASS 3.4.0 (#1286) 2023-12-29 15:21:31 -05:00
gemm_b1t_b1n_s32n_tensor_op_s32_sm75.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_b1t_b1n_s32n_tensor_op_s32_sm80.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_b1t_b1n_s32n_wmma_tensor_op_s32_sm75.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_b1t_b1n_s32t_tensor_op_s32_sm75.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_b1t_b1n_s32t_tensor_op_s32_sm80.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_b1t_b1n_s32t_wmma_tensor_op_s32_sm75.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_bf16n_bf16n_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_bf16t_bf16t_bf16t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_cf32n_cf32t_cf32t_tensor_op_tf32_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_cf32t_cf32n_cf32t_tensor_op_tf32_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_cf64n_cf64t_cf64t_tensor_op_f64_gaussian_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_cf64n_cf64t_cf64t_tensor_op_f64_gaussian_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_cf64n_cf64t_cf64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_cf64n_cf64t_cf64t_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_cf64t_cf64n_cf64t_tensor_op_f64_gaussian_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_cf64t_cf64n_cf64t_tensor_op_f64_gaussian_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_cf64t_cf64n_cf64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_cf64t_cf64n_cf64t_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_f16n_f16n_f16n_direct_store_tensor_op_f32_sm80.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_f16n_f16n_f16n_wmma_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f16n_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f16t_tensor_op_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f16t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f16t_tensor_op_f32_sparse_sm80.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_f16n_f16n_f16t_volta_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f16t_wmma_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f16t_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f32n_tensor_op_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f32n_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f32n_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f32t_tensor_op_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f32t_tensor_op_f32_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f32t_volta_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16n_f32t_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16n_wmma_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16n_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16t_tensor_op_f16_slicedk_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16t_tensor_op_f16_slicedk_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16t_tensor_op_f16_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16t_tensor_op_f16_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16t_tensor_op_f16_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16t_volta_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16t_wmma_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f16t_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f32n_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f32t_tensor_op_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f32t_tensor_op_f32_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f32t_volta_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16n_f16t_f32t_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16n_singlestage_wmma_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16n_wmma_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16n_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16t_singlestage_wmma_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16t_tensor_op_f16_broadcast_sm80.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_f16t_f16n_f16t_tensor_op_f16_slicedk_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16t_tensor_op_f16_slicedk_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16t_tensor_op_f16_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16t_tensor_op_f16_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16t_tensor_op_f16_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16t_volta_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16t_wmma_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f16t_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f32n_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f32t_singlestage_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f32t_tensor_op_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f32t_tensor_op_f32_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f32t_volta_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16n_f32t_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f16n_wmma_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f16n_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f16t_wmma_tensor_op_f16_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f16t_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f32n_tensor_op_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f32n_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f32n_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f32t_tensor_op_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f32t_tensor_op_f32_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f32t_volta_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f16t_f16t_f32t_wmma_tensor_op_f32_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f32n_f32n_f32t_tensor_op_bf16_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f32n_f32n_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f32n_f32n_f32t_tensor_op_f32_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f32n_f32t_f32t_tensor_op_f32_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f32t_f32n_f32t_tensor_op_f32_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f32t_f32t_f32t_tensor_op_f32_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f64n_f64t_f64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f64n_f64t_f64t_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_f64t_f64n_f64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_f64t_f64n_f64t_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_grouped_scheduler_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_grouped_sm80.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
gemm_planar_complex_f16_f16_f32_tensor_op_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_planar_complex_f16_f16_f32_tensor_op_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_planar_complex_f16_f16_f32_tensor_op_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s4n_s4t_s4n_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s4n_s4t_s4n_tensor_op_s32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s4t_s4n_s4n_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s4t_s4n_s4n_tensor_op_s32_sm80.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
gemm_s4t_s4n_s4t_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s4t_s4n_s4t_tensor_op_s32_sm80.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
gemm_s4t_s4n_s32n_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s4t_s4n_s32n_tensor_op_s32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s4t_s4n_s32n_wmma_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s4t_s4n_s32t_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s4t_s4n_s32t_tensor_op_s32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s4t_s4n_s32t_tensor_op_s32_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s4t_s4n_s32t_wmma_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s8n_s8t_s8n_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s8n_s8t_s8n_tensor_op_s32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s8t_s8n_f16t_tensor_op_s32_sm80.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
gemm_s8t_s8n_s8n_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s8t_s8n_s8n_tensor_op_s32_sm80.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
gemm_s8t_s8n_s8n_wmma_tensor_op_s32_sm72.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s8t_s8n_s8t_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s8t_s8n_s8t_tensor_op_s32_sm80.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
gemm_s8t_s8n_s8t_wmma_tensor_op_s32_sm72.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s8t_s8n_s32n_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s8t_s8n_s32n_tensor_op_s32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s8t_s8n_s32n_wmma_tensor_op_s32_sm72.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s8t_s8n_s32t_tensor_op_s32_sm75.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_s8t_s8n_s32t_tensor_op_s32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s8t_s8n_s32t_tensor_op_s32_sparse_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_s8t_s8n_s32t_wmma_tensor_op_s32_sm72.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_splitk_serial_tensor_op_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_splitk_simt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_splitk_tensor_op_sm70.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_splitk_tensor_op_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_testbed_3x_evt.hpp CUTLASS 3.4.0 (#1286) 2023-12-29 15:21:31 -05:00
gemm_testbed_3x_tensor_broadcast.hpp Allow per-column bias in EpilogueTensorBroadcast (#1275) 2024-01-04 12:48:31 -05:00
gemm_testbed_3x.hpp CUTLASS 3.4.0 (#1286) 2023-12-29 15:21:31 -05:00
gemm_tf32n_tf32n_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_tf32n_tf32t_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_tf32t_tf32n_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_tf32t_tf32t_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_u8t_u8n_s32t_wmma_tensor_op_s32_sm72.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_universal_bf16t_s8n_bf16t_mixed_input_tensor_op_f32_sm80.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
gemm_universal_cf32n_cf32n_cf32n_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_universal_cf64n_cf64t_cf64t_tensor_op_f64_gaussian_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_universal_cf64n_cf64t_cf64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_universal_f16n_f16t_f32n_tensor_op_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_universal_f16n_f16t_f32t_tensor_op_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_universal_f16t_s8n_f16t_mixed_input_tensor_op_f16_sm80.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
gemm_universal_f16t_u8n_f16t_mixed_input_tensor_op_f16_sm80.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
gemm_universal_s8t_bf16n_bf16t_mixed_input_tensor_op_f32_sm80.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
gemm_universal_s8t_f16n_f16t_mixed_input_tensor_op_f16_sm80.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
gemm_universal_u8t_f16n_f16t_mixed_input_tensor_op_f16_sm80.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
gemm_with_broadcast_f16n_f16n_f16n_tensorop_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_with_reduction_f16n_f16n_f16n_tensorop_f32_sm75.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_with_reduction_f16t_f16n_f16n_tensorop_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemv.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
hemm_cf32h_cf32n_tensor_op_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
hemm_cf32h_cf32n_tensor_op_f32_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
hemm_cf32h_cf32n_tensor_op_fast_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
hemm_cf32h_cf32n_tensor_op_fast_f32_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
hemm_cf64_cf64_cf64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
hemm_cf64h_cf64n_cf64n_tensor_op_ls_f64_gaussian_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
hemm_cf64h_cf64n_cf64n_tensor_op_ls_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
hemm_cf64h_cf64n_cf64n_tensor_op_rs_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
her2k_cf32h_cf32n_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
her2k_cf32h_cf32n_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
her2k_cf64_cf64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
her2k_cf64h_cf64n_tensor_op_f64_grouped_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
her2k_cf64n_cf64n_tensor_op_f64_grouped_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
her2k_cf64n_cf64n_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
her2k_cf64n_cf64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
herk_cf32h_cf32n_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
herk_cf32h_cf32n_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
herk_cf64_cf64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
herk_cf64h_cf64n_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
multistage_testbed_interleaved.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
multistage_testbed.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
rank_2k_grouped_scheduler_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_cgemm_nn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_cgemm_nt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_cgemm_nt_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_cgemm_tn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_cgemm_tn_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_cgemm_tt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_dgemm_nn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_dgemm_nt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_dgemm_tn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_dgemm_tt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_f8gemm_tn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_hgemm_nn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_hgemm_nt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_hgemm_tn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_hgemm_tt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_igemm_nn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_igemm_nt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_igemm_tn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_igemm_tt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_int8_igemm_sm61_perf.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_int8_igemm_sm61_sliced_k.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_int8_igemm_sm61.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_qgemm_nn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_qgemm_nt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_qgemm_tn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_qgemm_tt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_sgemm_nn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_sgemm_nt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_sgemm_nt_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_sgemm_tn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_sgemm_tn_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_sgemm_tt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_sm50.py New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_zgemm_nn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_zgemm_nt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_zgemm_tn_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
simt_zgemm_tt_sm50.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
sm50_gemm_f32_f32_f32_simt.cu CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
sm50_gemm_f64_f64_f64_simt.cu CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
sm61_gemm_s8_s8_s32_simt.cu CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
sm80_gemm_f16_f16_f32_tensor_op_f32.cu CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
sm80_gemm_f32_f32_f32_simt.cu CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
sm80_gemm_f64_f64_f64_simt.cu CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
sm80_gemm_f64_f64_f64_tensor_op_f64.cu CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
sm80_gemm_s8_s8_s32_tensor_op.cu CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
sm80_gemm_tf32_tf32_f32_tensor_op_f32.cu CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
sm90_evt_operations.hpp CUTLASS 3.4.0 (#1286) 2023-12-29 15:21:31 -05:00
sm90_gemm_bf16_bf16_bf16_alignx_tensor_op_f32_warpspecialized_cooperative.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_bf16_bf16_bf16_alignx_tensor_op_f32_warpspecialized_pingpong.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_bf16_bf16_bf16_alignx_tensor_op_f32_warpspecialized.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_bf16_bf16_bf16_alignx_tensor_op_f32.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_bf16_bf16_bf16_tensor_op_f32.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
sm90_gemm_f8_f8_bf16_tensor_op_fp32_evt.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_f8_f8_bf16_tensor_op_fp32.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_f8_f8_f8_tensor_op_fp32_evt.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_f8_f8_f8_tensor_op_fp32.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_f8_f8_f32_tensor_op_f32_cluster_warpspecialized_cooperative_evt.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_f8_f8_f32_tensor_op_f32_cluster_warpspecialized_cooperative.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_f8_f8_f32_tensor_op_f32_cooperative_stream_k.cu CUTLASS 3.4.0 (#1286) 2023-12-29 15:21:31 -05:00
sm90_gemm_f8_f8_f32_tensor_op_f32_rs_cluster_warpspecialized_cooperative.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_f8_f8_f32_tensor_op_fp32.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_f16_f16_f16_alignx_tensor_op_f32_warpspecialized_cooperative.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_f16_f16_f16_alignx_tensor_op_f32_warpspecialized_pingpong.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_f16_f16_f16_alignx_tensor_op_f32_warpspecialized.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_f16_f16_f16_alignx_tensor_op_f32.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_unspecialized.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_cooperative_aux_load.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_cooperative_bias_elementwise.cu CUTLASS 3.4.0 (#1286) 2023-12-29 15:21:31 -05:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_cooperative_dag.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_cooperative_reduce.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_cooperative_row_broadcast.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_cooperative.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_pingpong_aux_load.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_pingpong_bias_elementwise.cu CUTLASS 3.4.0 (#1286) 2023-12-29 15:21:31 -05:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_pingpong_dag.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_pingpong_reduce.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_pingpong_row_broadcast.cu CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_pingpong.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_f16_f16_f16_tensor_op_f32_cooperative_stream_k.cu CUTLASS 3.4.0 (#1286) 2023-12-29 15:21:31 -05:00
sm90_gemm_f16_f16_f16_tensor_op_f32_tensor_broadcast.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_f16_f16_f16_tensor_op.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
sm90_gemm_f16_f16_f32_tensor_op_f32_rs_cluster_warpspecialized_cooperative.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_f32_f32_f32_tensor_op_f32_tensor_broadcast.cu Allow per-column bias in EpilogueTensorBroadcast (#1275) 2024-01-04 12:48:31 -05:00
sm90_gemm_f32_f32_f32_tensor_op_f32.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm90_gemm_s8_s8_s8_alignx_tensor_op_s32_warpspecialized_cooperative.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_s8_s8_s8_alignx_tensor_op_s32_warpspecialized_pingpong.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_s8_s8_s8_alignx_tensor_op_s32_warpspecialized.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_s8_s8_s8_alignx_tensor_op_s32.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_s8_s8_s8_tensor_op_s32_tensor_broadcast.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
sm90_gemm_s8_s8_s8_tensor_op_s32.cu More updates for 3.1 (#958) 2023-05-24 10:17:16 -04:00
sm90_gemm_stream_k_scheduler.cu CUTLASS 3.4.0 (#1286) 2023-12-29 15:21:31 -05:00
sm90_gemm_tf32_tf32_f32_alignx_tensor_op_f32_warpspecialized_cooperative.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_tf32_tf32_f32_alignx_tensor_op_f32_warpspecialized_pingpong.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_tf32_tf32_f32_alignx_tensor_op_f32_warpspecialized.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_tf32_tf32_f32_alignx_tensor_op_f32.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
sm90_gemm_tf32_tf32_f32_tensor_op_f32_gmma_rs_cluster_warpspecialized.cu CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
sm90_gemm_tf32_tf32_f32_tensor_op_f32.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
symm_cf32n_cf32n_tensor_op_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_cf32n_cf32n_tensor_op_f32_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_cf32n_cf32n_tensor_op_fast_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_cf32n_cf32n_tensor_op_fast_f32_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_cf64_cf64_cf64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
symm_cf64n_cf64n_cf64n_tensor_op_ls_f64_gaussian_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_cf64n_cf64n_cf64n_tensor_op_ls_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_cf64n_cf64n_cf64n_tensor_op_rs_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f32n_f32n_tensor_op_fast_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f32n_f32n_tensor_op_fast_f32_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f32t_f32t_tensor_op_fast_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f64_f64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
symm_f64n_f64n_tensor_op_f64_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f64n_f64n_tensor_op_f64_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f64n_f64t_tensor_op_f64_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f64n_f64t_tensor_op_f64_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f64t_f64n_tensor_op_f64_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f64t_f64n_tensor_op_f64_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f64t_f64t_tensor_op_f64_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_f64t_f64t_tensor_op_f64_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_tf32n_f32n_tensor_op_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_tf32n_f32n_tensor_op_f32_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_tf32t_f32t_tensor_op_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_cf32n_cf32n_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_cf32n_cf32n_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_cf32n_cf32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_cf32n_cf32t_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_cf64_cf64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
syr2k_cf64n_cf64n_tensor_op_f64_grouped_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_cf64n_cf64n_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_cf64n_cf64t_tensor_op_f64_grouped_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_cf64n_cf64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_cf64t_cf64n_tensor_op_f64_grouped_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_cf64t_cf64t_tensor_op_f64_grouped_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_f32n_f32n_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_f32t_f32n_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_f64_f64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
syr2k_f64n_f64n_tensor_op_f64_grouped_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_f64n_f64n_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_f64n_f64t_tensor_op_f64_grouped_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_f64n_f64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_f64t_f64n_tensor_op_f64_grouped_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_f64t_f64n_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_f64t_f64t_tensor_op_f64_grouped_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_tf32n_f32n_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syr2k_tf32t_f32n_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_cf32n_cf32n_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_cf32n_cf32n_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_cf32n_cf32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_cf32n_cf32t_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_cf64_cf64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
syrk_cf64n_cf64n_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_cf64n_cf64t_tensor_op_f64_gaussian_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_cf64n_cf64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_f32n_f32t_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_f32t_f32t_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_f64_f64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
syrk_f64n_f64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_f64t_f64n_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_tf32n_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
syrk_tf32t_f32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
testbed_complex.h CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
testbed_gemm_with_broadcast.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
testbed_gemm_with_reduction.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
testbed_grouped_rank_2k_scheduler.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
testbed_grouped_rank_2k.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
testbed_grouped_scheduler.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
testbed_grouped.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
testbed_interleaved.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
testbed_planar_complex.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
testbed_rank2k_universal.h CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
testbed_rank_k_universal.h CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
testbed_sanity.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
testbed_sparse.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
testbed_splitk.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
testbed_symm_universal.h CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
testbed_trmm_universal.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
testbed_universal.h Support for Mixed Input TensorOp (#1084) 2023-09-27 11:18:30 -04:00
testbed_utils.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
testbed.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
trmm_cf32n_cf32n_cf32t_tensor_op_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_cf32n_cf32n_cf32t_tensor_op_fast_f32_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_cf64_cf64_cf64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
trmm_cf64n_cf64n_cf64t_tensor_op_f64_gaussian_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_cf64n_cf64n_cf64t_tensor_op_f64_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_f32n_f32t_f32t_tensor_op_fast_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_f32n_f32t_f32t_tensor_op_fast_f32_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_f32t_f32n_f32n_tensor_op_fast_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_f32t_f32n_f32t_tensor_op_fast_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_f64_f64_f64_tensor_op_f64_sm90.cu CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
trmm_f64n_f64n_f64t_tensor_op_f64_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_f64n_f64n_f64t_tensor_op_f64_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_f64n_f64t_f64t_tensor_op_f64_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_f64t_f64t_f64n_tensor_op_f64_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_f64t_f64t_f64n_tensor_op_f64_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_tf32n_tf32t_f32t_tensor_op_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_tf32n_tf32t_f32t_tensor_op_f32_rs_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_tf32t_tf32n_f32n_tensor_op_f32_ls_sm80.cu New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
trmm_tf32t_tf32n_f32t_tensor_op_f32_ls_sm80.cu CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00