cutlass/examples
HouQiming 96a11a1ef3
Removed trivial copy constructors on parameter classes to enable devi… (#366)
* Removed trivial copy constructors on parameter classes to enable device-side launch of CUTLASS kernels

* Added SFINAE to the `TensorRef(NonConstTensorRef const&)` constructor to avoid making it a copy-constructor for device code

* std => platform

* fix affine2

* really fix affine2

Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2022-02-28 21:34:02 -05:00
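
The SFINAE change described in the commit message above can be illustrated with a minimal sketch in standard C++. It is not the CUTLASS source: `NaiveRef` and `SfinaeRef` are hypothetical stand-ins for a `TensorRef`-like class, and `std::` traits are used where the commit mentions `platform`. The point it shows is that a const-adding converting constructor taking `NonConstRef const&` silently becomes a user-provided copy constructor when the element type is already non-const, which makes the class non-trivially-copyable. Turning that constructor into a constrained template avoids this, since a constructor template is never a copy constructor.

```cpp
#include <type_traits>

// Hypothetical stand-in for a reference-like parameter class (not CUTLASS code).
template <typename Element>
class NaiveRef {
public:
  using NonConstRef = NaiveRef<typename std::remove_const<Element>::type>;

  NaiveRef() = default;
  explicit NaiveRef(Element* ptr) : ptr_(ptr) {}

  // Intended as a const-adding conversion. When Element is already non-const,
  // NonConstRef is NaiveRef<Element> itself, so this declaration is a
  // user-provided copy constructor.
  NaiveRef(NonConstRef const& other) : ptr_(other.data()) {}

  Element* data() const { return ptr_; }

private:
  Element* ptr_ = nullptr;
};

// Same class, with the converting constructor guarded by SFINAE.
template <typename Element>
class SfinaeRef {
public:
  using NonConstRef = SfinaeRef<typename std::remove_const<Element>::type>;

  SfinaeRef() = default;
  explicit SfinaeRef(Element* ptr) : ptr_(ptr) {}

  // Participates in overload resolution only when Element is const-qualified.
  // As a template it is never a copy constructor, so the implicit (trivial)
  // copy constructor is kept in every instantiation.
  template <typename E = Element,
            typename std::enable_if<std::is_const<E>::value, int>::type = 0>
  SfinaeRef(NonConstRef const& other) : ptr_(other.data()) {}

  Element* data() const { return ptr_; }

private:
  Element* ptr_ = nullptr;
};

// The naive version loses trivial copyability for non-const elements.
static_assert(!std::is_trivially_copyable<NaiveRef<float>>::value,
              "user-provided copy constructor is not trivial");

// The guarded version stays trivially copyable in both instantiations.
static_assert(std::is_trivially_copyable<SfinaeRef<float>>::value,
              "copy constructor remains implicit");
static_assert(std::is_trivially_copyable<SfinaeRef<float const>>::value,
              "converting constructor template is not a copy constructor");

// The const-adding conversion still works, and only in that direction.
static_assert(std::is_convertible<SfinaeRef<float>, SfinaeRef<float const>>::value,
              "non-const to const conversion is allowed");
static_assert(!std::is_convertible<SfinaeRef<float const>, SfinaeRef<float>>::value,
              "const to non-const conversion is rejected");
```

All checks are compile-time `static_assert`s, so the sketch is verified simply by compiling it as C++11 or later. How trivial copyability relates to device-side kernel launch is taken from the commit message rather than demonstrated here.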
00_basic_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
01_cutlass_utilities Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
02_dump_reg_shmem Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
03_visualize_layout CUTLASS 2.7 (#318) 2021-09-20 11:02:22 -07:00
04_tile_iterator Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
05_batched_gemm Make cutlass::gemm::device::GemmArray usable (#295) 2022-02-17 20:01:05 -05:00
06_splitK_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
07_volta_tensorop_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
08_turing_tensorop_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
09_turing_tensorop_conv2dfprop Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
10_planar_complex Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
11_planar_complex_array Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
12_gemm_bias_relu Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
13_two_tensor_op_fusion CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
14_ampere_tf32_tensorop_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
15_ampere_sparse_tensorop_gemm Updates to fused epilogue (#383) 2021-12-17 16:04:43 -05:00
16_ampere_tensorop_conv2dfprop [hardswish] correct implmentation (#403) 2022-02-09 14:28:53 -05:00
17_fprop_per_channel_bias Updates to fused epilogue (#383) 2021-12-17 16:04:43 -05:00
18_ampere_fp64_tensorop_affine2_gemm Removed trivial copy constructors on parameter classes to enable devi… (#366) 2022-02-28 21:34:02 -05:00
19_tensorop_canonical Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
20_simt_canonical Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
21_quaternion_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
22_quaternion_conv Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
23_ampere_gemm_operand_reduction_fusion example 23 gemm operand reduction fusion (#325) 2021-09-20 13:34:47 -07:00
24_gemm_grouped CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
25_ampere_fprop_mainloop_fusion CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
26_ampere_wgrad_mainloop_fusion CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
27_ampere_3xtf32_fast_accurate_tensorop_gemm CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
28_ampere_3xtf32_fast_accurate_tensorop_fprop CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
29_ampere_3xtf32_fast_accurate_tensorop_complex_gemm CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
common CUTLASS 2.0 (#62) 2019-11-19 16:55:34 -08:00
CMakeLists.txt Updates to fused epilogue (#383) 2021-12-17 16:04:43 -05:00