cutlass/examples
Masahiro Masuda 0e71d9b450
Transposed conv2d and wgrad split k examples (#413)
* add split k wgrad example

* wgrad done

* begin transposed conv2d example

* update transposed conv2d example and add ref check

* update doc for conv2d transpose example

* add license

* add wgrad doc

* more clarification on GEMM output type

* typo fix

* clean up indent

* address comments

* rename example numbers to 34 and 35

* GEMM -> Implicit GEMM

* Revert "rename example numbers to 34 and 35"

This reverts commit 551a808c227216e9e38d4472ba8ff020557b8500.

* transposed_conv2d is 34

* add compiler and device version check to exit gracefully

Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2022-03-23 14:52:54 -04:00
00_basic_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
01_cutlass_utilities Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
02_dump_reg_shmem Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
03_visualize_layout CUTLASS 2.7 (#318) 2021-09-20 11:02:22 -07:00
04_tile_iterator Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
05_batched_gemm Make cutlass::gemm::device::GemmArray usable (#295) 2022-02-17 20:01:05 -05:00
06_splitK_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
07_volta_tensorop_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
08_turing_tensorop_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
09_turing_tensorop_conv2dfprop Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
10_planar_complex Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
11_planar_complex_array Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
12_gemm_bias_relu Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
13_two_tensor_op_fusion CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
14_ampere_tf32_tensorop_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
15_ampere_sparse_tensorop_gemm Updates to fused epilogue (#383) 2021-12-17 16:04:43 -05:00
16_ampere_tensorop_conv2dfprop [hardswish] correct implmentation (#403) 2022-02-09 14:28:53 -05:00
17_fprop_per_channel_bias Updates to fused epilogue (#383) 2021-12-17 16:04:43 -05:00
18_ampere_fp64_tensorop_affine2_gemm Removed trivial copy constructors on parameter classes to enable devi… (#366) 2022-02-28 21:34:02 -05:00
19_tensorop_canonical Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
20_simt_canonical Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
21_quaternion_gemm Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
22_quaternion_conv Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00
23_ampere_gemm_operand_reduction_fusion Example 23 - Passing correct alpha and beta values with --parallel-split-k (#424) 2022-03-22 12:27:34 -04:00
24_gemm_grouped CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
25_ampere_fprop_mainloop_fusion CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
26_ampere_wgrad_mainloop_fusion CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
27_ampere_3xtf32_fast_accurate_tensorop_gemm CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
28_ampere_3xtf32_fast_accurate_tensorop_fprop CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
29_ampere_3xtf32_fast_accurate_tensorop_complex_gemm CUTLASS 2.8 (#363) 2021-11-19 13:26:35 -08:00
30_wgrad_split_k Transposed conv2d and wgrad split k examples (#413) 2022-03-23 14:52:54 -04:00
34_transposed_conv2d Transposed conv2d and wgrad split k examples (#413) 2022-03-23 14:52:54 -04:00
common CUTLASS 2.0 (#62) 2019-11-19 16:55:34 -08:00
CMakeLists.txt Transposed conv2d and wgrad split k examples (#413) 2022-03-23 14:52:54 -04:00