cutlass/examples
Haicheng Wu 8b42e751c6
streamk paper link (#765)
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-01-10 22:10:43 -05:00
00_basic_gemm CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
01_cutlass_utilities CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
02_dump_reg_shmem CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
03_visualize_layout releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
04_tile_iterator Remove redundant <fstream> includes (#563) 2022-07-19 15:23:54 -04:00
05_batched_gemm releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
06_splitK_gemm fix: fix types in example 06 (#587) 2022-07-29 12:46:06 -04:00
07_volta_tensorop_gemm CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
08_turing_tensorop_gemm CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
09_turing_tensorop_conv2dfprop Remove redundant <fstream> includes (#563) 2022-07-19 15:23:54 -04:00
10_planar_complex Remove redundant <fstream> includes (#563) 2022-07-19 15:23:54 -04:00
11_planar_complex_array Remove redundant <fstream> includes (#563) 2022-07-19 15:23:54 -04:00
12_gemm_bias_relu CUTLASS 2.10 (#615) 2022-09-03 18:48:46 -04:00
13_two_tensor_op_fusion Add residual support for shmem staging iterator used in back-to-back GEMM fusion. This allows support of problem_size_0_n that is not multiple of 32. (#590) 2022-08-15 11:19:24 -04:00
14_ampere_tf32_tensorop_gemm CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
15_ampere_sparse_tensorop_gemm CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
16_ampere_tensorop_conv2dfprop CUTLASS 2.10 (#615) 2022-09-03 18:48:46 -04:00
17_fprop_per_channel_bias CUTLASS 2.10 (#615) 2022-09-03 18:48:46 -04:00
18_ampere_fp64_tensorop_affine2_gemm CUTLASS 2.10 (#615) 2022-09-03 18:48:46 -04:00
19_tensorop_canonical CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
20_simt_canonical CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
21_quaternion_gemm CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
22_quaternion_conv Remove redundant <fstream> includes (#563) 2022-07-19 15:23:54 -04:00
23_ampere_gemm_operand_reduction_fusion releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
24_gemm_grouped CUTLASS 2.10 updates (#622) 2022-09-12 21:26:30 -04:00
25_ampere_fprop_mainloop_fusion CUTLASS 2.10 (#615) 2022-09-03 18:48:46 -04:00
26_ampere_wgrad_mainloop_fusion CUTLASS 2.10 (#615) 2022-09-03 18:48:46 -04:00
27_ampere_3xtf32_fast_accurate_tensorop_gemm CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
28_ampere_3xtf32_fast_accurate_tensorop_fprop CUTLASS 2.10 (#615) 2022-09-03 18:48:46 -04:00
29_ampere_3xtf32_fast_accurate_tensorop_complex_gemm CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
30_wgrad_split_k releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
31_basic_syrk [examples] Fix typos in SYRK and TRMM examples (#507) 2022-06-03 22:52:41 -04:00
32_basic_trmm [examples] Fix typos in SYRK and TRMM examples (#507) 2022-06-03 22:52:41 -04:00
33_ampere_3xtf32_tensorop_symm CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
34_transposed_conv2d CUTLASS 2.10 (#615) 2022-09-03 18:48:46 -04:00
35_gemm_softmax releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
36_gather_scatter_fusion CUTLASS 2.10 updates (#622) 2022-09-12 21:26:30 -04:00
37_gemm_layernorm_gemm_fusion releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
38_syr2k_grouped CUTLASS 2.10 updates (#622) 2022-09-12 21:26:30 -04:00
39_gemm_permute releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
40_cutlass_py Make Python interface work for non-SM80 targets (#726) 2022-12-07 21:53:33 -05:00
41_fused_multi_head_attention minor chagnes (#730) 2022-12-10 14:44:53 -05:00
42_ampere_tensorop_group_conv releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
43_ell_block_sparse_gemm releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
44_multi_gemm_ir_and_codegen releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
45_dual_gemm releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
46_depthwise_simt_conv2dfprop releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
47_ampere_gemm_universal_streamk streamk paper link (#765) 2023-01-10 22:10:43 -05:00
common streamk example and performance tuning (#760) 2023-01-10 16:10:02 -05:00
CMakeLists.txt streamk example and performance tuning (#760) 2023-01-10 16:10:02 -05:00