00_basic_gemm | Fix typos 2 (#842) | 2023-03-09 23:22:56 -05:00
01_cutlass_utilities | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
02_dump_reg_shmem | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
03_visualize_layout | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
04_tile_iterator | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
05_batched_gemm | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
06_splitK_gemm | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
07_volta_tensorop_gemm | Fix typos 2 (#842) | 2023-03-09 23:22:56 -05:00
08_turing_tensorop_gemm | CUTLASS 3.2.1 (#1113) | 2023-09-26 17:24:26 -04:00
09_turing_tensorop_conv2dfprop | CUTLASS 3.2.1 (#1113) | 2023-09-26 17:24:26 -04:00
10_planar_complex | CUTLASS 3.2 (#1024) | 2023-08-07 20:50:32 -04:00
11_planar_complex_array | CUTLASS 3.2 (#1024) | 2023-08-07 20:50:32 -04:00
12_gemm_bias_relu | CUTLASS 3.2.1 (#1113) | 2023-09-26 17:24:26 -04:00
13_two_tensor_op_fusion | CUTLASS 3.2.1 (#1113) | 2023-09-26 17:24:26 -04:00
14_ampere_tf32_tensorop_gemm | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
15_ampere_sparse_tensorop_gemm | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
16_ampere_tensorop_conv2dfprop | style(examples): typo (#1080) | 2023-09-11 10:13:22 -04:00
17_fprop_per_channel_bias | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
18_ampere_fp64_tensorop_affine2_gemm | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
19_tensorop_canonical | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
20_simt_canonical | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
21_quaternion_gemm | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
22_quaternion_conv | CUTLASS 3.1 (#915) | 2023-04-14 23:19:34 -04:00
23_ampere_gemm_operand_reduction_fusion | style(examples): typo (#1080) | 2023-09-11 10:13:22 -04:00
24_gemm_grouped | CUTLASS 3.2.1 (#1113) | 2023-09-26 17:24:26 -04:00
25_ampere_fprop_mainloop_fusion | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
26_ampere_wgrad_mainloop_fusion | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
27_ampere_3xtf32_fast_accurate_tensorop_gemm | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
28_ampere_3xtf32_fast_accurate_tensorop_fprop | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
29_ampere_3xtf32_fast_accurate_tensorop_complex_gemm | CUTLASS 3.1 (#915) | 2023-04-14 23:19:34 -04:00
30_wgrad_split_k | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
31_basic_syrk | Updates for 3.1 (#932) | 2023-04-29 09:34:27 -04:00
32_basic_trmm | Updates for 3.1 (#932) | 2023-04-29 09:34:27 -04:00
33_ampere_3xtf32_tensorop_symm | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
34_transposed_conv2d | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
35_gemm_softmax | Increase max dynamic SMEM size in GemmSoftmax (#903) | 2023-04-03 10:01:12 -04:00
36_gather_scatter_fusion | CUTLASS 3.2 (#1024) | 2023-08-07 20:50:32 -04:00
37_gemm_layernorm_gemm_fusion | CUTLASS 3.0.0 (#786) | 2023-01-23 20:55:28 -05:00
38_syr2k_grouped | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
39_gemm_permute | CUTLASS 3.2 (#1024) | 2023-08-07 20:50:32 -04:00
40_cutlass_py | CUTLASS 3.2.1 (#1113) | 2023-09-26 17:24:26 -04:00
41_fused_multi_head_attention | Update fMHA kernels (#992) | 2023-07-12 22:30:46 -04:00
42_ampere_tensorop_group_conv | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
43_ell_block_sparse_gemm | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
44_multi_gemm_ir_and_codegen | CUTLASS 3.2.1 (#1113) | 2023-09-26 17:24:26 -04:00
45_dual_gemm | Replace 0x1f with 0xffffffff in __shfl_sync (#1097) | 2023-09-18 19:58:19 -04:00
46_depthwise_simt_conv2dfprop | CUTLASS 3.1 (#915) | 2023-04-14 23:19:34 -04:00
47_ampere_gemm_universal_streamk | CUTLASS 3.3.0 (#1167) | 2023-11-02 11:09:05 -04:00
48_hopper_warp_specialized_gemm | CUTLASS 3.3.0 (#1167) | 2023-11-02 11:09:05 -04:00
49_hopper_gemm_with_collective_builder | CUTLASS 3.3.0 (#1167) | 2023-11-02 11:09:05 -04:00
50_hopper_gemm_with_epilogue_swizzle | CUTLASS 3.2 (#1024) | 2023-08-07 20:50:32 -04:00
51_hopper_gett | CUTLASS 3.1 (#915) | 2023-04-14 23:19:34 -04:00
52_hopper_gather_scatter_fusion | CUTLASS 3.3.0 (#1167) | 2023-11-02 11:09:05 -04:00
53_hopper_gemm_permute | CUTLASS 3.2 (#1024) | 2023-08-07 20:50:32 -04:00
54_hopper_fp8_warp_specialized_gemm | CUTLASS 3.2.1 (#1113) | 2023-09-26 17:24:26 -04:00
55_hopper_mixed_dtype_gemm | CUTLASS 3.3.0 (#1167) | 2023-11-02 11:09:05 -04:00
60_cutlass_import | New updates for 2.11 (#775) | 2023-01-20 16:32:57 -05:00
common | CUTLASS 3.1 (#915) | 2023-04-14 23:19:34 -04:00
cute | CUTLASS 3.1 (#915) | 2023-04-14 23:19:34 -04:00
python | CUTLASS 3.3.0 (#1167) | 2023-11-02 11:09:05 -04:00
CMakeLists.txt | CUTLASS 3.3.0 (#1167) | 2023-11-02 11:09:05 -04:00