| Name | Last commit | Date |
| --- | --- | --- |
| 00_basic_gemm | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 01_cutlass_utilities | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 02_dump_reg_shmem | Updates for CUTLASS 3.4.1 (#1346) | 2024-02-15 15:48:34 -05:00 |
| 03_visualize_layout | CUTLASS 3.5.0 (#1411) | 2024-03-19 17:51:04 -04:00 |
| 04_tile_iterator | CUTLASS 3.5.0 (#1411) | 2024-03-19 17:51:04 -04:00 |
| 05_batched_gemm | CUTLASS 3.5.0 (#1411) | 2024-03-19 17:51:04 -04:00 |
| 06_splitK_gemm | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 07_volta_tensorop_gemm | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 08_turing_tensorop_gemm | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 09_turing_tensorop_conv2dfprop | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 10_planar_complex | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 11_planar_complex_array | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 12_gemm_bias_relu | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 13_two_tensor_op_fusion | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 14_ampere_tf32_tensorop_gemm | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 15_ampere_sparse_tensorop_gemm | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 16_ampere_tensorop_conv2dfprop | Updates for CUTLASS 3.5.0 (#1468) | 2024-04-11 21:33:40 -04:00 |
| 17_fprop_per_channel_bias | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 18_ampere_fp64_tensorop_affine2_gemm | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 19_tensorop_canonical | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 20_simt_canonical | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 21_quaternion_gemm | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 22_quaternion_conv | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 23_ampere_gemm_operand_reduction_fusion | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 24_gemm_grouped | CUTLASS 3.5.0 (#1411) | 2024-03-19 17:51:04 -04:00 |
| 25_ampere_fprop_mainloop_fusion | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 26_ampere_wgrad_mainloop_fusion | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 27_ampere_3xtf32_fast_accurate_tensorop_gemm | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 28_ampere_3xtf32_fast_accurate_tensorop_fprop | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 29_ampere_3xtf32_fast_accurate_tensorop_complex_gemm | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 30_wgrad_split_k | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 31_basic_syrk | CUTLASS 3.5.0 (#1411) | 2024-03-19 17:51:04 -04:00 |
| 32_basic_trmm | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 33_ampere_3xtf32_tensorop_symm | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 34_transposed_conv2d | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 35_gemm_softmax | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| 36_gather_scatter_fusion | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 37_gemm_layernorm_gemm_fusion | CUTLASS 3.5.0 (#1411) | 2024-03-19 17:51:04 -04:00 |
| 38_syr2k_grouped | CUTLASS 3.5.0 (#1411) | 2024-03-19 17:51:04 -04:00 |
| 39_gemm_permute | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 40_cutlass_py | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 41_fused_multi_head_attention | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 42_ampere_tensorop_group_conv | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 43_ell_block_sparse_gemm | CUTLASS 3.5.0 (#1411) | 2024-03-19 17:51:04 -04:00 |
| 44_multi_gemm_ir_and_codegen | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 45_dual_gemm | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 46_depthwise_simt_conv2dfprop | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 47_ampere_gemm_universal_streamk | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 48_hopper_warp_specialized_gemm | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| 49_hopper_gemm_with_collective_builder | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 50_hopper_gemm_with_epilogue_swizzle | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 51_hopper_gett | CUTLASS 3.5.0 (#1411) | 2024-03-19 17:51:04 -04:00 |
| 52_hopper_gather_scatter_fusion | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| 53_hopper_gemm_permute | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| 54_hopper_fp8_warp_specialized_gemm | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 55_hopper_mixed_dtype_gemm | Improve sm90 mixed dtype kernel (#1883) | 2024-10-17 20:06:38 -04:00 |
| 56_hopper_ptr_array_batched_gemm | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| 57_hopper_grouped_gemm | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| 58_ada_fp8_gemm | CUTLASS 3.5.1 (#1623) | 2024-07-29 08:46:24 -04:00 |
| 59_ampere_gather_scatter_conv | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| 60_cutlass_import | Update license year (#1306) | 2024-01-16 14:37:22 -05:00 |
| 61_hopper_gemm_with_topk_and_softmax | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| 62_hopper_sparse_gemm | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| 63_hopper_gemm_with_weight_prefetch | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| common | CUTLASS 3.5.0 (#1411) | 2024-03-19 17:51:04 -04:00 |
| cute | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |
| python | Updates for 3.4 release. (#1305) | 2024-01-16 13:42:51 -05:00 |
| CMakeLists.txt | CUTLASS 3.6.0 (#1850) | 2024-10-09 15:33:27 -04:00 |