cutlass/examples
Andrew Kerr c53f3339bb
CUTLASS 2.3 initial commit (#134)
CUTLASS 2.3 adds GEMMs targeting Sparse Tensor Cores on the NVIDIA Ampere Architecture, fast SGEMM, small matrix classes, bug fixes, and performance enhancements.
2020-09-23 14:00:58 -07:00
00_basic_gemm CUTLASS 2.2 (#96) 2020-06-08 16:17:35 -07:00
01_cutlass_utilities CUTLASS 2.2 (#96) 2020-06-08 16:17:35 -07:00
02_dump_reg_shmem CUTLASS 2.3 initial commit (#134) 2020-09-23 14:00:58 -07:00
03_visualize_layout Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (#100) 2020-06-15 10:47:01 -07:00
04_tile_iterator CUTLASS 2.2 (#96) 2020-06-08 16:17:35 -07:00
05_batched_gemm CUTLASS 2.2 (#96) 2020-06-08 16:17:35 -07:00
06_splitK_gemm Typoes (#107) 2020-07-13 14:25:52 -07:00
07_volta_tensorop_gemm Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (#100) 2020-06-15 10:47:01 -07:00
08_turing_tensorop_gemm Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (#100) 2020-06-15 10:47:01 -07:00
10_planar_complex CUTLASS 2.3 initial commit (#134) 2020-09-23 14:00:58 -07:00
11_planar_complex_array CUTLASS 2.3 initial commit (#134) 2020-09-23 14:00:58 -07:00
12_gemm_bias_relu Typoes (#107) 2020-07-13 14:25:52 -07:00
13_fused_two_gemms CUTLASS 2.3 initial commit (#134) 2020-09-23 14:00:58 -07:00
14_ampere_tf32_tensorop_gemm CUTLASS 2.3 initial commit (#134) 2020-09-23 14:00:58 -07:00
15_ampere_sparse_tensorop_gemm CUTLASS 2.3 initial commit (#134) 2020-09-23 14:00:58 -07:00
common CUTLASS 2.0 (#62) 2019-11-19 16:55:34 -08:00
CMakeLists.txt CUTLASS 2.3 initial commit (#134) 2020-09-23 14:00:58 -07:00