cutlass/examples
Andrew Kerr fd7e058d0c
Added examples to enable the unity build (#102)
* Updated documentation of fused GEMM example and removed UNITY BUILD batch size. The default batch size when unity build is enabled tends to be favorable.
2020-06-17 07:09:18 -07:00
Name                     Last updated                Last commit
00_basic_gemm            2020-06-08 16:17:35 -07:00  CUTLASS 2.2 (#96)
01_cutlass_utilities     2020-06-08 16:17:35 -07:00  CUTLASS 2.2 (#96)
02_dump_reg_shmem        2020-06-08 16:17:35 -07:00  CUTLASS 2.2 (#96)
03_visualize_layout      2020-06-15 10:47:01 -07:00  Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (#100)
04_tile_iterator         2020-06-08 16:17:35 -07:00  CUTLASS 2.2 (#96)
05_batched_gemm          2020-06-08 16:17:35 -07:00  CUTLASS 2.2 (#96)
06_splitK_gemm           2020-06-15 10:47:01 -07:00  Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (#100)
07_volta_tensorop_gemm   2020-06-15 10:47:01 -07:00  Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (#100)
08_turing_tensorop_gemm  2020-06-15 10:47:01 -07:00  Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (#100)
10_planar_complex        2020-06-08 16:17:35 -07:00  CUTLASS 2.2 (#96)
11_planar_complex_array  2020-06-08 16:17:35 -07:00  CUTLASS 2.2 (#96)
12_gemm_bias_relu        2020-06-08 16:17:35 -07:00  CUTLASS 2.2 (#96)
13_fused_two_gemms       2020-06-17 07:09:18 -07:00  Added examples to enable the unity build (#102)
common                   2019-11-19 16:55:34 -08:00  CUTLASS 2.0 (#62)
CMakeLists.txt           2020-06-15 10:47:01 -07:00  Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (#100)