CUTLASS 1.3 Release - Efficient GEMM kernel targeting Volta Tensor Cores via mma.sync instruction added in CUDA 10.1. |
||
---|---|---|
cutlass-performance-plot.png | ||
cutlass-threadblock-gemm.png | ||
cutlass-tile-iteration.png | ||
cutlass-tile-structure.png | ||
cutlass-warp-thread-tile-structure.png | ||
gemm-hierarchy-with-epilogue-no-labels.png | ||
gemm-hierarchy-with-epilogue.png | ||
gemm-structural-components.png |