CUTLASS 1.3 Release - Efficient GEMM kernel targeting Volta Tensor Cores via the `mma.sync` instruction added in CUDA 10.1.

| Name |
|---|
| kernel |
| thread |
| gemm.h |
| split_complex_gemm.h |
| tensor_elementwise.h |
| tensor_foreach.h |