CUTLASS 1.3 Release - Efficient GEMM kernel targeting Volta Tensor Cores via mma.sync instruction added in CUDA 10.1.

| File |
|---|
| batched_reduction_testbed.h |
| batched_reduction.cu |
| mixed_batched_reduction.cu |
| test_batched_reduction.h |