Commit Graph

5 Commits

Author SHA1 Message Date
Andrew Kerr
877bdcace6
Cutlass 1.3 Release (#42)
CUTLASS 1.3 Release
- Efficient GEMM kernel targeting Volta Tensor Cores via mma.sync instruction added in CUDA 10.1.
2019-03-20 10:49:17 -07:00
akerr
822b0952cd Resolved issue for incorrect SGEMM on Maxwell architecture. 2018-12-19 15:07:16 -08:00
akerr
74df0331f2 CUTLASS 1.2 2018-10-26 14:38:46 -07:00
akerr
cfe4b933ef CUDA 9 lacks host-side conversions from float=>half. Instead, we must reinterpret_cast<> from cutlass::half_t => half. 2018-09-29 15:04:20 -07:00
akerr
461f417b9d Checkpointing CUTLASS 1.1 release. 2018-09-18 16:58:03 -07:00