CUTLASS 2.3 adds GEMMs targeting Sparse Tensor Cores on the NVIDIA Ampere Architecture, fast SGEMM, small matrix classes, bug fixes, and performance enhancements.
Adds support for NVIDIA Ampere Architecture features. The CUDA 11 Toolkit is recommended.