Commit Graph

3 Commits

Author SHA1 Message Date
Andrew Kerr
c53f3339bb
CUTLASS 2.3 initial commit (#134)
CUTLASS 2.3 adds GEMMs targeting Sparse Tensor Cores on the NVIDIA Ampere Architecture, fast SGEMM, and small matrix classes, bug fixes, and performance enhancements.
2020-09-23 14:00:58 -07:00
Andrew Kerr
fd7e058d0c
Added examples to enable the unity build (#102)
* Updated documentation of fused GEMM example and removed UNITY BUILD batch size. The default batch size when unity build is enabled tends to be favorable.
2020-06-17 07:09:18 -07:00
Andrew Kerr
86931fef85
CUTLASS 2.2 (#96)
Adds support for NVIDIA Ampere Architecture features. CUDA 11 Toolkit recommended.
2020-06-08 16:17:35 -07:00