Commit Graph

8 Commits

Author SHA1 Message Date
Andrew Kerr
b5cab177a9
Performance enhancement for Volta Tensor Cores TN layout (#53)
* Fixed performance defect with indirect access to pointer array for Volta TensorCores TN arrangement.

* Updated patch version and changelog.

* Updated patch version and changelog.

* Added link to changelog in readme.

* Fixed markdown link
2019-07-10 10:54:12 -07:00
Timmy
fe3438a3c1
cutlass 1.3.1 (#46)
CUTLASS 1.3.1 patch resolves failing test with NVRTC.
2019-04-19 16:54:52 -07:00
Andrew Kerr
877bdcace6
Cutlass 1.3 Release (#42)
CUTLASS 1.3 Release
- Efficient GEMM kernel targeting Volta Tensor Cores via mma.sync instruction added in CUDA 10.1.
2019-03-20 10:49:17 -07:00
akerr
74df0331f2
CUTLASS 1.2
2018-10-26 14:38:46 -07:00
akerr
77d1e0ca81
Updated README and CHANGELOG.
2018-09-19 20:42:51 -07:00
akerr
461f417b9d
Checkpointing CUTLASS 1.1 release.
2018-09-18 16:58:03 -07:00
akerr
374882be53
Replaced GoogleTest copy with submodule. Added updates to support intra-threadblock reductions. Added tests for same.
2018-06-11 11:47:15 -07:00
akerr
2028ebe120
CUTLASS v1.0 release
2018-05-16 11:44:56 -07:00