Merge pull request #15 from NVIDIA/release_1.0.1_edits

Minor edits to README and changelog pursuant to the CUTLASS 1.0.1 patch.
Andrew Kerr 2018-06-26 13:59:01 -07:00 committed by GitHub
commit cf0301e00f
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 3 additions and 1 deletion

@@ -2,6 +2,8 @@
# CUTLASS 1.0
_CUTLASS 1.0.1 - June 2018_
CUTLASS 1.0 is a collection of CUDA C++ template abstractions for implementing
high-performance matrix-multiplication (GEMM) at all levels and scales within CUDA.
It incorporates strategies for hierarchical decomposition and data movement similar

@@ -1,6 +1,6 @@
# NVIDIA CUTLASS Changelog
-## 1.0.1 (2018-06-11)
+## [1.0.1](https://github.com/NVIDIA/cutlass/releases/tag/v1.0.1) (2018-06-11)
* Intra-threadblock reduction added for small threadblock tile sizes
* sgemm_64x128x16, sgemm_128x128x16, sgemm_128x64x16, sgemm_128x32x16, sgemm_64x64x16, sgemm_64x32x16