Merge pull request #15 from NVIDIA/release_1.0.1_edits

Minor edits to README and changelog pursuant to the CUTLASS 1.0.1 patch.
Andrew Kerr 2018-06-26 13:59:01 -07:00 committed by GitHub
commit cf0301e00f
2 changed files with 3 additions and 1 deletions

@@ -2,6 +2,8 @@
 # CUTLASS 1.0
+_CUTLASS 1.0.1 - June 2018_
 CUTLASS 1.0 is a collection of CUDA C++ template abstractions for implementing
 high-performance matrix-multiplication (GEMM) at all levels and scales within CUDA.
 It incorporates strategies for hierarchical decomposition and data movement similar

@@ -1,6 +1,6 @@
 # NVIDIA CUTLASS Changelog
-## 1.0.1 (2018-06-11)
+## [1.0.1](https://github.com/NVIDIA/cutlass/releases/tag/v1.0.1) (2018-06-11)
 * Intra-threadblock reduction added for small threadblock tile sizes
 * sgemm_64x128x16, sgemm_128x128x16, sgemm_128x64x16, sgemm_128x32x16, sgemm_64x64x16, sgemm_64x32x16