Merge pull request #15 from NVIDIA/release_1.0.1_edits
Minor edits to README and changelog pursuant to the CUTLASS 1.0.1 patch.
commit cf0301e00f
@@ -2,6 +2,8 @@

 # CUTLASS 1.0

+_CUTLASS 1.0.1 - June 2018_
+
 CUTLASS 1.0 is a collection of CUDA C++ template abstractions for implementing
 high-performance matrix-multiplication (GEMM) at all levels and scales within CUDA.
 It incorporates strategies for hierarchical decomposition and data movement similar
@@ -1,6 +1,6 @@
 # NVIDIA CUTLASS Changelog

-## 1.0.1 (2018-06-11)
+## [1.0.1](https://github.com/NVIDIA/cutlass/releases/tag/v1.0.1) (2018-06-11)

 * Intra-threadblock reduction added for small threadblock tile sizes
   * sgemm_64x128x16, sgemm_128x128x16, sgemm_128x64x16, sgemm_128x32x16, sgemm_64x64x16, sgemm_64x32x16