Edits to README and changelog pursuant to the CUTLASS 1.0.1 patch.
parent e1c4ba501b
commit b9bb0d1a49
@@ -2,6 +2,8 @@
 
 # CUTLASS 1.0
 
+_CUTLASS 1.0.1 - June 2018_
+
 CUTLASS 1.0 is a collection of CUDA C++ template abstractions for implementing
 high-performance matrix-multiplication (GEMM) at all levels and scales within CUDA.
 It incorporates strategies for hierarchical decomposition and data movement similar
@@ -1,6 +1,6 @@
 # NVIDIA CUTLASS Changelog
 
-## 1.0.1 (2018-06-11)
+## [1.0.1](https://github.com/NVIDIA/cutlass/releases/tag/v1.0.1) (2018-06-11)
 
 * Intra-threadblock reduction added for small threadblock tile sizes
 * sgemm_64x128x16, sgemm_128x128x16, sgemm_128x64x16, sgemm_128x32x16, sgemm_64x64x16, sgemm_64x32x16