35 Commits

Author SHA1 Message Date
Andrew Kerr
96dab34ad9
CUTLASS 2.1 (#83)
CUTLASS 2.1 contributes:
- BLAS-style host-side API added to CUTLASS Library
- Planar Complex GEMM kernels targeting Volta and Turing Tensor Cores
- Minor enhancements and bug fixes
2020-04-07 13:51:25 -07:00
Andrew Kerr
8aca98f9a7
Improved formatting, clarity, and content of several documents. (#64)
2019-11-20 10:42:15 -08:00
Andrew Kerr
fb335f6a5f
CUTLASS 2.0 (#62)
CUTLASS 2.0

Substantially refactored for

- Better performance, particularly for native Turing Tensor Cores
- Robust and durable templates spanning the design space
- Encapsulated functionality embodying modern C++11 programming techniques
- Optimized containers and data types for efficient, generic, portable device code

Updates to:
- Quick start guide
- Documentation
- Utilities
- CUTLASS Profiler

Native Turing Tensor Cores
- Efficient GEMM kernels targeting Turing Tensor Cores
- Mixed-precision floating point, 8-bit integer, 4-bit integer, and binarized operands

Coverage of existing CUTLASS functionality:
- GEMM kernels targeting CUDA and Tensor Cores in NVIDIA GPUs
- Volta Tensor Cores through native mma.sync and through WMMA API
- Optimizations such as parallel reductions, threadblock rasterization, and intra-threadblock reductions
- Batched GEMM operations
- Complex-valued GEMMs

Note: this commit and all that follow require a host compiler supporting C++11 or greater.
2019-11-19 16:55:34 -08:00
Andrew Kerr
b5cab177a9
Performance enhancement for Volta Tensor Cores TN layout (#53)
* Fixed performance defect with indirect access to pointer array for Volta Tensor Cores TN layout.

* Updated patch version and changelog.

* Added link to changelog in readme.

* Fixed markdown link
2019-07-10 10:54:12 -07:00
Timmy
fe3438a3c1 cutlass 1.3.1 (#46)
CUTLASS 1.3.1 patch resolves failing test with NVRTC.
2019-04-19 16:54:52 -07:00
Andrew Kerr
877bdcace6
Cutlass 1.3 Release (#42)
CUTLASS 1.3 Release
- Efficient GEMM kernel targeting Volta Tensor Cores via mma.sync instruction added in CUDA 10.1.
2019-03-20 10:49:17 -07:00
Andrew Kerr
19a9d64e3c
Removed patch version from README.
2018-12-19 15:20:43 -08:00
akerr
74df0331f2 CUTLASS 1.2 2018-10-26 14:38:46 -07:00
Andrew Kerr
69e3709da4
Fixed typo
2018-09-28 12:59:20 -07:00
akerr
1a7ac522f8 Clarification to README 2018-09-20 11:04:03 -07:00
akerr
77d1e0ca81 Updated README and CHANGELOG. 2018-09-19 20:42:51 -07:00
akerr
461f417b9d Checkpointing CUTLASS 1.1 release. 2018-09-18 16:58:03 -07:00
akerr
b9bb0d1a49 Edits to README and changelog pursuant to CUTLASS 1.0.1 patch. 2018-06-26 13:57:39 -07:00
akerr
480732c2e8 Minor updates to usage and readme. 2018-05-17 15:10:55 -07:00
akerr
acb90e962a Updated url to Doxygen and modified usage statement in performance test program. 2018-05-17 11:11:05 -07:00
akerr
923dfb42ce Updated README.md 2018-05-16 12:50:10 -07:00
akerr
6f6f269a0a Updated README.md 2018-05-16 12:47:07 -07:00
akerr
2028ebe120 CUTLASS v1.0 release 2018-05-16 11:44:56 -07:00
Duane Merrill
95b0578d34 Update license info 2017-12-06 10:00:59 -05:00
Duane Merrill
f4b48c7669
Update README.md 2017-12-05 22:58:46 -05:00
Duane Merrill
6cb88d53eb
Update README.md 2017-12-05 22:58:12 -05:00
Duane Merrill
537a4bcedf
Update README.md 2017-12-05 22:54:49 -05:00
Duane Merrill
5bd3f09312
Update README.md 2017-12-05 22:53:11 -05:00
Duane Merrill
6f091f5620
Update README.md 2017-12-05 22:44:01 -05:00
dumerrill
0428c89fd5 Updating readme with relative per chart 2017-12-05 22:40:47 -05:00
Duane Merrill
e2bf51c3fe
Update README.md 2017-12-05 22:25:42 -05:00
Duane Merrill
57747e382e
Update README.md 2017-12-05 21:32:06 -05:00
Duane Merrill
dd4dd4cebf
Update README.md 2017-12-05 20:58:01 -05:00
Duane Merrill
6565b48747
Update README.md 2017-12-05 20:56:49 -05:00
Duane Merrill
73211bbb88
Update README.md 2017-12-05 20:55:54 -05:00
Duane Merrill
9dcb2b4c7d
Update README.md 2017-12-05 20:55:03 -05:00
Duane Merrill
f30abfc00a
Update README.md 2017-12-05 20:50:15 -05:00
dumerrill
8ebd6b06d0 Replace svg with png+text 2017-12-05 20:20:25 -05:00
dumerrill
04ffa156e8 Adding figure to readme.md 2017-12-05 20:15:33 -05:00
akerr
bbb3178126 Initial commit 2017-12-04 08:07:48 -08:00