Commit Graph

62 Commits

Author SHA1 Message Date
Andrew Kerr
96dab34ad9
CUTLASS 2.1 (#83)
CUTLASS 2.1 contributes:
- BLAS-style host-side API added to CUTLASS Library
- Planar Complex GEMM kernels targeting Volta and Turing Tensor Cores
- Minor enhancements and bug fixes
2020-04-07 13:51:25 -07:00
Andrew Kerr
7c0cd26d13
Need Python 3.6 to use enum.auto() (#70) 2019-11-22 09:39:12 -08:00
Andrew Kerr
8aca98f9a7
Improved formatting, clarity, and content of several documents. (#64)
2019-11-20 10:42:15 -08:00
Andrew Kerr
fb335f6a5f
CUTLASS 2.0 (#62)
CUTLASS 2.0

Substantially refactored for

- Better performance, particularly for native Turing Tensor Cores
- Robust and durable templates spanning the design space
- Encapsulated functionality embodying modern C++11 programming techniques
- Optimized containers and data types for efficient, generic, portable device code

Updates to:
- Quick start guide
- Documentation
- Utilities
- CUTLASS Profiler

Native Turing Tensor Cores
- Efficient GEMM kernels targeting Turing Tensor Cores
- Mixed-precision floating point, 8-bit integer, 4-bit integer, and binarized operands

Coverage of existing CUTLASS functionality:
- GEMM kernels targeting CUDA and Tensor Cores in NVIDIA GPUs
- Volta Tensor Cores through native mma.sync and through WMMA API
- Optimizations such as parallel reductions, threadblock rasterization, and intra-threadblock reductions
- Batched GEMM operations
- Complex-valued GEMMs

Note: this commit and all that follow require a host compiler supporting C++11 or greater.
2019-11-19 16:55:34 -08:00
Andrew Kerr
877bdcace6
Cutlass 1.3 Release (#42)
CUTLASS 1.3 Release
- Efficient GEMM kernel targeting Volta Tensor Cores via mma.sync instruction added in CUDA 10.1.
2019-03-20 10:49:17 -07:00
akerr
74df0331f2 CUTLASS 1.2 2018-10-26 14:38:46 -07:00
akerr
77d1e0ca81 Updated README and CHANGELOG. 2018-09-19 20:42:51 -07:00
akerr
461f417b9d Checkpointing CUTLASS 1.1 release. 2018-09-18 16:58:03 -07:00
akerr
2028ebe120 CUTLASS v1.0 release 2018-05-16 11:44:56 -07:00
dumerrill
0428c89fd5 Updating readme with relative perf chart 2017-12-05 22:40:47 -05:00
dumerrill
8ebd6b06d0 Replace svg with png+text 2017-12-05 20:20:25 -05:00
dumerrill
04ffa156e8 Adding figure to readme.md 2017-12-05 20:15:33 -05:00