Author | Commit | Message | Date
Andrew Kerr | 877bdcace6 | CUTLASS 1.3 Release (#42): Efficient GEMM kernel targeting Volta Tensor Cores via mma.sync instruction added in CUDA 10.1. | 2019-03-20 10:49:17 -07:00
akerr | 74df0331f2 | CUTLASS 1.2 | 2018-10-26 14:38:46 -07:00
akerr | 0826572c4c | Reduced range of random values to avoid bit-level inconsistencies for large matrices. | 2018-09-19 21:11:48 -07:00
akerr | 77d1e0ca81 | Updated README and CHANGELOG. | 2018-09-19 20:42:51 -07:00
akerr | 461f417b9d | Checkpointing CUTLASS 1.1 release. | 2018-09-18 16:58:03 -07:00
akerr | 374882be53 | Replaced GoogleTest copy with submodule. Added updates to support intra-threadblock reductions. Added tests for same. | 2018-06-11 11:47:15 -07:00
akerr | 480732c2e8 | Minor updates to usage and readme. | 2018-05-17 15:10:55 -07:00
akerr | acb90e962a | Updated url to Doxygen and modified usage statement in performance test program. | 2018-05-17 11:11:05 -07:00
akerr | 2028ebe120 | CUTLASS v1.0 release | 2018-05-16 11:44:56 -07:00