Vijay Thakkar
be60a0b272
CUTLASS 3.5.1 ( #1623 )
...
* CUTLASS 3.5.1
* updates, optimizations, fixes
2024-07-29 08:46:24 -04:00
Vijay Thakkar
7d49e6c7e2
Updates for CUTLASS 3.5.0 ( #1468 )
2024-04-11 21:33:40 -04:00
Driss Guessous
47a3ebbea9
Add a missing platform include ( #1328 )
2024-02-03 01:30:32 -05:00
ANIKET SHIVAM
751eb9a885
Update license year ( #1306 )
2024-01-16 14:37:22 -05:00
Pradeep Ramani
8236f30675
CUTLASS 3.4.0 ( #1286 )
...
* CUTLASS 3.4.0
* Update CHANGELOG.md
---------
Co-authored-by: Pradeep Ramani <prramani@nvidia.com>
2023-12-29 15:21:31 -05:00
Changho Hwang
1ab6cc7b68
Fix std::abs
overloading for bfloat16_t
( #1179 )
2023-11-13 13:29:45 -05:00
ANIKET SHIVAM
90d3b0fb18
CUTLASS 3.2.1 ( #1113 )
...
* Updates for 3.2.1 release.
* Minor fix in gemm op profiler for raster order.
* Add scheduler mapping for raster order in the kernels.
2023-09-26 17:24:26 -04:00
ANIKET SHIVAM
66d9cddc83
New updates for 2.11 ( #775 )
...
* New updates.
* Minor profiler updates
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-01-20 16:32:57 -05:00
Aditya Atluri
c975e2ccbb
releaase 2.11 ( #703 )
2022-11-19 09:02:15 -05:00
Andrew Kerr
12f4108ac2
CUTLASS 2.9 ( #468 )
2022-04-23 15:02:38 -04:00
Manish Gupta
1ac4559d12
Cutlass 2.6 Update 1 ( #301 )
...
* cutlass 2.6 update
* remove debug prints
2021-07-27 17:58:30 -07:00
Manish Gupta
e5d51840e8
CUTLASS 2.6 ( #298 )
...
CUTLASS 2.6
2021-07-23 00:40:53 -04:00
Andrew Kerr
0e13748649
CUTLASS 2.5
2021-02-26 09:58:26 -05:00
Andrew Kerr
c53f3339bb
CUTLASS 2.3 initial commit ( #134 )
...
CUTLASS 2.3 adds GEMMs targeting Sparse Tensor Cores on the NVIDIA Ampere Architecture, fast SGEMM, and small matrix classes, bug fixes, and performance enhancements.
2020-09-23 14:00:58 -07:00
Andrew Kerr
86931fef85
CUTLASS 2.2 ( #96 )
...
Adds support for NVIDIA Ampere Architecture features. CUDA 11 Toolkit recommended.
2020-06-08 16:17:35 -07:00