Commit Graph

7 Commits

Author SHA1 Message Date
Sean Xiaowen Zhang
21d0534167
fix assertion (#1790) 2024-09-09 14:05:27 -04:00
Vijay Thakkar
629f4653c3
CUTLASS 3.5.0 (#1411) 2024-03-19 17:51:04 -04:00
ANIKET SHIVAM
751eb9a885
Update license year (#1306) 2024-01-16 14:37:22 -05:00
reed
eb01d5449d
fix cp.async L2 prefetch typo (#1187) 2023-11-28 16:58:04 -05:00
reed
6e60b9b17c
enable L2::128B prefetch for cp.async by default (#1177) 2023-11-13 13:30:13 -05:00
Pradeep Ramani
c008b4aea8
CUTLASS 3.3.0 (#1167)
* Release 3.3.0

Adds support for mixed precision GEMMs on Hopper and Ampere
Adds support for < 16B aligned GEMMs on Hopper
Enhancements to EVT
Enhancements to Python interface
Enhancements to sub-byte type handling in CuTe
Several other bug fixes and performance improvements

* minor doc update
2023-11-02 11:09:05 -04:00
Vijay Thakkar
277bd6e537
CUTLASS 3.0.0 (#786)
* CUTLASS 3.0.0
2023-01-23 20:55:28 -05:00