Commit Graph

16 Commits

Author SHA1 Message Date
Vijay Thakkar
7d49e6c7e2
Updates for CUTLASS 3.5.0 (#1468) 2024-04-11 21:33:40 -04:00
Vijay Thakkar
629f4653c3
CUTLASS 3.5.0 (#1411) 2024-03-19 17:51:04 -04:00
ANIKET SHIVAM
751eb9a885
Update license year (#1306) 2024-01-16 14:37:22 -05:00
ANIKET SHIVAM
2f589ffa76
Updates for 3.4 release. (#1305) 2024-01-16 13:42:51 -05:00
ANIKET SHIVAM
4575443d44
CUTLASS 3.2 (#1024)
* CUTLASS 3.2
2023-08-07 20:50:32 -04:00
Jack Kosaian
7dbf423763
Add conversion from ElementBias to ElementCompute (#961) 2023-05-26 23:08:36 -04:00
ANIKET SHIVAM
f079619f5e
More updates for 3.1 (#958)
* Updates for 3.1

* Minor change

* doc link fix

* Minor updates
2023-05-24 10:17:16 -04:00
ANIKET SHIVAM
d572cc1aab
CUTLASS 3.1 (#915)
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-04-14 23:19:34 -04:00
Edward Rees
86cae03cea
expose StoreT parameter for potential speed (#838)
* expose StoreT parameter for potential speed

* add storeT to more elementwise

---------

Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-03-10 12:58:17 -05:00
Shuai Shao
ce8597dc14
Fix type bug in conv2d/gemm with broadcast (#796)
add ElementVector

---------

Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-02-09 20:53:25 -05:00
ANIKET SHIVAM
66d9cddc83
New updates for 2.11 (#775)
* New updates.

* Minor profiler updates

Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-01-20 16:32:57 -05:00
Aditya Atluri
c975e2ccbb
releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
Andrew Kerr
12f4108ac2
CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
Andrew Kerr
ec4f7e5194
Updates to fused epilogue (#383)
* Enhancements and fixes to fused GEMM and Convolution epilogue.
* Need to explicitly list cudart as unit test library dependency.
2021-12-17 16:04:43 -05:00
Manish Gupta
1ac4559d12
Cutlass 2.6 Update 1 (#301)
* cutlass 2.6 update

* remove debug prints
2021-07-27 17:58:30 -07:00
Manish Gupta
e5d51840e8
CUTLASS 2.6 (#298)
CUTLASS 2.6
2021-07-23 00:40:53 -04:00