Commit Graph

67 Commits

Author SHA1 Message Date
ANIKET SHIVAM
751eb9a885
Update license year (#1306) 2024-01-16 14:37:22 -05:00
ANIKET SHIVAM
2f589ffa76
Updates for 3.4 release. (#1305) 2024-01-16 13:42:51 -05:00
Pradeep Ramani
8236f30675
CUTLASS 3.4.0 (#1286)
* CUTLASS 3.4.0

* Update CHANGELOG.md

---------

Co-authored-by: Pradeep Ramani <prramani@nvidia.com>
2023-12-29 15:21:31 -05:00
Valeriy Fedyunin
f188f9b709
Fix typo in quickstart.md (#1257) 2023-12-07 09:49:52 -05:00
wang-y-z
1d7f2a207e
Fix several broken links (#1168)
Co-authored-by: isaacw <isaacw@nvidia.com>
2023-11-03 00:01:25 -04:00
wang-y-z
557be3ab0e
Fix several typos (#1169)
Co-authored-by: isaacw <isaacw@nvidia.com>
2023-11-02 23:54:46 -04:00
Pradeep Ramani
c008b4aea8
CUTLASS 3.3.0 (#1167)
* Release 3.3.0

Adds support for mixed precision GEMMs On Hopper and Ampere
Adds support for < 16B aligned GEMMs on Hopper
Enhancements to EVT
Enhancements to Python interface
Enhancements to Sub-byte type handling in CuTe
Several other bug-fixes and performance improvements.

* minor doc update
2023-11-02 11:09:05 -04:00
milesvant
fb10fa5308
Fix broken pipeline link in docs (#1143) 2023-10-18 12:55:46 -04:00
ANIKET SHIVAM
90d3b0fb18
CUTLASS 3.2.1 (#1113)
* Updates for 3.2.1 release.

* Minor fix in gemm op profiler for raster order.

* Add scheduler mapping for raster order in the kernels.
2023-09-26 17:24:26 -04:00
lorenzo chelini
3930f709ce
Fix typo in 0x_gemm_tutorial.md (#1035) 2023-08-17 10:52:20 -04:00
ANIKET SHIVAM
4575443d44
CUTLASS 3.2 (#1024)
* CUTLASS 3.2
2023-08-07 20:50:32 -04:00
Nathan Wang
9b923dd4c4
fix minor typos (#984) 2023-07-05 09:23:01 -04:00
Vijay Thakkar
fde824af21
Update Hopper performance plot for CUTLASS 3.1 + CTK 12.1 (#967) 2023-06-01 14:52:40 -04:00
ANIKET SHIVAM
f079619f5e
More updates for 3.1 (#958)
* Updates for 3.1

* Minor change

* doc link fix

* Minor updates
2023-05-24 10:17:16 -04:00
Haicheng Wu
6fbc0d3380
Update layout.md 2023-05-17 20:12:58 -04:00
Haicheng Wu
e2953d47c5
Update gemm_api.md 2023-05-12 15:37:31 -04:00
ANIKET SHIVAM
7c04f95415
Updates for 3.1 (#932) 2023-04-29 09:34:27 -04:00
Adnan Akhundov
54bebe417d
Fix some typos in CuTe tutorials (#912) 2023-04-17 16:00:51 -04:00
ANIKET SHIVAM
d572cc1aab
CUTLASS 3.1 (#915)
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-04-14 23:19:34 -04:00
Adnios
0964bdb64c
update gemm and conv2d cmdline --help output (#878) 2023-04-01 11:38:13 -04:00
Alexander Pivovarov
7e370c9637
Fix typos 2 (#842)
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
2023-03-09 23:22:56 -05:00
ANIKET SHIVAM
c4f6b8c6bc
Updates for 3.0 (#857)
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-03-09 15:27:40 -05:00
ZZK
a101ac283f
Fix some typos (#791)
* fix typo

* fix a deadlink to code
2023-02-16 15:56:55 -05:00
Vijay Thakkar
277bd6e537
CUTLASS 3.0.0 (#786)
* CUTLASS 3.0.0
2023-01-23 20:55:28 -05:00
ANIKET SHIVAM
66d9cddc83
New updates for 2.11 (#775)
* New updates.

* Minor profiler updates

Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-01-20 16:32:57 -05:00
tpoisonooo
8567b87d65
Update quickstart.md (#704)
* Update quickstart.md

* Update doxygen_mainpage.md

* Update doxygen_mainpage.md

* Update terminology.md
2022-11-29 21:43:03 -05:00
Aditya Atluri
c975e2ccbb
releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
FZC
cc85b64cf6
fix typo (#677) 2022-11-01 14:07:33 -04:00
ANIKET SHIVAM
b72cbf957d
CUTLASS 2.10 (#615)
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2022-09-03 18:48:46 -04:00
Cliff Burdick
536b20763e
Fixed typo in profiler README (#603) 2022-08-24 21:55:13 -04:00
Haicheng Wu
94f01f19d5
Add implicit gemm perf
plot from @manishucsd, presented in gtc'22 cutlass talk
2022-06-23 22:47:11 -04:00
Haicheng Wu
d6f58b2d14
Update functionality.md 2022-05-11 09:34:24 -04:00
Haicheng Wu
57551902d0
Update functionality.md
add some explanations to the functionality table.
2022-05-11 00:01:19 -04:00
Masahiro Masuda
70f3ba57f5
Fix typo in shared memory layout description (#471) 2022-04-24 18:32:13 -04:00
Andrew Kerr
12f4108ac2
CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
Feng Shijie
cd39c75e25
Fix typo in docs, code comments (#429)
* [docs] fix typo in media/docs/layout.md

* [docs] fix comment error

* fix typo in include/cutlass/arch/simd_61.h

* fix stride comment errors in TensorLayout
2022-03-15 21:54:36 -04:00
Ivan Komarov
e96f00586c
Make cutlass::gemm::device::GemmArray usable (#295)
* Fix the build of cutlass/gemm/device/gemm_array.h and add a demo for GemmArray

* Add a reference to GemmArray to the docs

Co-authored-by: Ivan Komarov <dfyz@yandex-team.ru>
2022-02-17 20:01:05 -05:00
Andrew Kerr
5fe09c2d67
Updated GEMM performance plot with CUTLASS 2.8 compiled with CUDA 11.5 Toolkit (#375)
Updated GEMM performance plot with CUTLASS 2.8 compiled using CUDA 11.5 Toolkit.

GPUs under test:

    NVIDIA A100
    NVIDIA A2
    NVIDIA TitanV
    NVIDIA GeForce 2080 Ti
2021-12-06 14:21:33 -05:00
Manish Gupta
808c25337a
CUTLASS 2.8 (#363)
CUTLASS 2.8
2021-11-19 13:26:35 -08:00
Haicheng Wu
6fc5008803
Update quickstart.md
fix a broken link
2021-11-11 09:53:46 -05:00
Manish Gupta
6c2f8f2fb8
CUTLASS 2.6.1 - functional and performance enhancements to strided DGRAD, fixes, and tuning
* cutlass 2.6 update

* remove debug prints

* cutlass 2.6.1 (minor update)

* Updated CHANGELOG.

* Minor edit to readme to indicate patch version.

* Minor edit to readme.

Co-authored-by:  Haicheng Wu <haichengw@nvidia.com>, Andrew Kerr <akerr@nvidia.com>
2021-09-03 10:26:15 -07:00
dongxiao
d36f331b44
fix typo in doc
fix typo
2021-08-08 16:44:22 +08:00
Haicheng Wu
10709dbb64 clean profiler cmd and doc 2021-07-30 11:02:17 -07:00
Peter Han
64dd1e1915 Doc typo
Signed-off-by: Peter Han <fujun.han@iluvatar.ai>
2021-07-29 08:45:59 +08:00
Manish Gupta
1ac4559d12
Cutlass 2.6 Update 1 (#301)
* cutlass 2.6 update

* remove debug prints
2021-07-27 17:58:30 -07:00
Manish Gupta
e5d51840e8
CUTLASS 2.6 (#298)
CUTLASS 2.6
2021-07-23 00:40:53 -04:00
Zheng Zeng
b878c96421
Fixes some typos in utilities.md 2021-05-06 22:37:37 +08:00
Andrew Kerr
0e13748649 CUTLASS 2.5 2021-02-26 09:58:26 -05:00
Manish Gupta
ccb697bac7 cutlass 2.4 documentation only update 2020-11-23 06:59:45 -06:00
Yang Wang
e6bcdc60cf
fix broken links (#148) 2020-11-19 21:46:54 -08:00