wang-y-z 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1d7f2a207e 
							
						 
					 
					
						
						
							
							Fix several broken links ( #1168 )  
						
						
						
						
						Co-authored-by: isaacw <isaacw@nvidia.com> 
						
					 
					
						2023-11-03 00:01:25 -04:00 
						 
				 
			
				
					
						
							
							
								wang-y-z 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							557be3ab0e 
							
						 
					 
					
						
						
							
							Fix several typos ( #1169 )  
						
						
						
						
						Co-authored-by: isaacw <isaacw@nvidia.com> 
						
					 
					
						2023-11-02 23:54:46 -04:00 
						 
				 
			
				
					
						
							
							
								Pradeep Ramani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c008b4aea8 
							
						 
					 
					
						
						
							
							CUTLASS 3.3.0 ( #1167 )  
						
						
						
						
						* Release 3.3.0
Adds support for mixed precision GEMMs On Hopper and Ampere
Adds support for < 16B aligned GEMMs on Hopper
Enhancements to EVT
Enhancements to Python interface
Enhancements to Sub-byte type handling in CuTe
Several other bug-fixes and performance improvements.
* minor doc update 
						
					 
					
						2023-11-02 11:09:05 -04:00 
						 
				 
			
				
					
						
							
							
								milesvant 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fb10fa5308 
							
						 
					 
					
						
						
							
							Fix broken pipeline link in docs ( #1143 )  
						
						
						
					 
					
						2023-10-18 12:55:46 -04:00 
						 
				 
			
				
					
						
							
							
								ANIKET SHIVAM 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							90d3b0fb18 
							
						 
					 
					
						
						
							
							CUTLASS 3.2.1 ( #1113 )  
						
						
						
						
						* Updates for 3.2.1 release.
* Minor fix in gemm op profiler for raster order.
* Add scheduler mapping for raster order in the kernels. 
						
					 
					
						2023-09-26 17:24:26 -04:00 
						 
				 
			
				
					
						
							
							
								lorenzo chelini 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3930f709ce 
							
						 
					 
					
						
						
							
							Fix typo in 0x_gemm_tutorial.md ( #1035 )  
						
						
						
					 
					
						2023-08-17 10:52:20 -04:00 
						 
				 
			
				
					
						
							
							
								ANIKET SHIVAM 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4575443d44 
							
						 
					 
					
						
						
							
							CUTLASS 3.2 ( #1024 )  
						
						
						
						
						* CUTLASS 3.2 
						
					 
					
						2023-08-07 20:50:32 -04:00 
						 
				 
			
				
					
						
							
							
								Nathan Wang 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9b923dd4c4 
							
						 
					 
					
						
						
							
							fix minor typos ( #984 )  
						
						
						
					 
					
						2023-07-05 09:23:01 -04:00 
						 
				 
			
				
					
						
							
							
								Vijay Thakkar 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fde824af21 
							
						 
					 
					
						
						
							
							Update Hopper performance plot for CUTLASS 3.1 + CTK 12.1 ( #967 )  
						
						
						
					 
					
						2023-06-01 14:52:40 -04:00 
						 
				 
			
				
					
						
							
							
								ANIKET SHIVAM 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f079619f5e 
							
						 
					 
					
						
						
							
							More updates for 3.1 ( #958 )  
						
						
						
						
						* Updates for 3.1
* Minor change
* doc link fix
* Minor updates 
						
					 
					
						2023-05-24 10:17:16 -04:00 
						 
				 
			
				
					
						
							
							
								Haicheng Wu 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6fbc0d3380 
							
						 
					 
					
						
						
							
							Update layout.md  
						
						
						
					 
					
						2023-05-17 20:12:58 -04:00 
						 
				 
			
				
					
						
							
							
								Haicheng Wu 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e2953d47c5 
							
						 
					 
					
						
						
							
							Update gemm_api.md  
						
						
						
					 
					
						2023-05-12 15:37:31 -04:00 
						 
				 
			
				
					
						
							
							
								ANIKET SHIVAM 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7c04f95415 
							
						 
					 
					
						
						
							
							Updates for 3.1 ( #932 )  
						
						
						
					 
					
						2023-04-29 09:34:27 -04:00 
						 
				 
			
				
					
						
							
							
								Adnan Akhundov 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							54bebe417d 
							
						 
					 
					
						
						
							
							Fix some typos in CuTe tutorials ( #912 )  
						
						
						
					 
					
						2023-04-17 16:00:51 -04:00 
						 
				 
			
				
					
						
							
							
								ANIKET SHIVAM 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d572cc1aab 
							
						 
					 
					
						
						
							
							CUTLASS 3.1 ( #915 )  
						
						
						
						
						Co-authored-by: Aniket Shivam <ashivam@nvidia.com> 
						
					 
					
						2023-04-14 23:19:34 -04:00 
						 
				 
			
				
					
						
							
							
								Adnios 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0964bdb64c 
							
						 
					 
					
						
						
							
							update gemm and conv2d cmdline --help output ( #878 )  
						
						
						
					 
					
						2023-04-01 11:38:13 -04:00 
						 
				 
			
				
					
						
							
							
								Alexander Pivovarov 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7e370c9637 
							
						 
					 
					
						
						
							
							Fix typos 2 ( #842 )  
						
						
						
						
						Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com> 
						
					 
					
						2023-03-09 23:22:56 -05:00 
						 
				 
			
				
					
						
							
							
								ANIKET SHIVAM 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c4f6b8c6bc 
							
						 
					 
					
						
						
							
							Updates for 3.0 ( #857 )  
						
						
						
						
						Co-authored-by: Aniket Shivam <ashivam@nvidia.com> 
						
					 
					
						2023-03-09 15:27:40 -05:00 
						 
				 
			
				
					
						
							
							
								ZZK 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a101ac283f 
							
						 
					 
					
						
						
							
							Fix some typos ( #791 )  
						
						
						
						
						* fix typo
* fix a deadlink to code 
						
					 
					
						2023-02-16 15:56:55 -05:00 
						 
				 
			
				
					
						
							
							
								Vijay Thakkar 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							277bd6e537 
							
						 
					 
					
						
						
							
							CUTLASS 3.0.0 ( #786 )  
						
						
						
						
						* CUTLASS 3.0.0 
						
					 
					
						2023-01-23 20:55:28 -05:00 
						 
				 
			
				
					
						
							
							
								ANIKET SHIVAM 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							66d9cddc83 
							
						 
					 
					
						
						
							
							New updates for 2.11 ( #775 )  
						
						
						
						
						* New updates.
* Minor profiler updates
Co-authored-by: Aniket Shivam <ashivam@nvidia.com> 
						
					 
					
						2023-01-20 16:32:57 -05:00 
						 
				 
			
				
					
						
							
							
								tpoisonooo 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8567b87d65 
							
						 
					 
					
						
						
							
							Update quickstart.md ( #704 )  
						
						
						
						
						* Update quickstart.md
* Update doxygen_mainpage.md
* Update doxygen_mainpage.md
* Update terminology.md 
						
					 
					
						2022-11-29 21:43:03 -05:00 
						 
				 
			
				
					
						
							
							
								Aditya Atluri 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c975e2ccbb 
							
						 
					 
					
						
						
							
							releaase 2.11 ( #703 )  
						
						
						
					 
					
						2022-11-19 09:02:15 -05:00 
						 
				 
			
				
					
						
							
							
								FZC 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cc85b64cf6 
							
						 
					 
					
						
						
							
							fix typo ( #677 )  
						
						
						
					 
					
						2022-11-01 14:07:33 -04:00 
						 
				 
			
				
					
						
							
							
								ANIKET SHIVAM 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b72cbf957d 
							
						 
					 
					
						
						
							
							CUTLASS 2.10 ( #615 )  
						
						
						
						
						Co-authored-by: Aniket Shivam <ashivam@nvidia.com> 
						
					 
					
						2022-09-03 18:48:46 -04:00 
						 
				 
			
				
					
						
							
							
								Cliff Burdick 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							536b20763e 
							
						 
					 
					
						
						
							
							Fixed typo in profiler README ( #603 )  
						
						
						
					 
					
						2022-08-24 21:55:13 -04:00 
						 
				 
			
				
					
						
							
							
								Haicheng Wu 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							94f01f19d5 
							
						 
					 
					
						
						
							
							Add implicit gemm perf  
						
						
						
						
						plot from @manishucsd, presented in gtc'22 cutlass talk 
						
					 
					
						2022-06-23 22:47:11 -04:00 
						 
				 
			
				
					
						
							
							
								Haicheng Wu 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d6f58b2d14 
							
						 
					 
					
						
						
							
							Update functionality.md  
						
						
						
					 
					
						2022-05-11 09:34:24 -04:00 
						 
				 
			
				
					
						
							
							
								Haicheng Wu 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							57551902d0 
							
						 
					 
					
						
						
							
							Update functionality.md  
						
						
						
						
						add some explanations to the functionality table. 
						
					 
					
						2022-05-11 00:01:19 -04:00 
						 
				 
			
				
					
						
							
							
								Masahiro Masuda 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							70f3ba57f5 
							
						 
					 
					
						
						
							
							Fix typo in shared memory layout description ( #471 )  
						
						
						
					 
					
						2022-04-24 18:32:13 -04:00 
						 
				 
			
				
					
						
							
							
								Andrew Kerr 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							12f4108ac2 
							
						 
					 
					
						
						
							
							CUTLASS 2.9 ( #468 )  
						
						
						
					 
					
						2022-04-23 15:02:38 -04:00 
						 
				 
			
				
					
						
							
							
								Feng Shijie 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cd39c75e25 
							
						 
					 
					
						
						
							
							Fix typo in docs, code comments ( #429 )  
						
						
						
						
						* [docs] fix typo in media/docs/layout.md
* [docs] fix comment error
* fix typo in include/cutlass/arch/simd_61.h
* fix stride comment errors in TensorLayout 
						
					 
					
						2022-03-15 21:54:36 -04:00 
						 
				 
			
				
					
						
							
							
								Ivan Komarov 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e96f00586c 
							
						 
					 
					
						
						
							
							Make cutlass::gemm::device::GemmArray usable ( #295 )  
						
						
						
						
						* Fix the build of cutlass/gemm/device/gemm_array.h and add a demo for GemmArray
* Add a reference to GemmArray to the docs
Co-authored-by: Ivan Komarov <dfyz@yandex-team.ru> 
						
					 
					
						2022-02-17 20:01:05 -05:00 
						 
				 
			
				
					
						
							
							
								Andrew Kerr 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5fe09c2d67 
							
						 
					 
					
						
						
							
							Updated GEMM performance plot with CUTLASS 2.8 compiled with CUDA 11.5 Toolkit ( #375 )  
						
						
						
						
						Updated GEMM performance plot with CUTLASS 2.8 compiled using CUDA 11.5 Toolkit.
GPUs under test:
    NVIDIA A100
    NVIDIA A2
    NVIDIA TitanV
    NVIDIA GeForce 2080 Ti 
						
					 
					
						2021-12-06 14:21:33 -05:00 
						 
				 
			
				
					
						
							
							
								Manish Gupta 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							808c25337a 
							
						 
					 
					
						
						
							
							CUTLASS 2.8 ( #363 )  
						
						
						
						
						CUTLASS 2.8 
						
					 
					
						2021-11-19 13:26:35 -08:00 
						 
				 
			
				
					
						
							
							
								Haicheng Wu 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6fc5008803 
							
						 
					 
					
						
						
							
							Update quickstart.md  
						
						
						
						
						fix a broken link 
						
					 
					
						2021-11-11 09:53:46 -05:00 
						 
				 
			
				
					
						
							
							
								Manish Gupta 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6c2f8f2fb8 
							
						 
					 
					
						
						
							
							CUTLASS 2.6.1 - functional and performance enhancements to strided DGRAD, fixes, and tuning  
						
						
						
						
						* cutlass 2.6 update
* remove debug prints
* cutlass 2.6.1 (minor update)
* Updated CHANGELOG.
* Minor edit to readme to indicate patch version.
* Minor edit to readme.
Co-authored-by:  Haicheng Wu <haichengw@nvidia.com>, Andrew Kerr <akerr@nvidia.com> 
						
					 
					
						2021-09-03 10:26:15 -07:00 
						 
				 
			
				
					
						
							
							
								dongxiao 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d36f331b44 
							
						 
					 
					
						
						
							
							fix typo in doc  
						
						
						
						
						fix typo 
						
					 
					
						2021-08-08 16:44:22 +08:00 
						 
				 
			
				
					
						
							
							
								Haicheng Wu 
							
						 
					 
					
						
						
						
						
							
						
						
							10709dbb64 
							
						 
					 
					
						
						
							
							clean profiler cmd and doc  
						
						
						
					 
					
						2021-07-30 11:02:17 -07:00 
						 
				 
			
				
					
						
							
							
								Peter Han 
							
						 
					 
					
						
						
						
						
							
						
						
							64dd1e1915 
							
						 
					 
					
						
						
							
							Doc typo  
						
						
						
						
						Signed-off-by: Peter Han <fujun.han@iluvatar.ai> 
						
					 
					
						2021-07-29 08:45:59 +08:00 
						 
				 
			
				
					
						
							
							
								Manish Gupta 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1ac4559d12 
							
						 
					 
					
						
						
							
							Cutlass 2.6 Update 1 ( #301 )  
						
						
						
						
						* cutlass 2.6 update
* remove debug prints 
						
					 
					
						2021-07-27 17:58:30 -07:00 
						 
				 
			
				
					
						
							
							
								Manish Gupta 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e5d51840e8 
							
						 
					 
					
						
						
							
							CUTLASS 2.6 ( #298 )  
						
						
						
						
						CUTLASS 2.6 
						
					 
					
						2021-07-23 00:40:53 -04:00 
						 
				 
			
				
					
						
							
							
								Zheng Zeng 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b878c96421 
							
						 
					 
					
						
						
							
							Fixes some typos in utilities.md  
						
						
						
					 
					
						2021-05-06 22:37:37 +08:00 
						 
				 
			
				
					
						
							
							
								Andrew Kerr 
							
						 
					 
					
						
						
						
						
							
						
						
							0e13748649 
							
						 
					 
					
						
						
							
							CUTLASS 2.5  
						
						
						
					 
					
						2021-02-26 09:58:26 -05:00 
						 
				 
			
				
					
						
							
							
								Manish Gupta 
							
						 
					 
					
						
						
						
						
							
						
						
							ccb697bac7 
							
						 
					 
					
						
						
							
							cutlass 2.4 documentation only update  
						
						
						
					 
					
						2020-11-23 06:59:45 -06:00 
						 
				 
			
				
					
						
							
							
								Yang Wang 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e6bcdc60cf 
							
						 
					 
					
						
						
							
							fix broken links ( #148 )  
						
						
						
					 
					
						2020-11-19 21:46:54 -08:00 
						 
				 
			
				
					
						
							
							
								Manish Gupta 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6615010cd0 
							
						 
					 
					
						
						
							
							CUTLASS 2.4 (Implicit GEMM convolution) ( #147 )  
						
						
						
						
						CUTLASS 2.4 (Implicit GEMM Convolution)
Co-authored-by: Manish Gupta <manigupta@nvidia.com>, Haicheng Wu <haichengw@nvidia.com>, Dustyn Blasig <dblasig@nvidia.com>, Andrew Kerr <akerr@nvidia.com> 
						
					 
					
						2020-11-19 21:25:25 -08:00 
						 
				 
			
				
					
						
							
							
								Andrew Kerr 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c53f3339bb 
							
						 
					 
					
						
						
							
							CUTLASS 2.3 initial commit ( #134 )  
						
						
						
						
						CUTLASS 2.3 adds GEMMs targeting Sparse Tensor Cores on the NVIDIA Ampere Architecture, fast SGEMM, and small matrix classes, bug fixes, and performance enhancements. 
						
					 
					
						2020-09-23 14:00:58 -07:00 
						 
				 
			
				
					
						
							
							
								Andrew Kerr 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fd7e058d0c 
							
						 
					 
					
						
						
							
							Added examples to enable the unity build ( #102 )  
						
						
						
						
						* Updated documentation of fused GEMM example and removed UNITY BUILD batch size. The default batch size when unity build is enabled tends to be favorable. 
						
					 
					
						2020-06-17 07:09:18 -07:00 
						 
				 
			
				
					
						
							
							
								Andrew Kerr 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1ab1027954 
							
						 
					 
					
						
						
							
							Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. ( #100 )  
						
						
						
						
						- Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>.
- Enhancement to CUTLASS Utility Library's HostTensorPlanarComplex template to support copy-in and copy-out
- Added test_examples target to build and test all CUTLASS examples
- Minor edits to documentation to point to GTC 2020 webinar 
						
					 
					
						2020-06-15 10:47:01 -07:00