| Manish Gupta | 1227351079 | Merge pull request #305 from NVIDIA/fix_epilogue_spill fix epilogue register spill | 2021-07-29 14:30:11 -07:00 |
| Haicheng Wu | a77c658439 | fix epilogue register spill | 2021-07-29 14:25:48 -07:00 |
| Haicheng Wu | 4516b833ce | Merge pull request #303 from Peter9606/doc_typo Doc typo | 2021-07-28 20:49:06 -04:00 |
| Peter Han | 64dd1e1915 | Doc typo Signed-off-by: Peter Han <fujun.han@iluvatar.ai> | 2021-07-29 08:45:59 +08:00 |
| Manish Gupta | 1ac4559d12 | Cutlass 2.6 Update 1 (#301) * cutlass 2.6 update * remove debug prints | 2021-07-27 17:58:30 -07:00 |
| Manish Gupta | e5d51840e8 | CUTLASS 2.6 (#298) CUTLASS 2.6 | 2021-07-23 00:40:53 -04:00 |
| Haicheng Wu | 6c29fe20ba | Merge pull request #285 from tjingrant/patch-1 Typo Fixes | 2021-07-05 22:51:19 -04:00 |
| Tian Jin | e3c56b0d6b | Update predicated_tile_iterator.h | 2021-07-05 12:11:53 -04:00 |
| Tian Jin | 4647c57243 | Update predicated_tile_iterator.h | 2021-07-05 12:06:41 -04:00 |
| Haicheng Wu | 856d4db3fb | Update basic_gemm.cu fix the matrix malloc size | 2021-06-15 09:08:36 -04:00 |
| Haicheng Wu | 6a1064093f | Merge pull request #274 from mani-ananth/master Some pending Bug fixes | 2021-06-02 13:17:39 -04:00 |
| Manikandan Ananth | c5f1ef4dff | update contributors | 2021-06-02 10:11:42 -07:00 |
| Manikandan Ananth | 47ebfccbec | bug fixes | 2021-06-02 10:08:25 -07:00 |
| Haicheng Wu | ad9486684f | Merge pull request #272 from BernardoCovas/master Bug in reference conv3d | 2021-05-28 17:18:27 -04:00 |
| Bernardo Covas | 1d8372a8e2 | fix typo in reference conv3d | 2021-05-28 21:06:59 +01:00 |
| Haicheng Wu | 9cb7d63424 | Merge pull request #266 from mani-ananth/master Fixes for public issue #265 | 2021-05-19 15:15:22 -04:00 |
| Manikandan Ananth | da2f110906 | Fixes for public issue #265 | 2021-05-19 10:16:52 -07:00 |
| Haicheng Wu | b68113f5be | Merge pull request #264 from zheng95z/patch-3 Adds `NoBetaScaling` for `LinearCombination` | 2021-05-17 10:03:30 -04:00 |
| Zheng Zeng | a68d7cd6f1 | Adds `NoBetaScaling` for `LinearCombination` | 2021-05-12 22:23:55 +08:00 |
| Haicheng Wu | 38e8b29f56 | Merge pull request #259 from hzfan/ignore_pr Add gitignore | 2021-05-10 20:07:53 -04:00 |
| Haozheng Fan | ee7349c94f | fix | 2021-05-10 16:39:04 +08:00 |
| Haozheng Fan | 8cdd4293d4 | add gitignore | 2021-05-10 16:37:59 +08:00 |
| Haicheng Wu | f58b843951 | Merge pull request #239 from KeDengMS/kedeng/gelu Fixes to Gelu for half and fusion | 2021-05-08 12:51:42 -04:00 |
| Haicheng Wu | 5fc142296f | Merge pull request #237 from Peter9606/issue_236_typo Typo fix issue#236 | 2021-05-08 07:51:19 -04:00 |
| Haicheng Wu | 233d69aa6d | Merge pull request #235 from Peter9606/issue_233_tranpose_update tranpose.h update based on issue#233 | 2021-05-07 07:14:30 -04:00 |
| Haicheng Wu | 9840d25269 | Merge pull request #256 from zheng95z/patch-2 Fixes some typos in utilities.md | 2021-05-06 11:02:49 -04:00 |
| Zheng Zeng | b878c96421 | Fixes some typos in utilities.md | 2021-05-06 22:37:37 +08:00 |
| Haicheng Wu | 8f8a80cad5 | Merge pull request #251 from zheng95z/patch-1 add a missing 'device_memory::' before a function | 2021-04-25 22:09:44 -04:00 |
| Zheng Zeng | a8f6f8eb07 | add a missing 'device_memory::' before a function | 2021-04-25 20:05:39 +08:00 |
| mengchi.hmc | f4b0a33633 | add unit test for non int4 load | 2021-04-23 14:33:46 +08:00 |
| Haicheng Wu | 7c783adf53 | Merge pull request #247 from xue-fc/patch-1 fix a wrong description | 2021-04-22 09:27:40 -04:00 |
| xue-fc | 4000df9567 | fix a wrong description | 2021-04-22 20:28:28 +08:00 |
| mengchi.hmc | bb35a3ba6f | support setting load granularity for conv2d fprop | 2021-04-22 15:20:57 +08:00 |
| mengchi.hmc | 7ec3a87f22 | support unalignment input for conv2d fprop stage=2 Fix for issue #242 | 2021-04-21 14:40:05 +08:00 |
| KeDengMS | 0b74c8f473 | Address CR | 2021-04-19 23:36:06 +00:00 |
| KeDengMS | 83036ed646 | More clean up | 2021-04-18 04:29:20 +00:00 |
| KeDengMS | b7e43f5eb9 | Clean up | 2021-04-18 04:24:25 +00:00 |
| KeDengMS | 5c62d892fa | Add test | 2021-04-18 04:09:34 +00:00 |
| KeDengMS | 41a31b404b | Fixes to Gelu for half and fusion | 2021-04-17 22:10:19 +00:00 |
| Peter Han | 7320aee17d | Typo fix issue#236 Signed-off-by: Peter Han <fujun.han@iluvatar.ai> | 2021-04-15 15:08:35 +08:00 |
| Peter Han | 2142a05d9d | tranpose.h update based on issue#233 1. Add 'pragma once' preprocess directive 2. Replace prmt PTX with __byte_perm intrinsic Signed-off-by: Peter Han <fujun.han@iluvatar.ai> | 2021-04-14 19:58:00 +08:00 |
| Haicheng Wu | c77a524459 | Merge pull request #230 from mani-ananth/master Fix for issue #221 | 2021-04-09 14:45:55 -04:00 |
| Manikandan Ananth | fac6680f31 | Merge branch 'master' of github.com:NVIDIA/cutlass | 2021-04-09 11:36:31 -07:00 |
| Manikandan Ananth | 08993707da | fixing functional bug in fused epilogue | 2021-04-09 11:36:03 -07:00 |
| Haicheng Wu | c805593ebe | Merge pull request #228 from mani-ananth/master Fix for issue#224 and issue#225 | 2021-04-08 10:08:13 -04:00 |
| Manikandan Ananth | 26556d7206 | fix a broken sparse gemm example.  found by the community. | 2021-04-07 13:32:55 -07:00 |
| Manikandan Ananth | 4839b6cb61 | add 2stage fprop 3d into default file | 2021-04-07 13:29:32 -07:00 |
| Haicheng Wu | d97214987a | Merge pull request #220 from Peter9606/wrong-stride-array-definition Bugfix: typo, make reduction device cases passed | 2021-04-02 08:43:52 -04:00 |
| Haicheng Wu | b0bbc6d548 | Merge pull request #219 from mani-ananth/master Fix for issue #211 | 2021-04-02 08:42:09 -04:00 |
| Peter Han | 7074047a54 | Bugfix: typo, make reduction device cases passed Signed-off-by: Peter Han <fujun.han@iluvatar.ai> | 2021-04-02 09:35:23 +08:00 |