Vijay Thakkar
be60a0b272
CUTLASS 3.5.1 (#1623)
* CUTLASS 3.5.1
* updates, optimizations, fixes
2024-07-29 08:46:24 -04:00

Vijay Thakkar
629f4653c3
CUTLASS 3.5.0 (#1411)
2024-03-19 17:51:04 -04:00

ANIKET SHIVAM
751eb9a885
Update license year (#1306)
2024-01-16 14:37:22 -05:00

ANIKET SHIVAM
2f589ffa76
Updates for 3.4 release. (#1305)
2024-01-16 13:42:51 -05:00

Christian Sigg
56fc3df03b
Adding missing typename (#1191)
Fixes clang build failures.
2023-11-29 00:20:20 -05:00

dan_the_3rd
146d314057
Update fMHA kernels (#992)
* Update fMHA kernels
Upstream recent changes to fMHA that we did in xFormers.
Previous version in CUTLASS: facebookresearch/xformers@b6be33a
Updating to: facebookresearch/xformers@55a4798
* minor changes
* make var work
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-07-12 22:30:46 -04:00

Alexander Zinoviev
e36912f961
Fix for dangling references in the MHA example (#918)
2023-04-19 21:35:46 -04:00

dan_the_3rd
9b8166e3f0
fMHA: Add backward pass (#844)
* fMHA: Add backward pass
* Better checks for strides/alignments
* Remove fb-internal URL
* torch.Tensor.untyped_storage requires pytorch 2.0+
* minor changes
* make test
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-04-06 20:44:58 -04:00

Alexander Pivovarov
7e370c9637
Fix typos 2 (#842)
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
2023-03-09 23:22:56 -05:00

dan_the_3rd
f303889ed9
fMHA: Sync FW with xFormers (#828)
* fMHA: Add support for bias+dropout in FW
* Remove 'getMaximumSharedMemoryPerBlockKb'
* fix comments
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-02-22 23:25:31 -05:00

dan_the_3rd
2e10404d26
xFormer updates to fMHA FW (#773)
* xFormer updates to fMHA FW
* Convert format to BMHK for '41_fused_multi_head_attention_fixed_seqlen'
* Add missing files
* Remove xFormers specific code
* Update fused_multihead_attention_fixed_seqlen.cu
* rebase and solve conflicts
* remove white space
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-02-08 23:00:10 -05:00

Vijay Thakkar
277bd6e537
CUTLASS 3.0.0 (#786)
* CUTLASS 3.0.0
2023-01-23 20:55:28 -05:00

ANIKET SHIVAM
66d9cddc83
New updates for 2.11 (#775)
* New updates.
* Minor profiler updates
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-01-20 16:32:57 -05:00

Haicheng Wu
3f2bb17722
minor chagnes (#730)
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2022-12-10 14:44:53 -05:00

Aditya Atluri
c975e2ccbb
releaase 2.11 (#703)
2022-11-19 09:02:15 -05:00