Vijay Thakkar
be60a0b272
CUTLASS 3.5.1 ( #1623 )
...
* CUTLASS 3.5.1
* updates, optimizations, fixes
2024-07-29 08:46:24 -04:00
Vijay Thakkar
629f4653c3
CUTLASS 3.5.0 ( #1411 )
2024-03-19 17:51:04 -04:00
ANIKET SHIVAM
751eb9a885
Update license year ( #1306 )
2024-01-16 14:37:22 -05:00
ANIKET SHIVAM
2f589ffa76
Updates for 3.4 release. ( #1305 )
2024-01-16 13:42:51 -05:00
Christian Sigg
56fc3df03b
Adding missing typename
( #1191 )
...
Fixes clang build failures.
2023-11-29 00:20:20 -05:00
dan_the_3rd
146d314057
Update fMHA kernels ( #992 )
...
* Update fMHA kernels
Upstream recent changes to fMHA that we did in xFormers.
Previous version in CUTLASS: facebookresearch/xformers@b6be33a
Updating to: facebookresearch/xformers@55a4798
* minor changes
* make var work
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-07-12 22:30:46 -04:00
Alexander Zinoviev
e36912f961
Fix for dangling references in the MHA example ( #918 )
2023-04-19 21:35:46 -04:00
dan_the_3rd
9b8166e3f0
fMHA: Add backward pass ( #844 )
...
* fMHA: Add backward pass
* Better checks for strides/alignments
* Remove fb-internal URL
* torch.Tensor.untyped_storage requires pytorch 2.0+
* minor changes
* make test
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-04-06 20:44:58 -04:00
Alexander Pivovarov
7e370c9637
Fix typos 2 ( #842 )
...
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
2023-03-09 23:22:56 -05:00
dan_the_3rd
f303889ed9
fMHA: Sync FW with xFormers ( #828 )
...
* fMHA: Add support for bias+dropout in FW
* Remove 'getMaximumSharedMemoryPerBlockKb'
* fix comments
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-02-22 23:25:31 -05:00
dan_the_3rd
2e10404d26
xFormer updates to fMHA FW ( #773 )
...
* xFormer updates to fMHA FW
* Convert format to BMHK for '41_fused_multi_head_attention_fixed_seqlen'
* Add missing files
* Remove xFormers specific code
* Update fused_multihead_attention_fixed_seqlen.cu
* rebase and solve conflicts
* remove white space
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-02-08 23:00:10 -05:00
Vijay Thakkar
277bd6e537
CUTLASS 3.0.0 ( #786 )
...
* CUTLASS 3.0.0
2023-01-23 20:55:28 -05:00
ANIKET SHIVAM
66d9cddc83
New updates for 2.11 ( #775 )
...
* New updates.
* Minor profiler updates
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-01-20 16:32:57 -05:00
Haicheng Wu
3f2bb17722
minor chagnes ( #730 )
...
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2022-12-10 14:44:53 -05:00
Aditya Atluri
c975e2ccbb
releaase 2.11 ( #703 )
2022-11-19 09:02:15 -05:00