dan_the_3rd
|
9b8166e3f0
|
fMHA: Add backward pass (#844)
* fMHA: Add backward pass
* Better checks for strides/alignments
* Remove fb-internal URL
* torch.Tensor.untyped_storage requires pytorch 2.0+
* minor changes
* make test
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
|
2023-04-06 20:44:58 -04:00 |
|
Alexander Pivovarov
|
7e370c9637
|
Fix typos 2 (#842)
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
|
2023-03-09 23:22:56 -05:00 |
|
dan_the_3rd
|
f303889ed9
|
fMHA: Sync FW with xFormers (#828)
* fMHA: Add support for bias+dropout in FW
* Remove 'getMaximumSharedMemoryPerBlockKb'
* fix comments
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
|
2023-02-22 23:25:31 -05:00 |
|
ANIKET SHIVAM
|
66d9cddc83
|
New updates for 2.11 (#775)
* New updates.
* Minor profiler updates
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
|
2023-01-20 16:32:57 -05:00 |
|
Aditya Atluri
|
c975e2ccbb
|
release 2.11 (#703)
|
2022-11-19 09:02:15 -05:00 |
|