![]() * Removed trivial copy constructors on parameter classes to enable device-side launch of CUTLASS kernels * Added SFINAE to the `TensorRef(NonConstTensorRef const&)` constructor to avoid making it a copy-constructor for device code * std => platform * fix affine2 * really fix affine2 Co-authored-by: Haicheng Wu <haichengw@nvidia.com> |
||
---|---|---|
.. | ||
ampere_fp64_tensorop_affine2_gemm.cu | ||
CMakeLists.txt |