* Removed trivial copy constructors on parameter classes to enable device-side launch of CUTLASS kernels
* Added SFINAE to the `TensorRef(NonConstTensorRef const&)` constructor to avoid making it a copy-constructor for device code
* std => platform
* fix affine2
* really fix affine2

Co-authored-by: Haicheng Wu <haichengw@nvidia.com>

| File |
|---|
| ampere_fp64_tensorop_affine2_gemm.cu |
| CMakeLists.txt |
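
The commit note about adding SFINAE to the `TensorRef(NonConstTensorRef const&)` constructor can be illustrated with a small standalone sketch. It is not the CUTLASS implementation. `SimpleRef` is a hypothetical stand-in for `TensorRef`, and it uses `std::` traits rather than the `platform` ones mentioned in the commit. The idea it assumes: when the element type is already non-const, `NonConstRef` is the class itself, so a plain `SimpleRef(NonConstRef const&)` would be a user-provided copy constructor. Making it a constrained constructor template avoids that, since a constructor template is never a copy constructor, and the type stays trivially copyable.

```cpp
#include <type_traits>

// Hypothetical, simplified stand-in for a tensor reference type.
template <typename Element>
class SimpleRef {
public:
  // Same reference type with const stripped from the element type.
  using NonConstRef = SimpleRef<typename std::remove_const<Element>::type>;

  SimpleRef() = default;
  explicit SimpleRef(Element *ptr) : ptr_(ptr) {}

  // Converting constructor: non-const ref -> const ref.
  // It is a template and is enabled only when Element is const, so it can
  // never act as (or replace) the implicitly generated copy constructor.
  template <typename E = Element,
            typename = typename std::enable_if<std::is_const<E>::value>::type>
  SimpleRef(NonConstRef const &other) : ptr_(other.data()) {}

  Element *data() const { return ptr_; }

private:
  Element *ptr_ = nullptr;
};

// Both variants keep their implicit, trivial copy operations.
static_assert(std::is_trivially_copyable<SimpleRef<float>>::value,
              "non-const ref stays trivially copyable");
static_assert(std::is_trivially_copyable<SimpleRef<float const>>::value,
              "const ref stays trivially copyable");

// The const variant can still be built from the non-const one.
static_assert(std::is_constructible<SimpleRef<float const>,
                                    SimpleRef<float> const &>::value,
              "non-const to const conversion is available");
```

The constraint here is placed on a defaulted template parameter; the actual CUTLASS change may express the same condition differently.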