* ensure singleton::get thread safe construct instance * fix singleton return reference Co-authored-by: xuweiqi <xuweiqi117@gmail.com>
* cutlass 2.6 update * remove debug prints
CUTLASS 2.1 contributes: - BLAS-style host-side API added to CUTLASS Library - Planar Complex GEMM kernels targeting Volta and Turing Tensor Cores - Minor enhancements and bug fixes