cutlass/docs/search/all_17.js

var searchData=
[
  ['xor_5fadd',['xor_add',['../structcutlass_1_1xor__add.html',1,'cutlass']]]
];