cutlass/cmake/version.h.in
Andrew Kerr fb335f6a5f
CUTLASS 2.0 (#62)
CUTLASS 2.0

Substantially refactored for:

- Better performance, particularly for native Turing Tensor Cores
- Robust and durable templates spanning the design space
- Encapsulated functionality embodying modern C++11 programming techniques
- Optimized containers and data types for efficient, generic, portable device code

Updates to:
- Quick start guide
- Documentation
- Utilities
- CUTLASS Profiler

Native Turing Tensor Cores:
- Efficient GEMM kernels targeting Turing Tensor Cores
- Mixed-precision floating point, 8-bit integer, 4-bit integer, and binarized operands

Coverage of existing CUTLASS functionality:
- GEMM kernels targeting CUDA and Tensor Cores in NVIDIA GPUs
- Volta Tensor Cores through native mma.sync and through WMMA API
- Optimizations such as parallel reductions, threadblock rasterization, and intra-threadblock reductions
- Batched GEMM operations
- Complex-valued GEMMs

Note: this commit and all that follow require a host compiler supporting C++11 or greater.
2019-11-19 16:55:34 -08:00


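// Guard against multiple inclusion; the inline functions below would
// otherwise be redefined within a single translation unit.
#pragma once

#include <cstdint>
#include <string>

// Version components. The @...@ placeholders are substituted by CMake when
// this template is configured.
#define CUTLASS_MAJOR @CUTLASS_VERSION_MAJOR@
#define CUTLASS_MINOR @CUTLASS_VERSION_MINOR@
#define CUTLASS_PATCH @CUTLASS_VERSION_PATCH@
#define CUTLASS_BUILD @CUTLASS_VERSION_BUILD@

// Packed version number, e.g. 2.0.0 -> 200. The encoding is only unambiguous
// while the minor and patch components stay below 10.
#define CUTLASS_VERSION ((CUTLASS_MAJOR)*100 + (CUTLASS_MINOR)*10 + CUTLASS_PATCH)

namespace cutlass {

  /// Returns the packed version number (see CUTLASS_VERSION).
  inline uint32_t getVersion() {
    return CUTLASS_VERSION;
  }

  /// Returns the major version component.
  inline uint32_t getVersionMajor() {
    return CUTLASS_MAJOR;
  }

  /// Returns the minor version component.
  inline uint32_t getVersionMinor() {
    return CUTLASS_MINOR;
  }

  /// Returns the patch version component.
  inline uint32_t getVersionPatch() {
    return CUTLASS_PATCH;
  }

  /// Returns the build number. The trailing "+ 0" keeps the expression valid
  /// (and equal to 0) if the build placeholder is substituted with nothing.
  inline uint32_t getVersionBuild() {
    return CUTLASS_BUILD + 0;
  }

  /// Returns the version string, with ".<build>" appended when the build
  /// number is nonzero.
  inline std::string getVersionString() {
    std::string version = "@CUTLASS_VERSION@";
    if (getVersionBuild()) {
      version += "." + std::to_string(getVersionBuild());
    }
    return version;
  }

  /// Returns the source revision string substituted at configure time.
  inline std::string getGitRevision() {
    return "@CUTLASS_REVISION@";
  }

} // namespace cutlass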
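
The packing used by CUTLASS_VERSION and the string assembly in getVersionString() can be tried out without running CMake. The following is a minimal standalone sketch, not part of the header: packVersion, versionAtLeast, and describeVersion are hypothetical helper names, and it assumes that @CUTLASS_VERSION@ expands to a "major.minor.patch" string.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: same arithmetic as the CUTLASS_VERSION macro,
// major*100 + minor*10 + patch.
constexpr std::uint32_t packVersion(std::uint32_t major, std::uint32_t minor,
                                    std::uint32_t patch) {
  return major * 100 + minor * 10 + patch;
}

// Hypothetical helper: true if a packed version is at least major.minor.patch.
constexpr bool versionAtLeast(std::uint32_t current, std::uint32_t major,
                              std::uint32_t minor, std::uint32_t patch) {
  return current >= packVersion(major, minor, patch);
}

// Hypothetical helper mirroring getVersionString(): builds "major.minor.patch"
// (an assumption about what @CUTLASS_VERSION@ expands to) and appends
// ".build" only when the build number is nonzero.
inline std::string describeVersion(std::uint32_t major, std::uint32_t minor,
                                   std::uint32_t patch, std::uint32_t build) {
  std::string version = std::to_string(major) + "." + std::to_string(minor) +
                        "." + std::to_string(patch);
  if (build) {
    version += "." + std::to_string(build);
  }
  return version;
}
```

With these helpers, packVersion(2, 0, 0) yields 200, and describeVersion(2, 0, 0, 5) yields "2.0.0.5". Note that the packing is only unambiguous while minor and patch stay below 10: packVersion(1, 10, 0) also yields 200, so a comparison against the packed value should be treated with care once a component reaches two digits.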