diff --git a/README.md b/README.md
index 1f1bef8e..3fab8fe2 100644
--- a/README.md
+++ b/README.md
@@ -51,7 +51,7 @@ CUTLASS 2.8 is an update to CUTLASS adding:
 # Performance
-

+

 CUTLASS primitives are very efficient. When used to construct device-wide GEMM kernels, they exhibit performance comparable to cuBLAS for scalar GEMM
diff --git a/media/images/cutlass-2.8-gemm-performance.png b/media/images/cutlass-2.8-gemm-performance.png
new file mode 100644
index 00000000..6fb63578
Binary files /dev/null and b/media/images/cutlass-2.8-gemm-performance.png differ