From 57551902d00cbb2134d4a4ae72cc2241ae538524 Mon Sep 17 00:00:00 2001
From: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
Date: Wed, 11 May 2022 00:01:19 -0400
Subject: [PATCH] Update functionality.md

Add some explanations to the functionality table.
---
 media/docs/functionality.md | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/media/docs/functionality.md b/media/docs/functionality.md
index b85b3a39..caa1139b 100644
--- a/media/docs/functionality.md
+++ b/media/docs/functionality.md
@@ -9,6 +9,23 @@
 The following table summarizes device-level GEMM kernels in CUTLASS, organized
 by opcode class, data type, and layout. Hyperlinks to relevant unit tests
 demonstrate how specific template instances may be defined.
+- N - Column Major Matrix
+- T - Row Major Matrix
+- {N,T} x {N,T} - All combinations, i.e. NN, NT, TN, TT
+- [NHWC](/include/cutlass/layout/tensor.h#L63-206) - 4-dimensional tensor used for convolution
+- [NCxHWx](/include/cutlass/layout/tensor.h#L290-395) - Interleaved 4-dimensional tensor used for convolution
+- f - floating point
+- s - signed integer
+- b - bit
+- cf - complex floating point
+- bf16 - bfloat16
+- tf32 - tfloat32
+- Simt - Use SIMT CUDA Core MMA
+- TensorOp - Use Tensor Core MMA
+- SpTensorOp - Use Sparse Tensor Core MMA
+- WmmaTensorOp - Use the WMMA abstraction to access Tensor Core MMA
+
+
 |**Opcode Class** | **Compute Capability** | **CUDA Toolkit** | **Data Type** | **Layouts** | **Unit Test** |
 |-----------------|------------------------|------------------|--------------------------------|------------------------|------------------|
 | **Simt**        | 50,60,61,70,75         | 9.2+             | `f32 * f32 + f32 => f32`       | {N,T} x {N,T} => {N,T} | [example](/test/unit/gemm/device/simt_sgemm_nt_sm50.cu) |