* Updated documentation of the fused GEMM example and removed the UNITY BUILD batch size. The default batch size when unity build is enabled tends to be favorable.
Adds support for NVIDIA Ampere Architecture features. The CUDA 11 Toolkit is recommended.