cutlass/python/docs_src/source/install.md
Pradeep Ramani c008b4aea8
CUTLASS 3.3.0 (#1167)
2023-11-02 11:09:05 -04:00

Installation

Installing from source

Installing from source requires the latest CUDA Toolkit whose major.minor version matches that of the installed CUDA Python package (cuda-python).

Prior to installing the CUTLASS Python interface, one may optionally set the following environment variables:

  • CUTLASS_PATH: the path to the cloned CUTLASS repository
  • CUDA_INSTALL_PATH: the path to the installation of CUDA
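As a minimal sketch, the two variables could be exported in the shell before running the install. Both paths below are hypothetical placeholders, not locations prescribed by CUTLASS; substitute the ones on your system.

```shell
# Hypothetical locations (assumptions for illustration only).
export CUTLASS_PATH="$HOME/src/cutlass"       # root of the cloned CUTLASS repository
export CUDA_INSTALL_PATH="/usr/local/cuda"    # root of the CUDA installation

# Confirm what the installer will see.
echo "CUTLASS_PATH=$CUTLASS_PATH"
echo "CUDA_INSTALL_PATH=$CUDA_INSTALL_PATH"
```

Exporting the variables in the same shell session that runs pip keeps the settings visible to the installation process.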

If these environment variables are not set, the installation process will infer them to be the following:

  • CUTLASS_PATH: if installed locally, one directory level above the current directory (i.e., $(pwd)/..); otherwise, the source directory of the location in which cutlass_library was installed
  • CUDA_INSTALL_PATH: the directory holding /bin/nvcc for the first version of nvcc on $PATH (i.e., which nvcc | awk -F'/bin/nvcc' '{print $1}')
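To preview the value that would be inferred for CUDA_INSTALL_PATH, the awk expression above can be wrapped in a small helper. This is a sketch that mirrors the described fallback, not the installer's actual code; the function name cuda_root_from_nvcc is hypothetical.

```shell
# Strip the trailing /bin/nvcc from a path, mirroring the awk expression above.
# (Hypothetical helper name, introduced only for this illustration.)
cuda_root_from_nvcc() {
  printf '%s\n' "$1" | awk -F'/bin/nvcc' '{print $1}'
}

# Keep an explicit setting if present; otherwise fall back to the first nvcc on $PATH.
if [ -z "${CUDA_INSTALL_PATH:-}" ] && command -v nvcc >/dev/null 2>&1; then
  CUDA_INSTALL_PATH="$(cuda_root_from_nvcc "$(command -v nvcc)")"
fi

echo "CUDA_INSTALL_PATH=${CUDA_INSTALL_PATH:-<not found>}"
```

If the printed directory is not the CUDA installation you intend to use, setting CUDA_INSTALL_PATH explicitly avoids relying on the order of entries in $PATH.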

NOTE: The version of cuda-python installed must match the CUDA version in CUDA_INSTALL_PATH.
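One way to sanity-check this requirement is to compare the major.minor parts of the two version strings. The sketch below assumes that agreement on major.minor is what matters, as stated at the top of this section. The helper names and the example version numbers are hypothetical; in practice the real values can be read from "$CUDA_INSTALL_PATH/bin/nvcc" --version and pip show cuda-python.

```shell
# Reduce a version string such as 12.2.140 to its major.minor part (12.2).
major_minor() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeeds only when both version strings share the same major.minor.
versions_match() {
  [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

# Example values, assumed for illustration only.
toolkit_version="12.2.140"
cuda_python_version="12.2.0"

if versions_match "$toolkit_version" "$cuda_python_version"; then
  echo "versions match"
else
  echo "version mismatch: toolkit $toolkit_version vs cuda-python $cuda_python_version"
fi
```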

Installing a developer-mode package

The CUTLASS Python interface can currently be installed by navigating to the root of the CUTLASS directory and running:

pip install .

If you would like to make changes to the CUTLASS Python interface and have them reflected when using the interface, run:

pip install -e .

Docker

We recommend using the CUTLASS Python interface via an NGC PyTorch Docker container:

docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.08-py3