cutlass/examples/python
00_basic_gemm.ipynb CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
01_epilogue.ipynb CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
02_pytorch_extension_grouped_gemm.ipynb CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
03_basic_conv2d.ipynb CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
04_epilogue_visitor.ipynb CUTLASS 3.3.0 (#1167) 2023-11-02 11:09:05 -04:00
README.md CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00

Examples of using the CUTLASS Python interface

  • 00_basic_gemm

    Shows how to declare, configure, compile, and run a CUTLASS GEMM using the Python interface

  • 01_epilogue

    Shows how to fuse elementwise activation functions to GEMMs via the Python interface

  • 02_pytorch_extension_grouped_gemm

    Shows how to declare, compile, and run a grouped GEMM operation via the Python interface, along with how the emitted kernel can be easily exported to a PyTorch CUDA extension.

  • 03_basic_conv2d

    Shows how to declare, configure, compile, and run a CUTLASS Conv2d using the Python interface

  • 04_epilogue_visitor

    Shows how to fuse elementwise activation functions to GEMMs via the Python Epilogue Visitor interface
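
For readers new to the terminology, the following is a minimal pure-Python sketch of what the examples above compute: a GEMM of the form `D = alpha * (A @ B) + beta * C`, an optional elementwise activation fused into the epilogue, and a grouped GEMM as a list of independent problems. It does not use the CUTLASS Python interface. The names `gemm_reference`, `relu`, and `grouped_gemm_reference` are hypothetical helpers introduced here for illustration only, and matrices are plain nested lists.

```python
# Reference semantics only; NOT the CUTLASS API. All names are hypothetical.

def gemm_reference(A, B, C, alpha=1.0, beta=0.0, activation=None):
    """Compute D = activation(alpha * (A @ B) + beta * C) on nested lists."""
    m, k = len(A), len(A[0])
    n = len(B[0])
    if len(B) != k:
        raise ValueError("inner dimensions of A and B must match")
    D = []
    for i in range(m):
        row = []
        for j in range(n):
            # Mainloop: accumulate the dot product of row i of A and column j of B.
            acc = sum(A[i][p] * B[p][j] for p in range(k))
            # Epilogue: scale, add the source matrix C, then apply the activation.
            out = alpha * acc + beta * C[i][j]
            if activation is not None:
                out = activation(out)
            row.append(out)
        D.append(row)
    return D


def relu(x):
    """Elementwise ReLU, used here as the fused activation."""
    return x if x > 0 else 0.0


def grouped_gemm_reference(problems, **kwargs):
    """Run one independent GEMM per (A, B, C) tuple; problem sizes may differ."""
    return [gemm_reference(A, B, C, **kwargs) for (A, B, C) in problems]


A = [[1.0, -2.0], [3.0, 4.0]]
B = [[1.0, 0.0], [0.0, 1.0]]  # identity, so A @ B == A
C = [[0.5, 0.5], [0.5, 0.5]]

D_plain = gemm_reference(A, B, C, alpha=1.0, beta=1.0)
D_relu = gemm_reference(A, B, C, alpha=1.0, beta=1.0, activation=relu)
D_group = grouped_gemm_reference([(A, B, C), ([[2.0]], [[3.0]], [[1.0]])], beta=1.0)
```

A reference like this can serve as a correctness check for small problem sizes. The notebooks themselves compile and run the equivalent operations as GPU kernels.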