cutlass/CONTRIBUTORS.md
Andrew Kerr fb335f6a5f
CUTLASS 2.0 (#62)
CUTLASS 2.0

Substantially refactored for

- Better performance, particularly for native Turing Tensor Cores
- Robust and durable templates spanning the design space
- Encapsulated functionality embodying modern C++11 programming techniques
- Optimized containers and data types for efficient, generic, portable device code

Updates to:
- Quick start guide
- Documentation
- Utilities
- CUTLASS Profiler

Native Turing Tensor Cores
- Efficient GEMM kernels targeting Turing Tensor Cores
- Mixed-precision floating point, 8-bit integer, 4-bit integer, and binarized operands

Coverage of existing CUTLASS functionality:
- GEMM kernels targeting CUDA and Tensor Cores in NVIDIA GPUs
- Volta Tensor Cores through native mma.sync and through WMMA API
- Optimizations such as parallel reductions, threadblock rasterization, and intra-threadblock reductions
- Batched GEMM operations
- Complex-valued GEMMs

Note: this commit and all that follow require a host compiler supporting C++11 or greater.
2019-11-19 16:55:34 -08:00

57 lines
903 B
Markdown

![ALT](/media/images/gemm-hierarchy-with-epilogue-no-labels.png "CUTLASS")
[README](/README.md#documentation) > **Contributors**
# CUTLASS Developers and Contributors
This is the official list of CUTLASS developers and contributors.
## DEVELOPERS
Andrew Kerr
Haicheng Wu
Naila Farooqui
Dustyn Blasig
Pradeep Ramani
Manish Gupta
Aditya Atluri
Paul Springer
David Tanner
Scott Yokim
Jin Wang
## CONTRIBUTORS
Timothy Costa
Julien Demouth
Brian Fahs
Michael Goldfarb
Mostafa Hagog
Markus Hohnerbach
Fei Hu
Alan Kaatz
Tina Li
Timmy Liu
Duane Merrill
Kevin Siu
Markus Tavenrath
John Tran
Vicki Wang
Junkai Wu
Fung Xie
Albert Xu
Jack Yang
Xiuxia Zhang
Nick Zhao
## ACKNOWLEDGEMENTS
Girish Bharambe
Cris Cecka
Luke Durant
Olivier Giroux
Stephen Jones
Rishkul Kulkarni
Bryce Lelbach
Joel McCormack
Kyrylo Perelygin