cutlass/tools
Manish Gupta 757275f279
Adding more Threadblock Tiles for Mixed-input TensorOp (BF16 * S8) in cutlass_library (#1132)
* Adding more tiles in the cutlass_library for mixed-input support.

* fix rebase issue

* more tiles to upcast a
2023-10-13 11:33:15 -04:00
library         Adding more Threadblock Tiles for Mixed-input TensorOp (BF16 * S8) in cutlass_library (#1132)  2023-10-13 11:33:15 -04:00
profiler        Fix Parallel Split-K on Gemm Operation Profiler (#1109)                                        2023-09-26 17:28:00 -04:00
util            Allow changing epsilon parameter in RMS norm kernel (#1112)                                    2023-10-02 20:40:28 -04:00
CMakeLists.txt  CUTLASS 3.2 (#1024)                                                                            2023-08-07 20:50:32 -04:00