cutlass/tools/library
Fujun Han 1e4703cbab
Support parallel split K mode for profiling (#277)
* Support parallel split K mode for profiling

Signed-off-by: Peter Han <fujun.han@iluvatar.ai>

* Parallel Split K support

  1. find gemm kernel by preference key
  2. switch m and n for reduction kernel

Signed-off-by: Peter Han <fujun.han@iluvatar.ai>

* parallel splitk for fp16 gemm

* add one missing file

Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2022-01-27 10:37:37 -05:00
..
include/cutlass/library Support parallel split K mode for profiling (#277) 2022-01-27 10:37:37 -05:00
scripts Fix typo in scripts/library.py (wrong data size for u8) (#393) 2022-01-07 13:29:56 -05:00
src Support parallel split K mode for profiling (#277) 2022-01-27 10:37:37 -05:00
CMakeLists.txt Cutlass 2.6 Update 1 (#301) 2021-07-27 17:58:30 -07:00