* Support parallel split K mode for profiling Signed-off-by: Peter Han <fujun.han@iluvatar.ai> * Parallel Split K support 1. find gemm kernel by preference key 2. switch m n for reduction kernel Signed-off-by: Peter Han <fujun.han@iluvatar.ai> * parallel splitk for fp16 gemm * add one missing file Co-authored-by: Haicheng Wu <haichengw@nvidia.com> |
||
|---|---|---|
| reduction | ||
| reference | ||
| conv2d_operation.h | ||
| conv3d_operation.h | ||
| gemm_operation.h | ||
| handle.cu | ||
| library_internal.h | ||
| manifest.cpp | ||
| operation_table.cu | ||
| singleton.cu | ||
| util.cu | ||