* Add couple configs into generator.py for mixed input MM
* change one unit test name; reenable 128x32 in the profiler
* Added U8/BF16 tests.

---------

Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>

||
|---|---|---|
| device | ||
| kernel | ||
| thread | ||
| threadblock | ||
| warp | ||
| CMakeLists.txt | ||