cutlass/tools/library/scripts
Vijay Thakkar e01b9b5029
Shard gemm reference templates into multiple TUs for parallel compilation (#1043)
* Split apart gemm reference templates into multiple TUs for parallel compilation

* remove old files

* better balancing of ref kernels across TUs

* remove 3 new added refcheck kernels and some un-necessary fp8 library instances to reduce lib size

* remove auto fp8 kernels

* remove some redundant kernels
2023-08-30 16:46:30 -04:00
__init__.py CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
conv2d_operation.py releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
conv3d_operation.py CUTLASS 2.4 (Implicit GEMM convolution) (#147) 2020-11-19 21:25:25 -08:00
gemm_operation.py Add simple hash and eq methods for gemm_operations. (#1053) 2023-08-27 20:41:57 -04:00
generator.py Shard gemm reference templates into multiple TUs for parallel compilation (#1043) 2023-08-30 16:46:30 -04:00
library.py CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
manifest.py CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
rank_2k_operation.py CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
rank_k_operation.py CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
rt.py New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
symm_operation.py CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
trmm_operation.py CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00