cutlass/test/unit/core
Manish Gupta 7d8317a63e
Support for Mixed Input TensorOp (#1084)
* Passing warp-level mixed input F16*(S8/U8) tests

* passing device-level mixed input F16*(S8/U8) tests

* add to profiler - I8 (111 TFLOPs), U8 (123 TFLOPs)

* fast numeric conversions (I8 = 132 TFLOPs, U8 = 148 TFLOPs)

* Speedup reference compilation (REVERT THIS COMMIT)

* wider_add.u32_packed_sub.f16x2 (I8 = 132 TFLOP/s, U8 = 170 TFLOP/s)

* Improve s8->f16 cvt and support bf16*u8 @158 TFLOPs

* BF16 * S8 (142 TFLOPs)

* Handle mixed-input upcast on OperandA (Support [S8|U8]*[F16|BF16])

* rename OpMultiplyAddMixedInput to OpMultiplyAddMixedInputUpcast

* Add device-level test and profiler support for upcast on operand A

* Move shfl before the cvt and reduce #shfls by 1/2

* fix smem_usage calculation for mixed_input types

* uncomment the stuff (getting ready for merge)

* profiler changes and mixed-input reference

* mixed input reference are in a new file

* use platform instead of std

* comments and typo only

* Use CreateGemmOperator and delete CreateMixedInputGemmOperator

* copyright for new files

* rebase follow-up
2023-09-27 11:18:30 -04:00
File                        Last commit                               Date
array.cu                    New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
bfloat16.cu                 New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
CMakeLists.txt              Support for Mixed Input TensorOp (#1084)  2023-09-27 11:18:30 -04:00
complex.cu                  New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
cpp11.cu                    CUTLASS 3.2.1 (#1113)                     2023-09-26 17:24:26 -04:00
fast_numeric_conversion.cu  Support for Mixed Input TensorOp (#1084)  2023-09-27 11:18:30 -04:00
float8.cu                   CUTLASS 3.2 (#1024)                       2023-08-07 20:50:32 -04:00
functional.cu               New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
half.cu                     New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
matrix_coord.cu             New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
matrix.cu                   New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
numeric_conversion.cu       Updates for 3.2 release (#1065)           2023-08-25 23:05:46 -04:00
predicate_vector.cu         New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
quaternion.cu               New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
tensor_ref.cu               New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
tensor_view.cu              New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
test_unit_core.cpp          New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00
tfloat32.cu                 New updates for 2.11 (#775)               2023-01-20 16:32:57 -05:00