cutlass/include/cutlass
Tri Dao 323c8170bf
Support ComputeFn where output type differs from input type (#1771)
This is useful, e.g., for a function that takes two float inputs and turns them into a complex value.
2024-09-05 23:25:03 -04:00
arch CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
conv 1x1x1 cluster launch (#1673) 2024-08-01 12:20:28 -04:00
detail Fix isnan namespace qualification in cutlass/functional.h (#1679) 2024-08-05 14:28:13 -04:00
epilogue Support ComputeFn where output type differs from input type (#1771) 2024-09-05 23:25:03 -04:00
gemm Add support for mixed 4-bit/8-bit data types GEMM (#1413) 2024-08-29 23:11:06 -04:00
layout CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
pipeline CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
platform CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
reduction CUTLASS 3.5.0 (#1411) 2024-03-19 17:51:04 -04:00
thread Update license year (#1306) 2024-01-16 14:37:22 -05:00
transform CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
aligned_buffer.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
array_planar_complex.h Updates for CUTLASS 3.5.0 (#1468) 2024-04-11 21:33:40 -04:00
array_subbyte.h Updates for CUTLASS 3.5.0 (#1468) 2024-04-11 21:33:40 -04:00
array.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
barrier.h Update barrier.h (#1782) 2024-09-04 14:52:11 -04:00
bfloat16.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
blas3_types.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
blas3.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
block_striped.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
cluster_launch.hpp CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
complex.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
constants.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
coord.h Updates for CUTLASS 3.5.0 (#1468) 2024-04-11 21:33:40 -04:00
core_io.h Updates for CUTLASS 3.5.0 (#1468) 2024-04-11 21:33:40 -04:00
cuda_host_adapter.hpp Use CUDA runtime API to retrieve function pointer to driver API (#1700) 2024-08-19 13:26:09 -04:00
cutlass.h Updates for CUTLASS 3.5.0 (#1468) 2024-04-11 21:33:40 -04:00
device_kernel.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
fast_math.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
float8.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
floating_point_nvrtc.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
functional.h Fix isnan namespace qualification in cutlass/functional.h (#1679) 2024-08-05 14:28:13 -04:00
gemm_coord.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
gemm_coord.hpp Update license year (#1306) 2024-01-16 14:37:22 -05:00
half.h Update half.h (#1709) 2024-08-14 14:59:59 -04:00
integer_subbyte.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
kernel_hardware_info.h Updates for CUTLASS 3.5.0 (#1468) 2024-04-11 21:33:40 -04:00
kernel_hardware_info.hpp Update license year (#1306) 2024-01-16 14:37:22 -05:00
kernel_launch.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
matrix_coord.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
matrix_shape.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
matrix.h set_slice3x3 -> set_slice_3x3 (#1784) 2024-09-05 23:24:10 -04:00
numeric_conversion.h Add support for mixed 4-bit/8-bit data types GEMM (#1413) 2024-08-29 23:11:06 -04:00
numeric_size.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
numeric_types.h Updates for CUTLASS 3.5.0 (#1468) 2024-04-11 21:33:40 -04:00
pitch_linear_coord.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
predicate_vector.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
quaternion.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
real.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
relatively_equal.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
semaphore.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
subbyte_reference.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
tensor_coord.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
tensor_ref_planar_complex.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
tensor_ref.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
tensor_view_planar_complex.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
tensor_view.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
tfloat32.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
trace.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
uint128.h fix uint128 2024-08-15 21:06:01 -07:00
version.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00
wmma_array.h Update license year (#1306) 2024-01-16 14:37:22 -05:00
workspace.h CUTLASS 3.5.1 (#1623) 2024-07-29 08:46:24 -04:00