cutlass/include/cute
Mark Hoemmen add4ba622f
Fix 8.4 + CUDA 11.4 build (#789)
Work around a likely GCC 8.x issue with fold expressions
and generic lambdas.

Only use the work-around when the host compiler is GCC 8.x.
This avoids any concerns about the work-around possibly
hindering inlining for a critical CuTe function (product).

Users can experiment with the work-around for other compilers
or compiler versions by defining the following macro.

CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND
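
A minimal sketch of how such a macro-gated work-around might look. It is
illustrative only and is not the actual CuTe source: the namespace `sketch`,
the struct `ProductFold`, and the exact compiler check are assumptions. Only
the macro name comes from the commit message. The idea is to move the fold
expression out of a generic lambda and into a named function object when the
macro is defined.

```cpp
// Illustrative sketch only; NOT the CuTe implementation.
// Every name except CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND is hypothetical.
#include <tuple>

// Assumption: turn the work-around on automatically for GCC 8.x hosts,
// unless the user has already defined the macro themselves.
#if !defined(CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND) && \
    defined(__GNUC__) && !defined(__clang__) && (__GNUC__ == 8)
#  define CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND 1
#endif

namespace sketch {

#if defined(CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND)
// Work-around path: the fold expression lives in a named function object,
// so no generic lambda contains a fold expression.
struct ProductFold {
  template <class... Ts>
  constexpr auto operator()(Ts const&... ts) const {
    return (ts * ... * 1);  // binary right fold; an empty pack yields 1
  }
};
#endif

// Multiplies all elements of a tuple (a stand-in for a "product" function).
template <class... Ts>
constexpr auto product(std::tuple<Ts...> const& t) {
#if defined(CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND)
  return std::apply(ProductFold{}, t);
#else
  // Default path: fold expression inside a generic lambda.
  return std::apply([](auto const&... ts) { return (ts * ... * 1); }, t);
#endif
}

}  // namespace sketch
```

To try the work-around path with another compiler, the macro can be defined
on the command line (for example with `-DCUTE_FOLD_GENERIC_LAMBDA_WORKAROUND`).
Both paths should give the same result, e.g.
`sketch::product(std::make_tuple(2, 3, 4))` evaluates to 24.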

Fixes https://github.com/NVIDIA/cutlass/issues/788

Co-authored-by: Mark Hoemmen <mhoemmen@nvidia.com>
2023-01-27 09:18:59 -05:00
Name                  Last commit                       Date
algorithm             CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
arch                  CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
atom                  CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
container             CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
numeric               CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
util                  CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
config.hpp            CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
int_tuple.hpp         Fix 8.4 + CUDA 11.4 build (#789)  2023-01-27 09:18:59 -05:00
layout.hpp            CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
pointer.hpp           CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
stride.hpp            CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
swizzle_layout.hpp    CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
swizzle_ptr.hpp       CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
swizzle.hpp           CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
tensor_predicate.hpp  CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
tensor.hpp            CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
tile.hpp              CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00
underscore.hpp        CUTLASS 3.0.0 (#786)              2023-01-23 20:55:28 -05:00