cutlass

History

Mark Hoemmen add4ba622f Fix 8.4 + CUDA 11.4 build (#789)		2023-01-27 09:18:59 -05:00

Work around a likely GCC 8.x issue with fold expressions and generic lambdas. Only use the work-around when the host compiler is GCC 8.x. This avoids any concerns about the work-around possibly hindering inlining for a critical CuTe function (product).

Users can experiment with the work-around for other compilers or compiler versions by defining the following macro:

CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND

Fixes https://github.com/NVIDIA/cutlass/issues/788

Co-authored-by: Mark Hoemmen <mhoemmen@nvidia.com>
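The commit message above describes a compiler-gated work-around for fold expressions inside generic lambdas. The following is a minimal sketch of what such a switch could look like, not the code in int_tuple.hpp: the namespace `sketch`, the struct `ProductOf`, and the function `product_of` are hypothetical names chosen for illustration, and the GCC version check is an assumption about how the gating might be written. Only the macro name comes from the commit message.

```cpp
#include <tuple>
#include <utility>

// Assumption: enable the work-around automatically for GCC 8.x hosts only.
// Other compilers can opt in by defining the macro themselves.
#if !defined(CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND)
#  if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ == 8)
#    define CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND 1
#  endif
#endif

namespace sketch {  // hypothetical namespace, not part of CuTe

#if defined(CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND)

// Work-around path: a named function object replaces the generic lambda,
// so the fold expression no longer appears inside a lambda body.
struct ProductOf {
  template <class... Ts>
  constexpr auto operator()(Ts const&... ts) const {
    return (ts * ... * 1);  // binary fold; an empty pack yields 1
  }
};

template <class Tuple>
constexpr auto product_of(Tuple const& t) {
  return std::apply(ProductOf{}, t);
}

#else

// Default path: a fold expression inside a generic lambda.
template <class Tuple>
constexpr auto product_of(Tuple const& t) {
  return std::apply([](auto const&... ts) { return (ts * ... * 1); }, t);
}

#endif

}  // namespace sketch
```

Both paths compute the same product, so `sketch::product_of(std::make_tuple(2, 3, 4))` gives 24 either way. To try the work-around path on another compiler, the macro can be defined on the command line, for example with `-DCUTE_FOLD_GENERIC_LAMBDA_WORKAROUND`.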
..
algorithm	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
arch	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
atom	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
container	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
numeric	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
util	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
config.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
int_tuple.hpp	Fix 8.4 + CUDA 11.4 build (#789)	2023-01-27 09:18:59 -05:00
layout.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
pointer.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
stride.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
swizzle_layout.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
swizzle_ptr.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
swizzle.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
tensor_predicate.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
tensor.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
tile.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00
underscore.hpp	CUTLASS 3.0.0 (#786)	2023-01-23 20:55:28 -05:00