cutlass

squall/cutlass

Commit Graph

Author	SHA1	Message	Date
Alexander Pivovarov	7e370c9637	Fix typos 2 (#842) Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>	2023-03-09 23:22:56 -05:00
Mark Hoemmen	add4ba622f	Fix 8.4 + CUDA 11.4 build (#789) Work around a likely GCC 8.x issue with fold expressions and generic lambdas. Only use the work-around when the host compiler is GCC 8.x. This avoids any concerns about the work-around possibly hindering inlining for a critical CuTe function (product). Users can experiment with the work-around for other compilers or compiler versions by defining the following macro: CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND. Fixes https://github.com/NVIDIA/cutlass/issues/788 Co-authored-by: Mark Hoemmen <mhoemmen@nvidia.com>	2023-01-27 09:18:59 -05:00
Vijay Thakkar	277bd6e537	CUTLASS 3.0.0 (#786) * CUTLASS 3.0.0	2023-01-23 20:55:28 -05:00

53 Commits
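
Commit add4ba622f describes a compiler-gated work-around for fold expressions inside generic lambdas, but this page does not show the change itself. The sketch below illustrates one plausible shape of such a work-around and is not the CuTe implementation. The macro SKETCH_FOLD_WORKAROUND, the struct ProductFn, and this product function are hypothetical names invented for the illustration; only the GCC 8.x gating idea comes from the commit message.

```cpp
#include <tuple>

// Hypothetical macro for this sketch only. The commit's real macro is
// CUTE_FOLD_GENERIC_LAMBDA_WORKAROUND; how CuTe tests it is not shown here.
// Enabled automatically for GCC 8.x, or manually by defining it beforehand.
#if !defined(SKETCH_FOLD_WORKAROUND)
#  if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ == 8)
#    define SKETCH_FOLD_WORKAROUND 1
#  endif
#endif

#if defined(SKETCH_FOLD_WORKAROUND)
// Work-around path: a named function object takes the place of the
// generic lambda, so the fold expression lives in an ordinary template.
struct ProductFn {
  template <class... Ts>
  constexpr auto operator()(Ts const&... ts) const {
    return (ts * ... * 1);  // binary fold; an empty pack yields 1
  }
};
#endif

// Multiplies all elements of a tuple (a stand-in for a "product" utility).
template <class... Ts>
constexpr auto product(std::tuple<Ts...> const& t) {
#if defined(SKETCH_FOLD_WORKAROUND)
  return std::apply(ProductFn{}, t);
#else
  // Default path: fold expression inside a generic lambda.
  return std::apply([](auto const&... ts) { return (ts * ... * 1); }, t);
#endif
}
```

Both paths compute the same value, so product(std::make_tuple(2, 3, 4)) gives 24 either way. Compiling with -DSKETCH_FOLD_WORKAROUND forces the work-around path on any compiler, mirroring the commit's note that users can opt in by defining a macro.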