Fix cuTE compilation with clang (#939)
- clang 1.14 complains about missing function from a host call:
  cutlass/include/cute/arch/util.hpp:106:32: error: no matching function for call to '__cvta_generic_to_shared'
  return static_cast<uint32_t>(__cvta_generic_to_shared(ptr));
- fixes this by defining CUTE_HOST_DEVICE for clang as well

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
parent 7c04f95415
commit 24c8b7d8a2
@@ -30,7 +30,7 @@
 **************************************************************************************************/
 #pragma once
 
-#if defined(__CUDA_ARCH__) || defined(_NVHPC_CUDA)
+#if defined(__CUDA_ARCH__) || defined(_NVHPC_CUDA) || defined(__clang__)
 # define CUTE_HOST_DEVICE __forceinline__ __host__ __device__
 # define CUTE_DEVICE __forceinline__ __device__
 # define CUTE_HOST __forceinline__ __host__
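
Note: the change above extends a compiler-detection guard so that the CUDA function attributes are also applied when clang is the compiler. The following is a minimal sketch of that guard pattern, not the CuTe implementation. The names `SKETCH_HOST_DEVICE`, `SKETCH_CUDA_ENABLED`, and `low_bits` are hypothetical, and the guard condition is deliberately different from the one in the commit: it tests `__CUDACC__` and clang's `__CUDA__` macro so that the same file also builds as plain host C++.

```cpp
#include <cstdint>

// Hypothetical macro names; only the shape mirrors the CUTE_* macros in the diff.
#if defined(__CUDACC__) || (defined(__clang__) && defined(__CUDA__))
// A CUDA-aware compilation is active, so the CUDA attributes are understood.
#  define SKETCH_HOST_DEVICE __forceinline__ __host__ __device__
#  define SKETCH_CUDA_ENABLED 1
#else
// Plain host C++: fall back to an ordinary inline function.
#  define SKETCH_HOST_DEVICE inline
#  define SKETCH_CUDA_ENABLED 0
#endif

// Callable from host and device code in a CUDA build, host-only otherwise.
// Illustrative helper: keeps the low 32 bits of a 64-bit address value.
SKETCH_HOST_DEVICE std::uint32_t low_bits(std::uint64_t address) {
  return static_cast<std::uint32_t>(address & 0xFFFFFFFFull);
}
```

When the guard does not match, a function marked with the macro is compiled as host-only, and a call from it to a device-only function is the kind of call a strict compiler may reject, which is consistent with the error quoted in the commit message.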