cutlass/include/cute/arch
Gregory Meyer (gregjm) fcfbd23e26
Fix host compilation of cute::cast_smem_ptr_to_uint. (#940)
* Remove references to device-only intrinsics when compiling for host.

Currently, we attempt to use the `__device__`-only functions
`__cvta_generic_to_shared` and `__nvvm_get_smem_pointer` when compiling
`cute::cast_smem_ptr_to_uint` for the host on Clang. This results in a
compilation error, as expected. This commit changes the definition of
the `*_ACTIVATED` macros so that they are only true when `__CUDA_ARCH__`
is defined; that is, when compiling for the device.

Additionally, the declaration of `__nvvm_get_smem_pointer` is currently
visible only during the device compilation pass when compiling with
NVCC. This commit annotates the declaration with `__device__` and makes
it visible during host compilation as well.

* Annotate cute::cast_smem_ptr_to_uint as device-only.

The implementation of `cute::cast_smem_ptr_to_uint` is currently an
unchecked failure in host code, and the only host implementation I can
think of -- casting a probably-64-bit pointer to 32 bits somehow --
doesn't make sense to implement. This commit marks this function as
device-only so that it can't be accidentally used in host code.

* small change
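
The macro gating described in the first bullet can be sketched roughly as follows. This is an illustrative sketch only, not the code from `util.hpp`: the names `SKETCH_CVTA_GENERIC_TO_SHARED_ACTIVATED`, `SKETCH_DEVICE`, and `sketch_cast_smem_ptr_to_uint` are hypothetical, and the fallback branch returning 0 is an assumption made so that the example also builds with a plain host C++ compiler.

```cpp
#include <cstdint>

// Hypothetical macro, modeled on the `*_ACTIVATED` pattern from the commit
// message: it is 1 only when __CUDA_ARCH__ is defined, i.e. only during the
// device compilation pass.
#if defined(__CUDACC__) && defined(__CUDA_ARCH__)
#  define SKETCH_CVTA_GENERIC_TO_SHARED_ACTIVATED 1
#else
#  define SKETCH_CVTA_GENERIC_TO_SHARED_ACTIVATED 0
#endif

// Hypothetical annotation macro: expands to __device__ under a CUDA compiler
// and to nothing otherwise, so this sketch also compiles as ordinary C++.
#if defined(__CUDACC__)
#  define SKETCH_DEVICE __device__
#else
#  define SKETCH_DEVICE
#endif

// Converts a generic pointer into shared memory to a 32-bit shared-space
// address. The device-only intrinsic is referenced only when the macro is
// active, so the host pass never sees it.
SKETCH_DEVICE inline std::uint32_t sketch_cast_smem_ptr_to_uint(void const* ptr) {
#if SKETCH_CVTA_GENERIC_TO_SHARED_ACTIVATED
  // Device pass: __cvta_generic_to_shared returns a size_t-wide address
  // that fits in 32 bits for shared memory.
  return static_cast<std::uint32_t>(__cvta_generic_to_shared(ptr));
#else
  // Host pass (assumed fallback for this sketch): no meaningful result.
  (void) ptr;
  return 0;
#endif
}
```

When compiled for the host, the macro evaluates to 0 and the intrinsic is never named, which is the property the commit relies on to avoid the Clang error.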

---------

Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2023-05-10 00:06:54 -04:00
cluster_sm90.hpp Updates for 3.1 (#932) 2023-04-29 09:34:27 -04:00
copy_sm75.hpp CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
copy_sm80.hpp CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
copy_sm90_desc.hpp Add missing checks for NVRTC in CuTe (#921) 2023-04-25 12:52:43 -04:00
copy_sm90_tma.hpp CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
copy_sm90.hpp CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
copy.hpp CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
mma_sm61.hpp CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
mma_sm70.hpp CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
mma_sm75.hpp CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
mma_sm80.hpp CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
mma_sm90_desc.hpp CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
mma_sm90_gmma.hpp CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
mma_sm90.hpp CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
mma.hpp CUTLASS 3.0.0 (#786) 2023-01-23 20:55:28 -05:00
util.hpp Fix host compilation of cute::cast_smem_ptr_to_uint. (#940) 2023-05-10 00:06:54 -04:00