typedef Shape<Base::Tile::kH / Base::Threads::kH / 4,
              4,
              Base::Tile::kW / Base::Threads::kW,
              Base::Tile::kC / Base::kAccessSize>
    Iterations;
return make_Coord(0, thread_offset_h, thread_offset_w, 0);
Computes the thread offset in (H, W) based on thread ID.
Definition: igemm_global_tile.h:77
Defines iterators for efficiently loading and storing to global memory.
Definition: gemm_global_tile.h:70
A Coord is a coordinate of arbitrary rank into a tensor or matrix.
CUTLASS_HOST_DEVICE Coord< 1 > make_Coord(int _0)
Helper to make a 1-element coordinate.
Definition: coord.h:241
Shape< Base::Threads::kH *4, 1, Base::Threads::kW, Base::kAccessSize > Delta
The strides in each dimension between different loads/stores.
Definition: igemm_global_tile.h:68
static int const kH
The height of the cube.
Definition: shape.h:68
GemmGlobalTileTraits< kOperand_, kLayout_, Scalar_, Tile_, Threads_, kAccessSize_ > Base
The base class.
Definition: igemm_global_tile.h:64
Shape< Base::Tile::kH/Base::Threads::kH/4, 4, Base::Tile::kW/Base::Threads::kW, Base::Tile::kC/Base::kAccessSize > Iterations
The number of iterations needed to load/store the tile.
Definition: igemm_global_tile.h:74
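
The Iterations and Delta shapes listed here are plain compile-time arithmetic on the tile and thread shapes, so they can be checked by hand. The following is a minimal sketch, not the library's code: the `Shape` template is a simplified stand-in for `cutlass::Shape`, and the `Tile`, `Threads`, and `kAccessSize` values are hypothetical numbers chosen only so that the divisions come out exact.

```cpp
#include <cassert>

// Simplified stand-in for cutlass::Shape<kD, kH, kW, kC>.
// This is an assumption for illustration, not the library's definition.
template <int kD_ = 1, int kH_ = 1, int kW_ = 1, int kC_ = 1>
struct Shape {
  static int const kD = kD_;
  static int const kH = kH_;
  static int const kW = kW_;
  static int const kC = kC_;
};

// Hypothetical inputs; not taken from any real CUTLASS configuration.
typedef Shape<1, 32, 128, 4> Tile;     // tile to load/store
typedef Shape<1, 8, 32, 1> Threads;    // arrangement of threads over the tile
static int const kAccessSize = 4;      // scalars per LDG/STG

// Same expressions as the Iterations and Delta typedefs in this listing.
typedef Shape<Tile::kH / Threads::kH / 4,
              4,
              Tile::kW / Threads::kW,
              Tile::kC / kAccessSize>
    Iterations;
typedef Shape<Threads::kH * 4, 1, Threads::kW, kAccessSize> Delta;

// Number of LDG/STG accesses each thread issues for one tile.
static int const kAccessesPerThread =
    Iterations::kD * Iterations::kH * Iterations::kW * Iterations::kC;

// Sanity check: accesses x scalars per access x threads covers the whole tile.
static_assert(kAccessesPerThread * kAccessSize *
                      (Threads::kD * Threads::kH * Threads::kW * Threads::kC) ==
                  Tile::kD * Tile::kH * Tile::kW * Tile::kC,
              "iterations must cover the tile exactly once");
```

With these assumed values, Iterations evaluates to (1, 4, 4, 1) and Delta to (32, 1, 32, 4), so each thread performs 16 accesses of 4 scalars each.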
#define CUTLASS_HOST_DEVICE
Definition: cutlass.h:46
Definition: igemm_global_tile.h:50
A Shape implementing Layout Concept describing the dimensions of a cube.
Definition: shape.h:64
static int const kW
The width of the cube.
Definition: shape.h:70
Kind
Definition: matrix_traits.h:36
static int const kAccessSize
The number of scalars per LDG/STG.
Definition: gemm_global_tile.h:80
Kind
Definition: matrix_traits.h:43
ReshapeThreads< Tile, Threads_ >::Threads Threads
The threads shape.
Definition: gemm_global_tile.h:87
Defines properties of matrices used to denote layout and operands to GEMM kernels.
Shape< 1, 4, Base::Tile::kC > ThreadsDelta
The threads strides.
Definition: igemm_global_tile.h:89
CUTLASS_HOST_DEVICE Coord< 4 > operator()() const
Definition: igemm_global_tile.h:79
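
The listing shows only the final `make_Coord(0, thread_offset_h, thread_offset_w, 0)` call, not how the two offsets are derived. As a rough sketch, one plausible mapping splits the linear thread ID into a row and a column of the thread arrangement and scales each by the matching ThreadsDelta stride. The formula, the `Coord4` struct, and the `thread_offset` function below are all assumptions made for illustration; they are not confirmed by this listing. The thread ID is passed as a parameter so the sketch runs on the host.

```cpp
#include <cassert>

// Hypothetical 4-D coordinate (D, H, W, C); stands in for cutlass::Coord<4>.
struct Coord4 {
  int d, h, w, c;
};

// Assumed mapping from a linear thread ID to an (H, W) offset:
//   row    = thread_id / threads_w, scaled by the H stride (delta_h)
//   column = thread_id % threads_w, scaled by the W stride (delta_w)
// The D and C components are left at 0, as in the listing's make_Coord call.
inline Coord4 thread_offset(int thread_id, int threads_w, int delta_h,
                            int delta_w) {
  int thread_offset_h = (thread_id / threads_w) * delta_h;
  int thread_offset_w = (thread_id % threads_w) * delta_w;
  Coord4 offset = {0, thread_offset_h, thread_offset_w, 0};
  return offset;
}
```

For example, with 32 threads along W and strides of 4 in both H and W (assumed values), thread 37 sits in row 1, column 5, so its offset is (0, 4, 20, 0).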
Base::Threads Threads
The threads.
Definition: igemm_global_tile.h:66