Template Struct block_raking_layout

Nested Relationships

Nested Types

Struct Documentation

template<typename T, int BLOCK_THREADS, int ARCH = HIPCUB_ARCH>
struct hipcub::block_raking_layout

BlockRakingLayout provides a conflict-free shared memory layout abstraction for 1D raking across thread block data.

Overview

This type facilitates a shared memory usage pattern where a block of CUDA threads places elements into shared memory and then reduces the active parallelism to one “raking” warp of threads for serially aggregating consecutive sequences of shared items. Padding is inserted to eliminate bank conflicts (for most data types).
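The pattern described above can be modeled on the host. The following is a minimal sketch, not the hipCUB implementation: `rake_block_sum` is a hypothetical name, the warp size of 32 is an assumption (the real value depends on the target), and a plain sum stands in for whatever aggregation a caller performs. Padding is left out to keep the three phases visible.

```cpp
// Host-side model of the raking pattern (illustrative only).
// WARP_THREADS = 32 is an assumption; the real value depends on the target.
template <int BLOCK_THREADS, int WARP_THREADS = 32>
constexpr int rake_block_sum()
{
    // Assumed sizing: one shared element per thread, at most one warp rakes.
    constexpr int max_raking     = BLOCK_THREADS < WARP_THREADS ? BLOCK_THREADS : WARP_THREADS;
    constexpr int segment_length = (BLOCK_THREADS + max_raking - 1) / max_raking;
    constexpr int raking_threads = (BLOCK_THREADS + segment_length - 1) / segment_length;

    // Phase 1: every thread in the block places one item into "shared memory".
    int shared[BLOCK_THREADS] = {};
    for (int tid = 0; tid < BLOCK_THREADS; ++tid)
        shared[tid] = tid + 1;

    // Phase 2: each raking thread serially aggregates its own segment.
    int partial[raking_threads] = {};
    for (int r = 0; r < raking_threads; ++r)
    {
        for (int i = 0; i < segment_length; ++i)
        {
            const int idx = r * segment_length + i;
            if (idx < BLOCK_THREADS)          // guard the last, possibly short, segment
                partial[r] += shared[idx];
        }
    }

    // Phase 3: combine the per-thread partials (a warp-level step on a device).
    int total = 0;
    for (int r = 0; r < raking_threads; ++r)
        total += partial[r];
    return total;
}
```

With 62 threads this model uses 31 raking threads that each aggregate 2 items; with 128 threads it uses 32 raking threads that each aggregate 4 items.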

Template Parameters
  • T – The data type to be exchanged.

  • BLOCK_THREADS – The thread block size in threads.

  • ARCH – [optional] The target architecture. Defaults to HIPCUB_ARCH.

Public Types

enum [anonymous]

Values:

enumerator SHARED_ELEMENTS

The total number of elements that need to be cooperatively reduced.

enumerator MAX_RAKING_THREADS

Maximum number of warp-synchronous raking threads.

enumerator SEGMENT_LENGTH

Number of raking elements per warp-synchronous raking thread (rounded up).

enumerator RAKING_THREADS

Number of raking threads actually used. A raking thread that would have no valid data is never used (e.g., when BLOCK_THREADS is 62 and SEGMENT_LENGTH is 2, only 31 raking threads are used).

enumerator USE_SEGMENT_PADDING

Whether to pad each segment with one element. Padding is applied if the segment length is not relatively prime to the warp size and can’t be optimized as a vector load.

enumerator GRID_ELEMENTS

Total number of elements in the raking grid.

enumerator UNGUARDED

Whether raking can proceed without bounds checking. Bounds checking is needed when the number of reduction elements is not a multiple of the number of raking threads.
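To make the relationships between these enumerators concrete, here is a hedged sketch of how they could be derived from BLOCK_THREADS. `RakingLayoutSketch` is a hypothetical name, the warp size of 32 is an assumption, and the padding rule (even segment length greater than 2) is one plausible encoding of "not relatively prime to the warp size and not a vector load", not a confirmed formula.

```cpp
// Illustrative derivation of the layout constants; not the library's code.
// WARP_THREADS = 32 is an assumption and depends on the target.
template <int BLOCK_THREADS, int WARP_THREADS = 32>
struct RakingLayoutSketch
{
    enum
    {
        // Assumed: one shared element per thread in the block.
        SHARED_ELEMENTS = BLOCK_THREADS,

        // At most one warp of threads does the raking.
        MAX_RAKING_THREADS = (BLOCK_THREADS < WARP_THREADS) ? BLOCK_THREADS : WARP_THREADS,

        // Elements per raking thread, rounded up.
        SEGMENT_LENGTH = (SHARED_ELEMENTS + MAX_RAKING_THREADS - 1) / MAX_RAKING_THREADS,

        // Drop raking threads that would have no valid data.
        RAKING_THREADS = (SHARED_ELEMENTS + SEGMENT_LENGTH - 1) / SEGMENT_LENGTH,

        // Assumed padding rule: even segment lengths share a factor with a
        // power-of-two warp size; lengths of 2 or less are left unpadded.
        USE_SEGMENT_PADDING = ((SEGMENT_LENGTH & 1) == 0) && (SEGMENT_LENGTH > 2),

        // Grid size including one padding element per segment when enabled.
        GRID_ELEMENTS = RAKING_THREADS * (SEGMENT_LENGTH + USE_SEGMENT_PADDING),

        // Nonzero when no bounds check is needed while raking.
        UNGUARDED = (SHARED_ELEMENTS % RAKING_THREADS == 0)
    };
};
```

Under these assumptions, a block of 62 threads gives SEGMENT_LENGTH 2 and RAKING_THREADS 31, matching the example in the RAKING_THREADS description. A block of 128 threads gives SEGMENT_LENGTH 4, enables padding, and yields GRID_ELEMENTS 160.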

Public Static Functions

__device__ static inline T *PlacementPtr(TempStorage &temp_storage, unsigned int linear_tid)

Returns the location for the calling thread to place data into the grid.

__device__ static inline T *RakingPtr(TempStorage &temp_storage, unsigned int linear_tid)

Returns the location for the calling thread to begin sequential raking.
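The two functions above return pointers into the raking grid. The offsets behind such pointers might be computed as in the sketch below. The helper names are hypothetical, the offset formulas are assumptions based on the enumerator descriptions, and the bank model (32 banks, one element per bank word) is a simplification.

```cpp
// Illustrative offset arithmetic; helper names are hypothetical.

// Where thread `linear_tid` would place its item: skip one padding slot
// for every full segment that precedes it.
constexpr unsigned placement_offset(unsigned linear_tid,
                                    unsigned segment_length,
                                    bool     use_padding)
{
    return use_padding ? linear_tid + linear_tid / segment_length : linear_tid;
}

// Where raking thread `linear_tid` would begin its sequential pass:
// the start of its (possibly padded) segment.
constexpr unsigned raking_offset(unsigned linear_tid,
                                 unsigned segment_length,
                                 bool     use_padding)
{
    return linear_tid * (segment_length + (use_padding ? 1u : 0u));
}

// Checks whether the raking threads' starting offsets fall in distinct banks,
// assuming BANKS shared memory banks and one element per bank word.
template <unsigned BANKS>
constexpr bool raking_starts_conflict_free(unsigned raking_threads,
                                           unsigned segment_length,
                                           bool     use_padding)
{
    bool seen[BANKS] = {};
    for (unsigned r = 0; r < raking_threads; ++r)
    {
        const unsigned bank = raking_offset(r, segment_length, use_padding) % BANKS;
        if (seen[bank])
            return false;   // two raking threads start in the same bank
        seen[bank] = true;
    }
    return true;
}
```

In this model, 32 raking threads with a segment length of 4 start at offsets 0, 4, 8, and so on, which collide in the same banks. With one padding element per segment the stride becomes 5, which is relatively prime to 32, so all 32 starting offsets map to distinct banks.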

struct TempStorage : public hipcub::Uninitialized<_TempStorage>

Alias wrapper allowing storage to be unioned.

Public Types

enum [anonymous]

Values:

typedef UnitWord<_TempStorage>::DeviceWord DeviceWord

Biggest memory-access word that T is a whole multiple of and is not larger than the alignment of T.

Public Functions

__host__ __device__ __forceinline__ inline _TempStorage &Alias()

Returns a reference to the backing storage, reinterpreted as _TempStorage.

Public Members

DeviceWord storage[WORDS]

Backing storage.
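The reason for wrapping the storage this way can be shown with a small host-side sketch. It is an illustration under stated assumptions, not the hipCUB definition: `UninitializedSketch`, `ScanStorage`, `ReduceStorage`, and `SharedSketch` are hypothetical names, and `unsigned int` stands in for the word type. Because the wrapper holds only raw words, it stays trivially constructible even when the wrapped type is not, so several wrappers can share one union.

```cpp
#include <type_traits>

// Hypothetical stand-in for the documented wrapper: raw words sized to hold T.
template <typename T, typename Word = unsigned int>
struct UninitializedSketch
{
    static_assert(sizeof(T) % sizeof(Word) == 0, "T must be a whole multiple of Word");
    static_assert(alignof(T) % alignof(Word) == 0, "Word must not be more aligned than T");

    enum { WORDS = sizeof(T) / sizeof(Word) };

    // Backing storage; never constructed as a T.
    alignas(T) Word storage[WORDS];

    // Reinterpret the raw words as the wrapped type.
    T& Alias() { return reinterpret_cast<T&>(*this); }
};

// Two example storage types with user-provided (non-trivial) constructors.
struct ScanStorage   { int partials[64]; ScanStorage() {} };
struct ReduceStorage { int partials[32]; ReduceStorage() {} };

static_assert(!std::is_trivially_default_constructible<ScanStorage>::value,
              "the wrapped type is not trivially constructible");
static_assert(std::is_trivially_default_constructible<UninitializedSketch<ScanStorage>>::value,
              "the wrapper is trivially constructible");

// The wrappers can be unioned so that two phases reuse the same memory.
union SharedSketch
{
    UninitializedSketch<ScanStorage>   scan;
    UninitializedSketch<ReduceStorage> reduce;
};
```

The union occupies only as much space as its largest member, which is the point of unioning temporary storage between phases that do not overlap in time.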