Template Class BlockShuffle

Inheritance Relationships

Base Type

  • public rocprim::block_shuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z >

Class Documentation

template<typename T, int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = HIPCUB_ARCH>
class hipcub::BlockShuffle : public rocprim::block_shuffle<T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z>

Public Types

using TempStorage = typename base_type::storage_type

Public Functions

__device__ inline BlockShuffle()
__device__ inline BlockShuffle(TempStorage &temp_storage)
Parameters

temp_storage – Reference to memory allocation having layout type TempStorage

__device__ inline void Offset(T input, T &output, int distance = 1)

Each thread i obtains the input provided by thread i+distance. The offset distance may be negative.

Parameters
  • input – The input item from the calling thread (thread i)

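  • output – The input item from the successor (or predecessor) thread i+distance (may be aliased to input). This value is only updated for thread i when 0 <= (i + distance) < BLOCK_THREADS, that is, when thread i+distance exists in the block. BLOCK_THREADS is the total thread count, BLOCK_DIM_X * BLOCK_DIM_Y * BLOCK_DIM_Z.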

  • distance – Offset distance (may be negative)
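The Offset semantics described above can be pictured with a small host-side model. This is only a sketch in plain C++, not the device implementation: model_offset is a hypothetical helper that does not exist in hipCUB, each vector element stands for one thread's value, and the block size is taken from the vector length. It assumes a thread's output is updated exactly when thread i+distance exists in the block.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side model of Offset (not part of hipCUB).
// inputs[i]  : the value thread i passes as `input`
// outputs[i] : the value thread i's `output` holds before the call
// Returns the value each thread's `output` holds after the call.
template <typename T>
std::vector<T> model_offset(const std::vector<T>& inputs,
                            std::vector<T> outputs,
                            int distance = 1)
{
    assert(inputs.size() == outputs.size());
    const int block_threads = static_cast<int>(inputs.size());
    for (int i = 0; i < block_threads; ++i)
    {
        const int src = i + distance;
        // Assumption: only threads whose source index names a real thread
        // are updated; all others keep their previous output value.
        if (src >= 0 && src < block_threads)
        {
            outputs[i] = inputs[src];
        }
    }
    return outputs;
}
```

Under these assumptions, inputs {10, 20, 30, 40} with every output initialized to -1 give {20, 30, 40, -1} for distance = 1 and {-1, -1, 10, 20} for distance = -2.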

__device__ inline void Rotate(T input, T &output, unsigned int distance = 1)

Each thread i obtains the input provided by thread i+distance, with the thread index wrapping around modulo BLOCK_THREADS.

Parameters
  • input – The calling thread’s input item

  • output – The input item from thread (i + distance) % BLOCK_THREADS (may be aliased to input). This value is not updated for thread BLOCK_THREADS-1.

  • distance – Offset distance (0 < distance < BLOCK_THREADS)
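The wrap-around indexing used by Rotate can be sketched on the host as follows. model_rotate is a hypothetical helper written for illustration, not a hipCUB function, and the block size is taken from the vector length. It assumes the modular index is applied to every thread, so the exception noted above for the last thread is not modeled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side model of Rotate (not part of hipCUB).
// inputs[i] is the value thread i passes as `input`; the result holds
// the value each thread would read back in `output`.
template <typename T>
std::vector<T> model_rotate(const std::vector<T>& inputs,
                            unsigned int distance = 1)
{
    const unsigned int block_threads = static_cast<unsigned int>(inputs.size());
    // Documented precondition: 0 < distance < BLOCK_THREADS.
    assert(distance > 0 && distance < block_threads);

    std::vector<T> outputs(inputs.size());
    for (unsigned int i = 0; i < block_threads; ++i)
    {
        // Thread i reads from thread (i + distance) % BLOCK_THREADS.
        outputs[i] = inputs[(i + distance) % block_threads];
    }
    return outputs;
}
```

With inputs {10, 20, 30, 40}, this model returns {20, 30, 40, 10} for distance = 1 and {40, 10, 20, 30} for distance = 3.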

template<int ITEMS_PER_THREAD>
__device__ inline void Up(T (&input)[ITEMS_PER_THREAD], T (&prev)[ITEMS_PER_THREAD])

The thread block rotates its blocked arrangement of input items, shifting it up by one item.

Parameters
  • input – The calling thread’s input items

  • prev – The corresponding predecessor items (may be aliased to input). The item prev[0] is not updated for thread 0.

template<int ITEMS_PER_THREAD>
__device__ inline void Up(T (&input)[ITEMS_PER_THREAD], T (&prev)[ITEMS_PER_THREAD], T &block_suffix)

The thread block rotates its blocked arrangement of input items, shifting it up by one item. All threads receive the input provided by thread BLOCK_THREADS-1.

Parameters
  • input – The calling thread’s input items

  • prev – The corresponding predecessor items (may be aliased to input). The item prev[0] is not updated for thread 0.

  • block_suffix – The item input[ITEMS_PER_THREAD-1] from thread BLOCK_THREADS-1, provided to all threads
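Both Up overloads can be illustrated with one host-side model of a blocked arrangement, in which thread t owns ITEMS_PER_THREAD consecutive items. This is a sketch only: ModelUpResult and model_up are hypothetical names, not hipCUB types or functions, and std::array stands in for the per-thread T[ITEMS_PER_THREAD] arrays. The model assumes block_suffix is the last item of the last thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for the host-side model (not part of hipCUB).
template <typename T, std::size_t ITEMS_PER_THREAD>
struct ModelUpResult
{
    std::vector<std::array<T, ITEMS_PER_THREAD>> prev; // prev[t] is thread t's `prev`
    T block_suffix;                                     // value seen by every thread
};

// input[t] is thread t's `input`; prev[t] is thread t's `prev` before the call.
template <typename T, std::size_t ITEMS_PER_THREAD>
ModelUpResult<T, ITEMS_PER_THREAD>
model_up(const std::vector<std::array<T, ITEMS_PER_THREAD>>& input,
         std::vector<std::array<T, ITEMS_PER_THREAD>> prev)
{
    static_assert(ITEMS_PER_THREAD > 0, "each thread needs at least one item");
    assert(!input.empty() && input.size() == prev.size());

    for (std::size_t t = 0; t < input.size(); ++t)
    {
        // Every item after the first comes from the same thread's preceding slot.
        for (std::size_t k = ITEMS_PER_THREAD - 1; k > 0; --k)
        {
            prev[t][k] = input[t][k - 1];
        }
        // The first item comes from the last slot of the preceding thread.
        // Thread 0 has no predecessor, so its prev[0] keeps its old value.
        if (t > 0)
        {
            prev[t][0] = input[t - 1][ITEMS_PER_THREAD - 1];
        }
    }
    // Assumption: block_suffix is the last item held by the last thread.
    return {prev, input.back()[ITEMS_PER_THREAD - 1]};
}
```

For three threads holding {1, 2}, {3, 4}, {5, 6} and prev initialized to -1, the model yields prev = {-1, 1}, {2, 3}, {4, 5} and block_suffix = 6.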

template<int ITEMS_PER_THREAD>
__device__ inline void Down(T (&input)[ITEMS_PER_THREAD], T (&next)[ITEMS_PER_THREAD])

The thread block rotates its blocked arrangement of input items, shifting it down by one item.

Parameters
  • input – The calling thread’s input items

  • next – The corresponding successor items (may be aliased to input). The item next[ITEMS_PER_THREAD-1] is not updated for thread BLOCK_THREADS-1.

template<int ITEMS_PER_THREAD>
__device__ inline void Down(T (&input)[ITEMS_PER_THREAD], T (&next)[ITEMS_PER_THREAD], T &block_prefix)

The thread block rotates its blocked arrangement of input items, shifting it down by one item. All threads receive input[0] provided by thread 0.

Parameters
  • input – The calling thread’s input items

  • next – The corresponding successor items (may be aliased to input). The item next[ITEMS_PER_THREAD-1] is not updated for thread BLOCK_THREADS-1.

  • block_prefix – The item input[0] from thread 0, provided to all threads
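The Down overloads mirror Up, and the same kind of host-side model applies. ModelDownResult and model_down are hypothetical names used only for this sketch and are not part of hipCUB. The model assumes each item is replaced by its successor in the blocked arrangement, so the only slot left untouched is the last item of the last thread, and that block_prefix is the first item of thread 0.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for the host-side model (not part of hipCUB).
template <typename T, std::size_t ITEMS_PER_THREAD>
struct ModelDownResult
{
    std::vector<std::array<T, ITEMS_PER_THREAD>> next; // next[t] is thread t's `next`
    T block_prefix;                                     // value seen by every thread
};

// input[t] is thread t's `input`; next[t] is thread t's `next` before the call.
template <typename T, std::size_t ITEMS_PER_THREAD>
ModelDownResult<T, ITEMS_PER_THREAD>
model_down(const std::vector<std::array<T, ITEMS_PER_THREAD>>& input,
           std::vector<std::array<T, ITEMS_PER_THREAD>> next)
{
    static_assert(ITEMS_PER_THREAD > 0, "each thread needs at least one item");
    assert(!input.empty() && input.size() == next.size());

    for (std::size_t t = 0; t < input.size(); ++t)
    {
        // Every item except the last comes from the same thread's following slot.
        for (std::size_t k = 0; k + 1 < ITEMS_PER_THREAD; ++k)
        {
            next[t][k] = input[t][k + 1];
        }
        // The last item comes from the first slot of the following thread.
        // The last thread has no successor, so that slot keeps its old value.
        if (t + 1 < input.size())
        {
            next[t][ITEMS_PER_THREAD - 1] = input[t + 1][0];
        }
    }
    // Assumption: block_prefix is the first item held by thread 0.
    return {next, input.front()[0]};
}
```

For three threads holding {1, 2}, {3, 4}, {5, 6} and next initialized to -1, the model yields next = {2, 3}, {4, 5}, {6, -1} and block_prefix = 1.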