Template Class BlockRadixRank
Defined in File block_radix_rank.hpp
Nested Relationships
Nested Types
Class Documentation
BlockRadixRank provides operations for ranking unsigned integer types within a CUDA thread block.
- Overview
Blah…
Keys must be in a form suitable for radix ranking (i.e., unsigned bits).
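As an illustration of what "unsigned bits" means here, the following host-side sketch shows one common way to map signed integers and floats to unsigned bit patterns whose unsigned order matches the original numeric order. The `ToRadixBits` helpers are hypothetical names introduced for this example and are not part of hipCUB; the library may perform this conversion differently.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a hipCUB function): map a signed 32-bit integer to
// unsigned bits whose unsigned order matches the signed order.
inline std::uint32_t ToRadixBits(std::int32_t key)
{
    // Flipping the sign bit places negative values below non-negative ones.
    return static_cast<std::uint32_t>(key) ^ 0x80000000u;
}

// Hypothetical helper (not a hipCUB function): the same idea for IEEE-754
// binary32 floats. NaN ordering is not addressed by this sketch.
inline std::uint32_t ToRadixBits(float key)
{
    std::uint32_t bits;
    std::memcpy(&bits, &key, sizeof(bits));  // reinterpret without undefined behavior

    // Negative floats: flip every bit so larger magnitudes sort lower.
    // Non-negative floats: flip only the sign bit.
    const std::uint32_t mask = (bits & 0x80000000u) ? 0xFFFFFFFFu : 0x80000000u;
    return bits ^ mask;
}
```

After this transformation, comparing the returned values as unsigned integers gives the same order as comparing the original keys, which is the property radix ranking relies on.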
- Performance Considerations
- Examples
Example 1: Simple radix rank of 32-bit integer keys
#include <hipcub/hipcub.hpp>
template <int BLOCK_THREADS>
__global__ void ExampleKernel(...)
{
- Template Parameters
BLOCK_DIM_X – The thread block length in threads along the X dimension
RADIX_BITS – The number of radix bits per digit place
IS_DESCENDING – Whether or not the sorted-order is high-to-low
MEMOIZE_OUTER_SCAN – [optional] Whether or not to buffer outer raking scan partials to incur fewer shared memory reads at the expense of higher register pressure (default: true for architectures SM35 and newer, false otherwise). See BlockScanAlgorithm::BLOCK_SCAN_RAKING_MEMOIZE for more details.
INNER_SCAN_ALGORITHM – [optional] The hipcub::BlockScanAlgorithm algorithm to use (default: hipcub::BLOCK_SCAN_WARP_SCANS)
SMEM_CONFIG – [optional] Shared memory bank mode (default: hipSharedMemBankSizeFourByte)
BLOCK_DIM_Y – [optional] The thread block length in threads along the Y dimension (default: 1)
BLOCK_DIM_Z – [optional] The thread block length in threads along the Z dimension (default: 1)
ARCH – [optional]
Collective constructors
-
__device__ inline BlockRadixRank()
Collective constructor using a private static allocation of shared memory as temporary storage.
-
__device__ inline BlockRadixRank(TempStorage &temp_storage)
Collective constructor using the specified memory allocation as temporary storage.
- Parameters
temp_storage – Reference to memory allocation having layout type TempStorage
Raking
-
template<typename UnsignedBits, int KEYS_PER_THREAD, typename DigitExtractorT>
__device__ inline void RankKeys(UnsignedBits (&keys)[KEYS_PER_THREAD], int (&ranks)[KEYS_PER_THREAD], DigitExtractorT digit_extractor)
Rank keys.
- Parameters
keys – Keys for this tile
ranks – For each key, the local rank within the tile
digit_extractor – The digit extractor
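To make the meaning of `ranks` concrete, here is a minimal host-side reference model of ranking one tile by a single digit. It does not call hipCUB. `HostDigitExtractor` and `ReferenceRankKeys` are hypothetical names introduced for this sketch, and it assumes ascending order with ties broken by position in the input (a stable rank), which may not match every configuration of the device implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side digit extractor: returns `num_bits` bits of the key
// starting at bit `bit_start`. Assumes 1 <= num_bits <= 31 and bit_start < 32.
struct HostDigitExtractor
{
    unsigned bit_start;
    unsigned num_bits;

    std::uint32_t Digit(std::uint32_t key) const
    {
        return (key >> bit_start) & ((1u << num_bits) - 1u);
    }
};

// Reference ranking of one tile (assumed semantics):
// rank = (number of keys with a smaller digit)
//      + (number of earlier keys with the same digit).
inline std::vector<int> ReferenceRankKeys(const std::vector<std::uint32_t>& keys,
                                          const HostDigitExtractor& extractor)
{
    const std::size_t radix_digits = std::size_t{1} << extractor.num_bits;

    // Histogram of digit values in the tile.
    std::vector<int> counts(radix_digits, 0);
    for (std::uint32_t key : keys)
        ++counts[extractor.Digit(key)];

    // Exclusive prefix sum: starting rank for each digit value.
    std::vector<int> offsets(radix_digits, 0);
    int running = 0;
    for (std::size_t d = 0; d < radix_digits; ++d)
    {
        offsets[d] = running;
        running += counts[d];
    }

    // Assign ranks in input order so equal digits keep their relative order.
    std::vector<int> ranks(keys.size());
    for (std::size_t i = 0; i < keys.size(); ++i)
        ranks[i] = offsets[extractor.Digit(keys[i])]++;
    return ranks;
}
```

For example, ranking the keys {3, 1, 2, 1, 0} on their lowest two bits yields the ranks {4, 1, 3, 2, 0}: scattering each key to its rank produces the tile ordered by that digit.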
-
template<typename UnsignedBits, int KEYS_PER_THREAD, typename DigitExtractorT>
__device__ inline void RankKeys(UnsignedBits (&keys)[KEYS_PER_THREAD], int (&ranks)[KEYS_PER_THREAD], DigitExtractorT digit_extractor, int (&exclusive_digit_prefix)[BINS_TRACKED_PER_THREAD])
Rank keys. For the lower RADIX_DIGITS threads, digit counts for each digit are provided for the corresponding thread.
- Parameters
keys – Keys for this tile
ranks – For each key, the local rank within the tile (out parameter)
digit_extractor – The digit extractor
exclusive_digit_prefix – The exclusive prefix sum for the digits [(threadIdx.x * BINS_TRACKED_PER_THREAD) … (threadIdx.x * BINS_TRACKED_PER_THREAD) + BINS_TRACKED_PER_THREAD - 1]
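The bin range in the `exclusive_digit_prefix` description can be sketched on the host as follows. This is a reference model only and does not call hipCUB. The function names are hypothetical, and the formula for the number of bins per thread (the ceiling of radix digits over block threads, at least 1) is an assumption not stated in this section.

```cpp
#include <algorithm>
#include <vector>

// Assumed relationship (not stated in this section): each thread tracks
// max(1, ceil(radix_digits / block_threads)) digit bins.
inline int BinsTrackedPerThread(int radix_digits, int block_threads)
{
    return std::max(1, (radix_digits + block_threads - 1) / block_threads);
}

// Host-side reference: exclusive prefix sum over the per-digit counts of a
// tile. `digits` holds the already extracted digit of every key, each in the
// range [0, radix_digits).
inline std::vector<int> ExclusiveDigitPrefix(const std::vector<int>& digits,
                                             int radix_digits)
{
    std::vector<int> prefix(radix_digits, 0);
    for (int d : digits)
        ++prefix[d];  // first pass: per-digit counts

    int running = 0;
    for (int& p : prefix)
    {
        const int count = p;
        p = running;  // replace the count with the sum of all smaller digits
        running += count;
    }
    return prefix;
}

// First digit bin reported to a thread, following the parameter description:
// threadIdx.x * BINS_TRACKED_PER_THREAD.
inline int FirstBinForThread(int thread_id, int bins_tracked_per_thread)
{
    return thread_id * bins_tracked_per_thread;
}
```

Under these assumptions, a block of 64 threads with 256 radix digits would track 4 bins per thread, so thread 3 would receive the prefix values for bins 12 through 15.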
Public Types
-
struct TempStorage : public hipcub::Uninitialized<_TempStorage>
-
Public Types
-
enum [anonymous]
Values:
-
typedef UnitWord<_TempStorage>::DeviceWord DeviceWord
Biggest memory-access word that T is a whole multiple of and is not larger than the alignment of T.
Public Functions
-
__host__ __device__ __forceinline__ inline _TempStorage &Alias()
Alias.
Public Members
-
DeviceWord storage[WORDS]
Backing storage.
-
enum [anonymous]