Struct CachingDeviceAllocator

Nested Relationships

Nested Types

  • struct CachingDeviceAllocator::BlockDescriptor

  • class CachingDeviceAllocator::TotalBytes

Inheritance Relationships

Base Type

  • public CachingDeviceAllocator

Struct Documentation

struct hipcub::CachingDeviceAllocator : public CachingDeviceAllocator

Public Types

typedef bool (*Compare)(const BlockDescriptor&, const BlockDescriptor&)

BlockDescriptor comparator function interface.

typedef std::multiset<BlockDescriptor, Compare> CachedBlocks

Set type for cached blocks (ordered by size)

typedef std::multiset<BlockDescriptor, Compare> BusyBlocks

Set type for live blocks (ordered by ptr)
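To illustrate how a multiset keyed by a function-pointer comparator can serve size-ordered lookups, here is a minimal standalone sketch. `Block`, `size_compare`, `best_fit`, and `make_example_cache` are hypothetical names, and the device-then-size ordering is an assumption about the comparator rather than something this page states.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for BlockDescriptor; only the fields used here.
struct Block {
    int         device;
    std::size_t bytes;
};

// Same shape as the Compare typedef: a plain function pointer.
typedef bool (*Compare)(const Block&, const Block&);

// Assumed ordering: by device first, then by size.
inline bool size_compare(const Block& a, const Block& b) {
    if (a.device != b.device) return a.device < b.device;
    return a.bytes < b.bytes;
}

typedef std::multiset<Block, Compare> CachedBlocks;

// Returns the size of the smallest cached block on `device` that holds at
// least `bytes`, or 0 if no such block is cached.
inline std::size_t best_fit(const CachedBlocks& cache, int device, std::size_t bytes) {
    assert(bytes > 0);
    Block key = {device, bytes};
    CachedBlocks::const_iterator it = cache.lower_bound(key);
    if (it == cache.end() || it->device != device) return 0;
    return it->bytes;
}

// Builds a small cache. The comparator is passed to the constructor because
// a default-constructed set would hold a null function pointer.
inline CachedBlocks make_example_cache() {
    CachedBlocks cache(size_compare);
    Block a = {0, 512};
    Block b = {0, 4096};
    Block c = {0, 4096};   // duplicates are allowed in a multiset
    Block d = {1, 32768};
    cache.insert(a);
    cache.insert(b);
    cache.insert(c);
    cache.insert(d);
    return cache;
}
```

Because the set is ordered, `lower_bound` finds a candidate in logarithmic time. The real allocator may apply further criteria (bin, stream readiness) that this sketch leaves out.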

typedef std::map<int, TotalBytes> GpuCachedBytes

Map type of device ordinals to the number of bytes cached by each device.

Public Functions

inline void NearestPowerOf(unsigned int &power, size_t &rounded_bytes, unsigned int base, size_t value)

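Rounds value up to the nearest power of base. On return, power holds the exponent and rounded_bytes holds the rounded result (base ^ power).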
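The rounding can be sketched in standalone C++ as follows. The function name `nearest_power_of` is hypothetical, and the overflow guard is an assumption about how very large requests might be handled; it is not taken from this page.

```cpp
#include <cassert>
#include <cstddef>

// Rounds `value` up to the nearest power of `base`.
// On return, rounded_bytes == base^power and rounded_bytes >= value,
// except in the overflow case noted below.
inline void nearest_power_of(unsigned int& power, std::size_t& rounded_bytes,
                             unsigned int base, std::size_t value) {
    assert(base > 1);
    const std::size_t max_size = static_cast<std::size_t>(-1);

    // Assumed guard: if multiplying by base could overflow, saturate.
    if (value > max_size / base) {
        power = static_cast<unsigned int>(sizeof(std::size_t) * 8);
        rounded_bytes = max_size;
        return;
    }

    power = 0;
    rounded_bytes = 1;
    while (rounded_bytes < value) {
        rounded_bytes *= base;
        ++power;
    }
}
```

With base 8, a request for 1000 bytes rounds up to 4096 bytes (8 ^ 4), while a request for exactly 512 bytes stays at 512 bytes (8 ^ 3).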

inline CachingDeviceAllocator(unsigned int bin_growth, unsigned int min_bin = 1, unsigned int max_bin = INVALID_BIN, size_t max_cached_bytes = INVALID_SIZE, bool skip_cleanup = false, bool debug = false)

Constructor.

Parameters
  • bin_growth – Geometric growth factor for bin-sizes

  • min_bin – Minimum bin (default is bin_growth ^ 1)

  • max_bin – Maximum bin (default is no max bin)

  • max_cached_bytes – Maximum aggregate cached bytes per device (default is no limit)

  • skip_cleanup – Whether or not to skip a call to FreeAllCached() when the destructor is called (default is to deallocate)

  • debug – Whether or not to print (de)allocation events to stdout (default is no output)

inline CachingDeviceAllocator(bool skip_cleanup = false, bool debug = false)

Default constructor.

Configured with:

  • bin_growth = 8

  • min_bin = 3

  • max_bin = 7

  • max_cached_bytes = ((bin_growth ^ max_bin) * 3) - 1 = 6,291,455 bytes

This delineates five bin sizes (512B, 4KB, 32KB, 256KB, and 2MB) and sets a maximum of 6,291,455 cached bytes per device.
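The arithmetic behind the default configuration can be checked with a short standalone sketch. The helpers `power_of`, `bin_sizes`, and `default_max_cached_bytes` are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes base^exp with a simple loop (no overflow checking).
inline std::size_t power_of(std::size_t base, unsigned int exp) {
    std::size_t result = 1;
    for (unsigned int i = 0; i < exp; ++i) result *= base;
    return result;
}

// Lists the size in bytes of every bin from min_bin to max_bin inclusive.
inline std::vector<std::size_t> bin_sizes(unsigned int bin_growth,
                                          unsigned int min_bin,
                                          unsigned int max_bin) {
    assert(min_bin <= max_bin);
    std::vector<std::size_t> sizes;
    for (unsigned int bin = min_bin; bin <= max_bin; ++bin) {
        sizes.push_back(power_of(bin_growth, bin));
    }
    return sizes;
}

// ((bin_growth ^ max_bin) * 3) - 1, as stated for the default constructor.
inline std::size_t default_max_cached_bytes(unsigned int bin_growth,
                                            unsigned int max_bin) {
    return power_of(bin_growth, max_bin) * 3 - 1;
}
```

For bin_growth = 8, min_bin = 3, and max_bin = 7, `bin_sizes` yields 512, 4096, 32768, 262144, and 2097152 bytes, and `default_max_cached_bytes` yields 6291455.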

inline hipError_t SetMaxCachedBytes(size_t max_cached_bytes)

Sets the limit on the number of bytes this allocator is allowed to cache per device.

Changing the ceiling of cached bytes does not cause any allocations (in-use or cached-in-reserve) to be freed. See FreeAllCached().
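The following standalone model illustrates the point that lowering the ceiling only affects future caching decisions. `CacheModel` and its members are hypothetical, and the policy in `try_cache` (a freed block is cached only if it fits under the ceiling) is an assumption, not something this page specifies.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical single-device model of the cached-bytes ceiling.
struct CacheModel {
    std::size_t max_cached_bytes;  // current ceiling
    std::size_t cached_bytes;      // bytes currently held in the cache

    // Changing the ceiling frees nothing; it only updates the limit.
    void set_max_cached_bytes(std::size_t limit) { max_cached_bytes = limit; }

    // Assumed policy: keep a freed block only if it fits under the ceiling.
    // Returns true if the block was cached.
    bool try_cache(std::size_t block_bytes) {
        if (cached_bytes > max_cached_bytes) return false;
        if (block_bytes > max_cached_bytes - cached_bytes) return false;
        cached_bytes += block_bytes;
        return true;
    }

    // Counterpart of FreeAllCached() in this model: drops everything cached.
    void free_all_cached() { cached_bytes = 0; }
};
```

In this model, a cache holding 8192 bytes still holds 8192 bytes after the ceiling drops to 1024. Only an explicit `free_all_cached()` call releases them.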

inline hipError_t DeviceAllocate(int device, void **d_ptr, size_t bytes, hipStream_t active_stream = 0)

Provides a suitable allocation of device memory for the given size on the specified device.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams once all prior work submitted to active_stream has completed.

Parameters
  • device – Device on which to place the allocation

  • d_ptr – Reference to pointer to the allocation

  • bytes – Minimum number of bytes for the allocation

  • active_stream – The stream to be associated with this allocation
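The stream-reuse rule described above can be expressed as a small predicate. This is a standalone model only: `StreamId`, `FreedBlock`, and `can_reuse` are hypothetical, and `prior_work_complete` stands in for whatever completion query the allocator performs on the stream's outstanding work.

```cpp
#include <cassert>

typedef int StreamId;  // hypothetical stand-in for hipStream_t

// Hypothetical view of a block that has been returned to the allocator.
struct FreedBlock {
    StreamId associated_stream;    // the active_stream given at allocation
    bool     prior_work_complete;  // has all earlier work on that stream finished?
};

// A freed block can be reused at once by its own stream, and by any other
// stream only after the earlier work on its stream has completed.
inline bool can_reuse(const FreedBlock& block, StreamId requesting_stream) {
    if (block.associated_stream == requesting_stream) return true;
    return block.prior_work_complete;
}
```

Reuse within the same stream is safe without waiting because work submitted to a single stream executes in order.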

inline hipError_t DeviceAllocate(void **d_ptr, size_t bytes, hipStream_t active_stream = 0)

Provides a suitable allocation of device memory for the given size on the current device.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams once all prior work submitted to active_stream has completed.

Parameters
  • d_ptr – Reference to pointer to the allocation

  • bytes – Minimum number of bytes for the allocation

  • active_stream – The stream to be associated with this allocation

inline hipError_t DeviceFree(int device, void *d_ptr)

Frees a live allocation of device memory on the specified device, returning it to the allocator.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams once all prior work submitted to active_stream has completed.

inline hipError_t DeviceFree(void *d_ptr)

Frees a live allocation of device memory on the current device, returning it to the allocator.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams once all prior work submitted to active_stream has completed.

inline hipError_t FreeAllCached()

Frees all cached device allocations on all devices.

inline virtual ~CachingDeviceAllocator()

Destructor.
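The interaction between the destructor and skip_cleanup can be modelled without any device code. `AllocatorModel` and `cleanup_calls_for` are hypothetical names; the counter exists only so the behaviour can be observed.

```cpp
#include <cassert>

// Hypothetical model: the destructor calls free_all_cached() unless
// skip_cleanup was set at construction.
struct AllocatorModel {
    const bool skip_cleanup;
    int*       free_all_calls;  // incremented on every free_all_cached() call

    AllocatorModel(bool skip, int* counter)
        : skip_cleanup(skip), free_all_calls(counter) {
        assert(counter != 0);
    }

    void free_all_cached() { ++(*free_all_calls); }

    ~AllocatorModel() {
        if (!skip_cleanup) free_all_cached();
    }
};

// Constructs and destroys a model, returning how many cleanup calls ran.
inline int cleanup_calls_for(bool skip_cleanup) {
    int calls = 0;
    {
        AllocatorModel model(skip_cleanup, &calls);
    }  // destructor runs here
    return calls;
}
```

Skipping cleanup is mainly relevant for statically declared allocators, whose destructors may run after the device runtime has already shut down.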

Public Members

std::mutex mutex

Mutex for thread-safety.

unsigned int bin_growth

Geometric growth factor for bin-sizes.

unsigned int min_bin

Minimum bin enumeration.

unsigned int max_bin

Maximum bin enumeration.

size_t min_bin_bytes

Minimum bin size.

size_t max_bin_bytes

Maximum bin size.

size_t max_cached_bytes

Maximum aggregate cached bytes per device.

const bool skip_cleanup

Whether or not to skip a call to FreeAllCached() when the destructor is called. (The CUDA runtime may have already shut down for statically declared allocators)

bool debug

Whether or not to print (de)allocation events to stdout.

GpuCachedBytes cached_bytes

Map of device ordinal to aggregate cached bytes on that device.

CachedBlocks cached_blocks

Set of cached device allocations available for reuse.

BusyBlocks live_blocks

Set of live device allocations currently in use.

Public Static Functions

static inline unsigned int IntPow(unsigned int base, unsigned int exp)

Integer pow function for unsigned base and exponent
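One common way to implement an integer power function is exponentiation by squaring, sketched below. The name `int_pow` is hypothetical, and this page does not state which algorithm the library actually uses.

```cpp
#include <cassert>

// Computes base^exp for unsigned operands by repeated squaring.
// Results that exceed the range of unsigned int wrap around.
inline unsigned int int_pow(unsigned int base, unsigned int exp) {
    unsigned int result = 1;
    while (exp > 0) {
        if (exp & 1u) result *= base;  // fold in the current bit of exp
        base *= base;                  // square for the next bit
        exp >>= 1;
    }
    return result;
}
```

This takes a number of multiplications proportional to the number of bits in exp rather than to exp itself. For example, `int_pow(8, 3)` gives 512 and `int_pow(8, 7)` gives 2097152, which match the smallest and largest default bin sizes.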

Public Static Attributes

static const unsigned int INVALID_BIN = (unsigned int)-1

Out-of-bounds bin.

static const size_t INVALID_SIZE = (size_t)-1

Invalid size.

static const int INVALID_DEVICE_ORDINAL = -1

Invalid device ordinal.
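Casting -1 to an unsigned type yields that type's largest value, which is why these constants work as "no limit" sentinels. The sketch below shows this in standalone form; `kInvalidBin`, `kInvalidSize`, `has_max_bin`, and `has_cache_limit` are hypothetical names, not part of the documented interface.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Mirrors of the documented sentinels, written the same way.
static const unsigned int kInvalidBin  = (unsigned int)-1;
static const std::size_t  kInvalidSize = (std::size_t)-1;

// True if a maximum bin was actually configured.
inline bool has_max_bin(unsigned int max_bin) {
    return max_bin != kInvalidBin;
}

// True if a per-device cache ceiling was actually configured.
inline bool has_cache_limit(std::size_t max_cached_bytes) {
    return max_cached_bytes != kInvalidSize;
}
```

Because no real bin index or byte count reaches these values in practice, comparing against them works as an "unset" check.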

struct BlockDescriptor

Descriptor for device memory allocations

Public Functions

inline BlockDescriptor(void *d_ptr, int device)
inline BlockDescriptor(int device)

Public Members

void *d_ptr
size_t bytes
unsigned int bin
int device
hipStream_t associated_stream
hipEvent_t ready_event

Public Static Functions

static inline bool PtrCompare(const BlockDescriptor &a, const BlockDescriptor &b)
static inline bool SizeCompare(const BlockDescriptor &a, const BlockDescriptor &b)
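A pointer-ordered set makes it cheap to find the record for a block that is being freed. The standalone sketch below shows the idea; `Block`, `ptr_compare`, and `release` are hypothetical, and the device-then-pointer ordering is an assumption about how PtrCompare behaves.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>

// Hypothetical stand-in for BlockDescriptor; only the fields used here.
struct Block {
    void*       d_ptr;
    std::size_t bytes;
    int         device;
};

typedef bool (*Compare)(const Block&, const Block&);

// Assumed ordering: by device first, then by address. std::less gives a
// well-defined total order even for unrelated pointers.
inline bool ptr_compare(const Block& a, const Block& b) {
    if (a.device != b.device) return a.device < b.device;
    return std::less<void*>()(a.d_ptr, b.d_ptr);
}

typedef std::multiset<Block, Compare> BusyBlocks;

// Removes the live block matching (device, d_ptr) and returns its size,
// or returns 0 if no such block is live.
inline std::size_t release(BusyBlocks& live, int device, void* d_ptr) {
    Block key = {d_ptr, 0, device};
    BusyBlocks::iterator it = live.find(key);
    if (it == live.end()) return 0;
    std::size_t bytes = it->bytes;
    live.erase(it);
    return bytes;
}
```

The lookup key leaves `bytes` at zero because the comparator only inspects the device and the pointer.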

class TotalBytes

Public Functions

inline TotalBytes()

Public Members

size_t free
size_t live
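The free and live counters can be pictured as per-device bookkeeping, as in the standalone sketch below. It assumes that free counts bytes held in the cache and live counts bytes currently handed out, which this page does not state explicitly. `on_allocate` and `on_free` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Mirror of the documented class: two counters, zero-initialised.
struct TotalBytes {
    std::size_t free;  // assumed: bytes cached and available for reuse
    std::size_t live;  // assumed: bytes currently in use
    TotalBytes() : free(0), live(0) {}
};

typedef std::map<int, TotalBytes> GpuCachedBytes;

// Records an allocation; if it was served from the cache, those bytes
// move from free to live.
inline void on_allocate(GpuCachedBytes& totals, int device,
                        std::size_t bytes, bool from_cache) {
    TotalBytes& t = totals[device];
    if (from_cache) {
        assert(t.free >= bytes);
        t.free -= bytes;
    }
    t.live += bytes;
}

// Records a free; if the block is kept in the cache, its bytes move
// from live to free.
inline void on_free(GpuCachedBytes& totals, int device,
                    std::size_t bytes, bool kept_in_cache) {
    TotalBytes& t = totals[device];
    assert(t.live >= bytes);
    t.live -= bytes;
    if (kept_in_cache) t.free += bytes;
}
```

Using `operator[]` on the map creates a zeroed entry the first time a device ordinal is seen, so no separate registration step is needed.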