Realm
A distributed, event-based tasking library
Realm::Cuda::GPUAllocation Class Reference

Class for managing the lifetime of a given GPU allocation. Because instances of this class own an underlying resource, they are not copyable; they must be transferred with std::move (invalidating the original variable) or accessed by reference. More...

#include <cuda_internal.h>

Public Member Functions

 GPUAllocation (void)=default
 
 GPUAllocation (GPUAllocation &&other) noexcept
 
 GPUAllocation (const GPUAllocation &)=delete
 
GPUAllocation & operator= (GPUAllocation &&) noexcept
 
GPUAllocation & operator= (const GPUAllocation &)=delete
 
 ~GPUAllocation ()
 
 operator bool (void) const
 
OsHandle get_os_handle (void) const
 Accessor for the file descriptor or win32 HANDLE associated with the allocation. This handle can be shared with other APIs or other processes and opened with GPUAllocation::open_handle.
 
bool get_ipc_handle (CUipcMemHandle &handle) const
 Retrieves the CUipcMemHandle for this allocation that can be used with GPUAllocation::open_ipc.
 
CUdeviceptr get_dptr (void) const
 Retrieves the base CUdeviceptr for the associated allocation that can be used to access the underlying memory of the allocation from the device or with CUDA APIs that take a CUdeviceptr.
 
GPU * get_gpu (void) const
 Retrieves the owning GPU.
 
size_t get_size (void) const
 Retrieves the size of the allocation.
 
template<typename T = void>
T * get_hptr (void) const
 Retrieves the CPU accessible base address for the allocation, or nullptr if there is no way to access this allocation from the CPU.
 

Static Public Member Functions

static void * get_win32_shared_attributes (void)
 Retrieves the default win32 shared attributes for creating a shared object that can be set in CUmemAllocationProp::win32Metadata and passed to GPUAllocation::allocate_mmap.
 
static GPUAllocation * allocate_dev (GPU *gpu, size_t size, bool peer_enabled=true, bool shareable=true)
 Allocates device-located memory for the given GPU with the given size and features.
 
static GPUAllocation * allocate_host (GPU *gpu, size_t size, bool peer_enabled=true, bool shareable=true, bool same_va=true)
 Allocates CPU-located memory for the given GPU with the given size and features.
 
static GPUAllocation * allocate_managed (GPU *gpu, size_t size)
 Allocates migratable memory that can be used with CUDA's managed memory APIs (cuMemPrefetchAsync, etc.).
 
static GPUAllocation * register_allocation (GPU *gpu, void *ptr, size_t size, bool peer_enabled=true)
 Creates an allocation that registers the given CPU address range with CUDA, making it accessible from the device.
 
static GPUAllocation * open_ipc (GPU *gpu, const CUipcMemHandle &mem_hdl)
 Retrieves the GPUAllocation given the CUipcMemHandle.
 
static GPUAllocation * open_handle (GPU *gpu, OsHandle hdl, size_t size, bool peer_enabled=true)
 Retrieves the GPUAllocation given the OsHandle.
 

Detailed Description

Class for managing the lifetime of a given GPU allocation. Because instances of this class own an underlying resource, they are not copyable; they must be transferred with std::move (invalidating the original variable) or accessed by reference.
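
The move-only ownership model can be sketched as follows. This is an illustrative example, not part of the API reference: it assumes a valid GPU * obtained from the Realm runtime, the Realm-internal cuda_internal.h header, and a hypothetical 1 MiB allocation size.

```cpp
#include <cassert>
#include <utility>
#include "cuda_internal.h" // Realm-internal header declaring GPUAllocation

using Realm::Cuda::GPU;
using Realm::Cuda::GPUAllocation;

void ownership_example(GPU *gpu)
{
  // Request a device allocation; the returned pointer is null on failure.
  GPUAllocation *alloc = GPUAllocation::allocate_dev(gpu, 1 << 20);
  if(alloc == nullptr)
    return;

  // Copying is deleted; ownership must be transferred with std::move.
  GPUAllocation moved = std::move(*alloc);
  assert(static_cast<bool>(moved)); // 'moved' now owns the resource
  // GPUAllocation copy = moved;    // ill-formed: copy constructor is deleted
  // '*alloc' has been invalidated and no longer owns the allocation.
}
```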

Constructor & Destructor Documentation

◆ GPUAllocation() [1/3]

Realm::Cuda::GPUAllocation::GPUAllocation ( void  )
default

◆ GPUAllocation() [2/3]

Realm::Cuda::GPUAllocation::GPUAllocation ( GPUAllocation &&  other)
noexcept

◆ GPUAllocation() [3/3]

Realm::Cuda::GPUAllocation::GPUAllocation ( const GPUAllocation &  )
delete

◆ ~GPUAllocation()

Realm::Cuda::GPUAllocation::~GPUAllocation ( )

Member Function Documentation

◆ allocate_dev()

static GPUAllocation * Realm::Cuda::GPUAllocation::allocate_dev ( GPU *  gpu,
size_t  size,
bool  peer_enabled = true,
bool  shareable = true 
)
static

Allocates device-located memory for the given GPU with the given size and features.

Parameters
gpu: GPU this allocation is destined for and for which its lifetime is assured
size: Size of the requested allocation
peer_enabled: True if the allocation needs to be accessible via all the GPU's peers
shareable: True if the allocation needs to be shareable
Returns
The GPUAllocation, or nullptr if unsuccessful
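
For example, a minimal sketch of allocating device scratch memory and retrieving its device pointer. This assumes a valid GPU * from the Realm runtime and the internal cuda_internal.h header; the helper name and parameter choices are illustrative, not part of the API.

```cpp
#include <cassert>
#include "cuda_internal.h" // Realm-internal header declaring GPUAllocation

using namespace Realm::Cuda;

CUdeviceptr alloc_scratch(GPU *gpu, size_t bytes)
{
  // Request device memory that peers may access but that need not be shared
  // with other processes.
  GPUAllocation *alloc = GPUAllocation::allocate_dev(
      gpu, bytes, /*peer_enabled=*/true, /*shareable=*/false);
  if(alloc == nullptr)
    return 0; // allocation failed

  assert(alloc->get_size() == bytes);
  return alloc->get_dptr(); // base device address of the allocation
}
```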

◆ allocate_host()

static GPUAllocation * Realm::Cuda::GPUAllocation::allocate_host ( GPU *  gpu,
size_t  size,
bool  peer_enabled = true,
bool  shareable = true,
bool  same_va = true 
)
static

Allocates CPU-located memory for the given GPU with the given size and features.

Parameters
gpu: GPU this allocation is destined for and for which its lifetime is assured
size: Size of the requested allocation
peer_enabled: True if the allocation needs to be accessible via all the GPU's peers
shareable: True if the allocation needs to be shareable
same_va: True if the allocation must have the same GPU and CPU virtual addresses (i.e. get_dptr() == get_hptr())
Returns
The GPUAllocation, or nullptr if unsuccessful

◆ allocate_managed()

static GPUAllocation * Realm::Cuda::GPUAllocation::allocate_managed ( GPU *  gpu,
size_t  size 
)
static

Allocates migratable memory that can be used with CUDA's managed memory APIs (cuMemPrefetchAsync, etc.).

Parameters
gpu: GPU this allocation is destined for and for which its lifetime is assured
size: Size of the requested allocation
Returns
The GPUAllocation, or nullptr if unsuccessful

◆ get_dptr()

CUdeviceptr Realm::Cuda::GPUAllocation::get_dptr ( void  ) const
inline

Retrieves the base CUdeviceptr for the associated allocation that can be used to access the underlying memory of the allocation from the device or with CUDA APIs that take a CUdeviceptr.

Note
This device pointer is relative to the owning GPU, not to other GPUs. This is typically not an issue unless CUDA's unified virtual addressing (UVA) is unavailable; UVA is available on almost all supported systems, and its absence is detected and handled elsewhere.
Returns
The device address to use.

◆ get_gpu()

GPU * Realm::Cuda::GPUAllocation::get_gpu ( void  ) const
inline

Retrieves the owning GPU.

Returns
GPU that owns this allocation

◆ get_hptr()

template<typename T = void>
T * Realm::Cuda::GPUAllocation::get_hptr ( void  ) const
inline

Retrieves the CPU accessible base address for the allocation, or nullptr if there is no way to access this allocation from the CPU.

Template Parameters
T: The type to cast the returned pointer to
Returns
The CPU visible address for accessing this allocation, or nullptr if no such access is possible
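
As an illustration, a sketch of writing through the CPU mapping when one exists. The function name and the zero-fill are illustrative assumptions, not part of the API; device-only allocations are skipped because get_hptr() returns nullptr for them.

```cpp
#include <cstddef>
#include "cuda_internal.h" // Realm-internal header declaring GPUAllocation

using namespace Realm::Cuda;

void zero_fill_if_host_visible(GPUAllocation &alloc)
{
  // get_hptr<T>() casts the host-visible base address, if one exists.
  float *hptr = alloc.get_hptr<float>();
  if(hptr == nullptr)
    return; // device-only allocation: no CPU access path

  size_t count = alloc.get_size() / sizeof(float);
  for(size_t i = 0; i < count; i++)
    hptr[i] = 0.0f; // zero-fill through the CPU mapping
}
```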

◆ get_ipc_handle()

bool Realm::Cuda::GPUAllocation::get_ipc_handle ( CUipcMemHandle &  handle) const
inline

Retrieves the CUipcMemHandle for this allocation that can be used with GPUAllocation::open_ipc.

Parameters
handle: The CUipcMemHandle associated with this allocation
Returns
True if this allocation has an IPC handle to retrieve, false otherwise.
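
A sketch of the export/import pair formed by get_ipc_handle and open_ipc. The function names are illustrative, and the transport of the handle between processes (e.g. over a socket or shared file) is deliberately elided.

```cpp
#include "cuda_internal.h" // Realm-internal header declaring GPUAllocation

using namespace Realm::Cuda;

// Exporting side: obtain the CUipcMemHandle, if the allocation supports IPC.
bool export_allocation(const GPUAllocation &alloc, CUipcMemHandle &handle)
{
  return alloc.get_ipc_handle(handle); // false if no IPC handle exists
}

// Importing side (in another process): map the same memory on a local GPU.
GPUAllocation *import_allocation(GPU *gpu, const CUipcMemHandle &handle)
{
  return GPUAllocation::open_ipc(gpu, handle); // nullptr on failure
}
```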

◆ get_os_handle()

OsHandle Realm::Cuda::GPUAllocation::get_os_handle ( void  ) const

Accessor for the file descriptor or win32 HANDLE associated with the allocation. This handle can be shared with other APIs or other processes and opened with GPUAllocation::open_handle.

Note
It is the caller's responsibility to close the handle afterward to prevent resource leaks.

◆ get_size()

size_t Realm::Cuda::GPUAllocation::get_size ( void  ) const
inline

Retrieves the size of the allocation.

Returns
The size of this allocation

◆ get_win32_shared_attributes()

static void * Realm::Cuda::GPUAllocation::get_win32_shared_attributes ( void  )
static

Retrieves the default win32 shared attributes for creating a shared object that can be set in CUmemAllocationProp::win32Metadata and passed to GPUAllocation::allocate_mmap.

Returns
A pointer to the default shared attributes. This pointer is internally managed and should never be freed.

◆ open_handle()

static GPUAllocation * Realm::Cuda::GPUAllocation::open_handle ( GPU *  gpu,
OsHandle  hdl,
size_t  size,
bool  peer_enabled = true 
)
static

Retrieves the GPUAllocation given the OsHandle.

Parameters
gpu: GPU this allocation is destined for and for which its lifetime is assured
hdl: The OsHandle, e.g. retrieved from GPUAllocation::get_os_handle
size: Size of the requested allocation
peer_enabled: True if this memory needs to be accessible by this GPU's peers
Returns
The GPUAllocation, or nullptr if unsuccessful

◆ open_ipc()

static GPUAllocation * Realm::Cuda::GPUAllocation::open_ipc ( GPU *  gpu,
const CUipcMemHandle &  mem_hdl 
)
static

Retrieves the GPUAllocation given the CUipcMemHandle.

Parameters
gpu: GPU this allocation is destined for and for which its lifetime is assured
mem_hdl: CUipcMemHandle, e.g. retrieved from GPUAllocation::get_ipc_handle
Returns
The GPUAllocation, or nullptr if unsuccessful

◆ operator bool()

Realm::Cuda::GPUAllocation::operator bool ( void  ) const
inline

◆ operator=() [1/2]

GPUAllocation & Realm::Cuda::GPUAllocation::operator= ( const GPUAllocation &  )
delete

◆ operator=() [2/2]

GPUAllocation & Realm::Cuda::GPUAllocation::operator= ( GPUAllocation &&  )
noexcept

◆ register_allocation()

static GPUAllocation * Realm::Cuda::GPUAllocation::register_allocation ( GPU *  gpu,
void *  ptr,
size_t  size,
bool  peer_enabled = true 
)
static

Creates an allocation that registers the given CPU address range with CUDA, making it accessible from the device.

Note
This object instance only manages the lifetime of the registration to CUDA. The given address range must be managed externally.
Parameters
gpu: GPU this allocation is destined for and for which its lifetime is assured
ptr: Base address to register
size: Size of the requested allocation
peer_enabled: True if this memory needs to be accessible by this GPU's peers
Returns
The GPUAllocation, or nullptr if unsuccessful
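
A sketch of the externally managed lifetime described in the note above. The std::vector backing store and the 1 MiB size are illustrative assumptions; the key point is that the GPUAllocation owns only the registration, not the memory itself.

```cpp
#include <vector>
#include "cuda_internal.h" // Realm-internal header declaring GPUAllocation

using namespace Realm::Cuda;

void register_example(GPU *gpu)
{
  // The storage is owned by the vector, not by the GPUAllocation: only the
  // lifetime of the CUDA registration is managed by the returned object.
  std::vector<unsigned char> buffer(1 << 20);
  GPUAllocation *reg =
      GPUAllocation::register_allocation(gpu, buffer.data(), buffer.size());
  if(reg == nullptr)
    return; // registration failed

  CUdeviceptr dptr = reg->get_dptr(); // device-visible address of the range
  (void)dptr;
  // 'buffer' must outlive 'reg': release the registration before the
  // vector's storage is freed.
}
```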

The documentation for this class was generated from the following file:
cuda_internal.h