
CUDA Driver API Reference


See also: cuCtxCreate, cuCtxDestroy, cuCtxGetApiVersion, cuCtxGetCacheConfig, cuCtxGetDevice, cuCtxGetFlags, cuCtxPopCurrent, cuCtxPushCurrent, cuCtxSetCacheConfig, cuCtxSetLimit, cuCtxSynchronize

CUresult cuCtxGetSharedMemConfig ( CUsharedconfig* pConfig ) returns the current shared memory configuration for the current context. Device runtime launches that violate the pending launch count limit fail and return cudaErrorLaunchPendingCountExceeded when cudaGetLastError() is called after the launch. For transfers from any host memory to any host memory, the function is fully synchronous with respect to the host. cuCtxEnablePeerAccess returns CUDA_ERROR_INVALID_CONTEXT if there is no current context, or if peerContext is not a valid context.
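As an illustration (not part of the original reference), here is a minimal driver API sketch that creates a context on device 0 and reads back the shared memory configuration with cuCtxGetSharedMemConfig; the CHECK macro is an added convenience, not a CUDA API:

#include <cuda.h>
#include <stdio.h>

/* Hypothetical convenience macro for error checking; not part of the CUDA API. */
#define CHECK(call) do { CUresult r = (call); if (r != CUDA_SUCCESS) { \
    fprintf(stderr, "%s failed: %d\n", #call, (int)r); return 1; } } while (0)

int main(void) {
    CUdevice dev;
    CUcontext ctx;
    CUsharedconfig cfg;

    CHECK(cuInit(0));
    CHECK(cuDeviceGet(&dev, 0));
    CHECK(cuCtxCreate(&ctx, 0, dev));        /* the new context becomes current */
    CHECK(cuCtxGetSharedMemConfig(&cfg));    /* query the current bank width */

    printf("shared memory bank config: %d\n", (int)cfg);

    CHECK(cuCtxDestroy(ctx));
    return 0;
}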

This can increase latency when waiting for the GPU, but can increase the performance of CPU threads performing work in parallel with the GPU. Currently, the flags parameter must be 0. CUresult cuDeviceGetCount ( int* count ) returns the number of compute-capable devices. If the memory reservations made by cuCtxSetLimit fail, cuCtxSetLimit returns CUDA_ERROR_OUT_OF_MEMORY, and the limit can be reset to a lower value.
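A minimal sketch of the pattern described above, assuming device 0 exists and using an arbitrary starting size for the printf FIFO; if cuCtxSetLimit cannot make the reservation it returns CUDA_ERROR_OUT_OF_MEMORY and a lower value is tried:

#include <cuda.h>
#include <stdio.h>

int main(void) {
    int count = 0;
    CUdevice dev;
    CUcontext ctx;

    cuInit(0);
    cuDeviceGetCount(&count);                 /* number of compute-capable devices */
    printf("compute-capable devices: %d\n", count);
    if (count == 0) return 0;

    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    /* Try to grow the printf FIFO; on CUDA_ERROR_OUT_OF_MEMORY, retry lower. */
    size_t want = 64 * 1024 * 1024;
    while (cuCtxSetLimit(CU_LIMIT_PRINTF_FIFO_SIZE, want) == CUDA_ERROR_OUT_OF_MEMORY
           && want > 1024 * 1024) {
        want /= 2;
    }

    size_t got = 0;
    cuCtxGetLimit(&got, CU_LIMIT_PRINTF_FIFO_SIZE);
    printf("printf FIFO size: %zu bytes\n", got);

    cuCtxDestroy(ctx);
    return 0;
}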

CUDA Runtime API

This reference covers the CUDA Driver API from the CUDA Toolkit v8.0 documentation. Available compute modes are as follows: CU_COMPUTEMODE_DEFAULT: Default mode - the device is not restricted and can have multiple CUDA contexts present at a single time. Labeling each memcpy function as simply synchronous or asynchronous is something of a misnomer, as each function may exhibit synchronous or asynchronous behavior depending on the arguments passed to it.
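As a sketch (assuming a device 0 is present), the compute mode can be read through the driver API with cuDeviceGetAttribute:

#include <cuda.h>
#include <stdio.h>

int main(void) {
    CUdevice dev;
    int mode = 0;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuDeviceGetAttribute(&mode, CU_DEVICE_ATTRIBUTE_COMPUTE_MODE, dev);

    if (mode == CU_COMPUTEMODE_DEFAULT)
        printf("device 0: default compute mode (multiple contexts allowed)\n");
    else
        printf("device 0: compute mode %d\n", mode);
    return 0;
}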

If there is no such device, cuDeviceGetCount() returns 0. The context is created with a usage count of 1, and the caller of cuCtxCreate() must call cuCtxDestroy() or cuCtxDetach() when done using the context.

CUresult cuDevicePrimaryCtxRetain ( CUcontext* pctx, CUdevice dev ). Parameters: pctx - returned context handle of the new context; dev - device for which the primary context is requested. Returns CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_DEVICE, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_OUT_OF_MEMORY, CUDA_ERROR_UNKNOWN. Description: retains the primary context on the device. Note that access granted by cuCtxEnablePeerAccess() is unidirectional: to access memory from the current context in peerContext, a separate symmetric call to cuCtxEnablePeerAccess() is required.
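A minimal sketch of the primary-context workflow (device 0 assumed): retain the primary context, bind it to the calling thread, and release it when done so the retain/release calls stay balanced:

#include <cuda.h>
#include <stdio.h>

int main(void) {
    CUdevice dev;
    CUcontext primary;

    cuInit(0);
    cuDeviceGet(&dev, 0);

    /* Retain the primary context; it is created if it does not yet exist. */
    if (cuDevicePrimaryCtxRetain(&primary, dev) != CUDA_SUCCESS) {
        fprintf(stderr, "failed to retain primary context\n");
        return 1;
    }
    cuCtxSetCurrent(primary);        /* bind it to this CPU thread */

    /* ... driver API work against the primary context ... */

    cuCtxSetCurrent(NULL);           /* unbind */
    cuDevicePrimaryCtxRelease(dev);  /* balance the retain */
    return 0;
}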

Changing the shared memory bank size will not increase shared memory usage or affect the occupancy of kernels, but may have major effects on performance. Functions: CUresult cuCtxCreate ( CUcontext* pctx, unsigned int flags, CUdevice dev ) creates a CUDA context. If pageable memory must first be staged to pinned memory, this is handled asynchronously by a worker thread. (The deprecated context management functions are documented at http://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX__DEPRECATED.html.) dstDevice - the destination device of the target link.

CUresult cuTexRefDestroy ( CUtexref hTexRef ) (deprecated). Parameters: hTexRef - texture reference to destroy. Returns CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE. Description: destroys the texture reference specified by hTexRef.

For cuDeviceGetCount: Parameters: count - returned number of compute-capable devices. Returns CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE. Description: returns in *count the number of devices with compute capability greater than or equal to 1.0. The nvidia-smi tool can be used to set the compute mode for devices. This setting does nothing on devices where the size of the L1 cache and shared memory are fixed.
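To illustrate the cache preference mentioned above (a sketch only, device 0 assumed), a context can request a larger L1 with cuCtxSetCacheConfig; on devices with a fixed L1/shared split the call is a no-op, and in any case it is only a preference:

#include <cuda.h>

int main(void) {
    CUdevice dev;
    CUcontext ctx;
    CUfunc_cache cfg;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    cuCtxSetCacheConfig(CU_FUNC_CACHE_PREFER_L1);  /* prefer L1 over shared memory */
    cuCtxGetCacheConfig(&cfg);                     /* read back the active preference */

    cuCtxDestroy(ctx);
    return 0;
}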

CUDA Driver API vs Runtime API

See also: cuCtxCreate, cuCtxGetApiVersion, cuCtxGetCacheConfig, cuCtxGetDevice, cuCtxGetFlags, cuCtxGetLimit, cuCtxPopCurrent, cuCtxPushCurrent, cuCtxSetCacheConfig, cuCtxSetLimit, cuCtxSynchronize

CUresult cuCtxGetApiVersion ( CUcontext ctx, unsigned int* version ) gets the context's API version. CU_CTX_LMEM_RESIZE_TO_MAX: instruct CUDA to not reduce local memory after resizing local memory for a kernel. cuCtxAttach() fails if there is no context current to the thread. For cuDeviceGetAttribute: Parameters: pi - returned device attribute value; attrib - device attribute to query; dev - device handle. Returns CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_INVALID_DEVICE. Description: returns in *pi the integer value of the attribute attrib on device dev.
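A small sketch (device 0 assumed; attributes chosen for illustration) that reads integer device attributes with cuDeviceGetAttribute and a context's API version with cuCtxGetApiVersion:

#include <cuda.h>
#include <stdio.h>

int main(void) {
    CUdevice dev;
    CUcontext ctx;
    int sms = 0, major = 0, minor = 0;
    unsigned int apiVersion = 0;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuDeviceGetAttribute(&sms, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, dev);
    cuDeviceGetAttribute(&major, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, dev);
    cuDeviceGetAttribute(&minor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, dev);
    printf("SMs: %d, compute capability: %d.%d\n", sms, major, minor);

    cuCtxCreate(&ctx, 0, dev);
    cuCtxGetApiVersion(ctx, &apiVersion);
    printf("context API version: %u\n", apiVersion);

    cuCtxDestroy(ctx);
    return 0;
}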

CUresult cuDeviceGetName ( char* name, int len, CUdevice dev ) returns an identifier string for the device. CU_CTX_BLOCKING_SYNC: instruct CUDA to block the CPU thread on a synchronization primitive when waiting for the GPU to finish work. If the context was created with the CU_CTX_SCHED_BLOCKING_SYNC flag, the CPU thread will block until the GPU context has finished its work. Note that this function may also return error codes from previous, asynchronous launches.
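A sketch that lists device names with cuDeviceGetName and then creates a context with the blocking-sync scheduling flag so waits block the CPU thread instead of spinning:

#include <cuda.h>
#include <stdio.h>

int main(void) {
    int count = 0;
    char name[256];
    CUdevice dev;
    CUcontext ctx;

    cuInit(0);
    cuDeviceGetCount(&count);
    for (int i = 0; i < count; ++i) {
        cuDeviceGet(&dev, i);
        cuDeviceGetName(name, sizeof(name), dev);   /* identifier string */
        printf("device %d: %s\n", i, name);
    }
    if (count == 0) return 0;

    cuDeviceGet(&dev, 0);
    /* CU_CTX_SCHED_BLOCKING_SYNC: block the CPU thread on a synchronization
     * primitive when waiting for the GPU to finish work. */
    cuCtxCreate(&ctx, CU_CTX_SCHED_BLOCKING_SYNC, dev);
    cuCtxSynchronize();            /* this wait blocks rather than busy-waits */
    cuCtxDestroy(ctx);
    return 0;
}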

Note that this function may also return error codes from previous, asynchronous launches. For cuCtxSetCurrent: Parameters: ctx - context to bind to the calling CPU thread. Returns CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT. Description: binds the specified CUDA context to the calling CPU thread. cuCtxSetSharedMemConfig does nothing on devices with a fixed shared memory bank size. Functions: CUresult cuTexRefCreate ( CUtexref* pTexRef ) creates a texture reference.
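A sketch of binding and unbinding a context on the calling CPU thread, first with cuCtxSetCurrent and then with the cuCtxPushCurrent/cuCtxPopCurrent pair (device 0 assumed):

#include <cuda.h>

int main(void) {
    CUdevice dev;
    CUcontext ctx, popped, current;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);   /* cuCtxCreate also makes ctx current */
    cuCtxPopCurrent(&popped);    /* detach it from this thread */

    cuCtxSetCurrent(ctx);        /* bind the context to the calling thread */
    cuCtxGetCurrent(&current);   /* current == ctx here */
    cuCtxSetCurrent(NULL);       /* unbind */

    cuCtxPushCurrent(ctx);       /* the push/pop style nests cleanly */
    cuCtxPopCurrent(&popped);

    cuCtxDestroy(ctx);
    return 0;
}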

srcDevice - the source device of the target link. Note that this function may also return error codes from previous, asynchronous launches. This is only a preference.

Parameters: pctx - returned context handle of the new context; flags - context creation flags; dev - device to create context on. Returns CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_DEVICE, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_OUT_OF_MEMORY, CUDA_ERROR_UNKNOWN.

The flags parameter is described below. For cuDevicePrimaryCtxSetFlags: Parameters: dev - device for which the primary context flags are set; flags - new flags for the device. Returns CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_DEVICE, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_PRIMARY_CONTEXT_ACTIVE. Description: sets the flags for the primary context on the device. For cuCtxDetach (deprecated): Parameters: ctx - context to destroy. Returns CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT. Note that this function is deprecated and should not be used. Functions: CUresult cuCtxDisablePeerAccess ( CUcontext peerContext ) disables direct access to memory allocations in a peer context and unregisters any registered allocations.
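A peer-access sketch, assuming two peer-capable devices: access is enabled from the current context into the peer context, used, and then disabled with cuCtxDisablePeerAccess:

#include <cuda.h>
#include <stdio.h>

int main(void) {
    CUdevice dev0, dev1;
    CUcontext ctx0, ctx1;
    int canAccess = 0, count = 0;

    cuInit(0);
    cuDeviceGetCount(&count);
    if (count < 2) { printf("need two devices\n"); return 0; }

    cuDeviceGet(&dev0, 0);
    cuDeviceGet(&dev1, 1);
    cuDeviceCanAccessPeer(&canAccess, dev0, dev1);
    if (!canAccess) { printf("devices are not peer-capable\n"); return 0; }

    cuCtxCreate(&ctx1, 0, dev1);      /* peer context on device 1 */
    cuCtxCreate(&ctx0, 0, dev0);      /* current context on device 0 */

    /* Access is unidirectional: this grants ctx0 access to ctx1's memory;
     * the reverse direction needs a symmetric call made from ctx1. */
    cuCtxEnablePeerAccess(ctx1, 0);   /* flags must currently be 0 */

    /* ... copies/kernels that touch ctx1 allocations from ctx0 ... */

    cuCtxDisablePeerAccess(ctx1);
    cuCtxDestroy(ctx0);
    cuCtxDestroy(ctx1);
    return 0;
}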

The function will return once the pageable buffer has been copied to the staging memory for DMA transfer to device memory, but the DMA to the final destination may not have completed. In the reference documentation, each memcpy function is categorized as synchronous or asynchronous, corresponding to the definitions below. For cuCtxGetCurrent: if no context is bound to the calling CPU thread, then *pctx is set to NULL and CUDA_SUCCESS is returned. Note that this function may also return error codes from previous, asynchronous launches.
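A sketch of an asynchronous host-to-device copy (device 0 and an arbitrary 1 MiB buffer assumed). With page-locked host memory the copy can overlap host work; with pageable memory the call may return after staging, before the DMA has finished:

#include <cuda.h>
#include <string.h>

int main(void) {
    CUdevice dev;
    CUcontext ctx;
    CUstream stream;
    CUdeviceptr dptr;
    void* hptr;
    const size_t bytes = 1 << 20;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);
    cuStreamCreate(&stream, CU_STREAM_DEFAULT);

    cuMemAllocHost(&hptr, bytes);     /* page-locked host buffer */
    cuMemAlloc(&dptr, bytes);
    memset(hptr, 0, bytes);

    cuMemcpyHtoDAsync(dptr, hptr, bytes, stream);  /* returns to the host early */
    /* ... host work that overlaps the transfer ... */
    cuStreamSynchronize(stream);                   /* wait for the DMA to finish */

    cuMemFree(dptr);
    cuMemFreeHost(hptr);
    cuStreamDestroy(stream);
    cuCtxDestroy(ctx);
    return 0;
}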

The nvidia-smi tool can be used to set the compute mode for devices. The supported bank configurations are: CU_SHARED_MEM_CONFIG_DEFAULT_BANK_SIZE: set bank width to the default initial setting (currently, four bytes); CU_SHARED_MEM_CONFIG_FOUR_BYTE_BANK_SIZE: set shared memory bank width to be natively four bytes; CU_SHARED_MEM_CONFIG_EIGHT_BYTE_BANK_SIZE: set shared memory bank width to be natively eight bytes. CU_LIMIT_PRINTF_FIFO_SIZE controls the size in bytes of the FIFO used by the printf() device system call.
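A sketch (device 0 assumed) that switches the current context to eight-byte shared memory banks and reads the setting back; on devices with a fixed bank size the call does nothing:

#include <cuda.h>
#include <stdio.h>

int main(void) {
    CUdevice dev;
    CUcontext ctx;
    CUsharedconfig cfg;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    cuCtxSetSharedMemConfig(CU_SHARED_MEM_CONFIG_EIGHT_BYTE_BANK_SIZE);
    cuCtxGetSharedMemConfig(&cfg);
    printf("bank config now: %d\n", (int)cfg);

    cuCtxDestroy(ctx);
    return 0;
}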
