yacx-YetAnotherCudaExecutor  0.6.2
wrapper to easily compile and execute cuda kernels
yacx::Kernel Class Reference

Class to help launch and configure a CUDA kernel.

#include <Kernel.hpp>

Inherits yacx::JNIHandle.

Public Member Functions

 Kernel (std::shared_ptr< char[]> ptx, std::string demangled_name)
 
Kernel & configure (dim3 grid, dim3 block, unsigned int shared=0)
 
KernelTime launch (KernelArgs args, Device &device=Devices::findDevice())
 
std::vector< KernelTime > benchmark (KernelArgs args, unsigned int executions, Device &device=Devices::findDevice())
 

Detailed Description

Class to help launch and configure a CUDA kernel.

Examples
docs/kernel_launch.cpp, example_gauss.cpp, example_matrix_multiply.cpp, example_saxpy.cpp, and example_template.cpp.

Constructor & Destructor Documentation

◆ Kernel()

Kernel::Kernel ( std::shared_ptr< char[]>  ptx,
std::string  demangled_name 
)

Creates a Kernel based on a templated kernel string.

Parameters
ptx  PTX code of the kernel
demangled_name  name of the kernel

Member Function Documentation

◆ benchmark()

std::vector< KernelTime > Kernel::benchmark ( KernelArgs  args,
unsigned int  executions,
Device & device = Devices::findDevice() 
)

Benchmarks a Kernel by executing it repeatedly.

Parameters
args  kernel arguments
executions  number of executions
device  device on which the kernel is executed
Returns
vector of KernelTimes for every execution
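
The "one result per execution" shape of the return value can be illustrated without a GPU. The following is a minimal sketch only: `benchmark_sketch` is a hypothetical helper that is not part of yacx, and it times an arbitrary callable on the host with `std::chrono`, which is an assumption made for illustration and not a description of how yacx measures kernel time.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical helper, not part of yacx: runs `work` `executions` times and
// returns one host-side wall-clock duration in milliseconds per run.
std::vector<double> benchmark_sketch(const std::function<void()> &work,
                                     unsigned int executions) {
  assert(work && "work must be callable");
  std::vector<double> times_ms;
  times_ms.reserve(executions);
  for (unsigned int i = 0; i < executions; ++i) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    // Convert the elapsed interval to fractional milliseconds.
    times_ms.push_back(
        std::chrono::duration<double, std::milli>(stop - start).count());
  }
  return times_ms;
}
```

The returned vector always has exactly `executions` entries, which makes it straightforward to compute a mean or to discard the first run as a warm-up.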

◆ configure()

Kernel & Kernel::configure ( dim3  grid,
dim3  block,
unsigned int  shared = 0 
)
Parameters
grid  vector of grid dimensions
block  vector of block dimensions
shared  amount of dynamic shared memory to allocate
Returns
this (for method chaining)
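
Returning a reference to the object is what makes chaining possible. The sketch below shows that pattern together with the usual round-up computation for a one-dimensional grid. `Dim3`, `KernelSketch`, and `blocks_for` are hypothetical stand-ins introduced here so the example compiles without the CUDA toolkit; they mirror only the shape of `configure` and are not the yacx implementation.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's dim3 (unspecified dimensions default to 1).
struct Dim3 {
  unsigned int x, y, z;
  Dim3(unsigned int x_ = 1, unsigned int y_ = 1, unsigned int z_ = 1)
      : x(x_), y(y_), z(z_) {}
};

// Hypothetical stand-in that mirrors only the signature of Kernel::configure.
class KernelSketch {
 public:
  // Stores the launch configuration and returns *this so calls can be chained.
  KernelSketch &configure(Dim3 grid, Dim3 block, unsigned int shared = 0) {
    m_grid = grid;
    m_block = block;
    m_shared = shared;
    return *this;
  }
  Dim3 grid() const { return m_grid; }
  Dim3 block() const { return m_block; }
  unsigned int shared() const { return m_shared; }

 private:
  Dim3 m_grid;
  Dim3 m_block;
  unsigned int m_shared = 0;
};

// Number of blocks needed to cover n elements, rounding up.
unsigned int blocks_for(unsigned int n, unsigned int threads_per_block) {
  assert(threads_per_block > 0);
  return n / threads_per_block + (n % threads_per_block != 0 ? 1u : 0u);
}
```

With this shape, a call such as `k.configure(Dim3(blocks_for(n, 256)), Dim3(256))` can be followed directly by another member call on the same object.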

◆ launch()

KernelTime Kernel::launch ( KernelArgs  args,
Device & device = Devices::findDevice() 
)
Parameters
args  kernel arguments
device  device on which the kernel is launched
Returns
KernelTime
