Peano 4
tarch::accelerator Namespace Reference

Namespaces

namespace  cpp
 
namespace  cuda
 
namespace  omp
 
namespace  sycl
 

Data Structures

struct  BenchmarkResultInNanoSeconds
 Struct representing benchmark results in nanoseconds. More...
 
class  Device
 Core. More...
 

Functions

std::vector< std::vector< std::string > > queryDeviceInformation ()
 Queries device information and runs benchmarks on the available accelerators.
 
void printDeviceInformation ()
 Prints device information based on the available accelerator backend.
 
size_t getPropertyOffset (const char name[], const std::vector< std::string > &identifiers=properties)
 Retrieves the offset of a property in a vector of identifiers.
 
std::string parseMatrix (const std::vector< std::vector< std::string > > &propertyMatrix, int numberOfDevices, const std::vector< std::string > &identifiers=properties)
 Parses a matrix of properties into a formatted string representation.
 
void offloadCapabilityTest ()
 Performs offload capability tests for the available GPU accelerator backend.
 
void runBenchmarks (std::vector< std::vector< std::string > > &propertyMatrix)
 Runs benchmarks for the available GPU accelerator backend and populates the property matrix.
 
void parseBenchmarkResults (std::vector< BenchmarkResultInNanoSeconds > &benchmarkResults, std::vector< std::vector< std::string > > &propertyMatrix)
 Parses benchmark results and updates the string representations in the property matrix.
 
static tarch::logging::Log _log ("tarch::accelerator")
 

Variables

const std::vector< std::string > properties
 Vector of strings representing various properties.
 
constexpr uint64_t repeats = 10
 Number of repeats of the streaming benchmarks; N is the number of elements per vector.
 
constexpr size_t alloc_size = 1024*1024*1024
 
constexpr size_t N = alloc_size / sizeof(double)
 

Function Documentation

◆ _log()

static tarch::logging::Log tarch::accelerator::_log ( "tarch::accelerator" )
static

◆ getPropertyOffset()

size_t tarch::accelerator::getPropertyOffset ( const char name[],
const std::vector< std::string > & identifiers = properties )

Retrieves the offset of a property in a vector of identifiers.

By default, the static properties vector within this namespace is used.

This function searches for the given property name in the vector of identifiers. If the property name is found, the function returns its offset (index) in the vector. If the property name is not found, a runtime_error is thrown.

Parameters
name The name of the property to retrieve the offset for.
identifiers The vector of identifiers containing the property names.
Returns
The offset (index) of the property in the vector of identifiers.
Exceptions
std::runtime_error If the property name is not found in the vector of identifiers.

Definition at line 69 of file DeviceInfo.cpp.

References it.

Referenced by parseBenchmarkResults().
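
A minimal usage sketch (the include path, the wrapper function, and the matrix layout mentioned in the comments are assumptions for illustration, not part of the API):

#include <cstddef>
#include <stdexcept>
#include "tarch/accelerator/DeviceInfo.h" // assumed include path for this namespace

void lookupBandwidthRowSketch() {
  try {
    // Row index of the "Bandwidth" property in the default identifier vector.
    const std::size_t row = tarch::accelerator::getPropertyOffset("Bandwidth");
    (void)row; // propertyMatrix[row][device] would address one device's bandwidth entry
  }
  catch (const std::runtime_error&) {
    // Reached only if "Bandwidth" is not contained in the identifier vector.
  }
}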


◆ offloadCapabilityTest()

void tarch::accelerator::offloadCapabilityTest ( )

Performs offload capability tests for the available GPU accelerator backend.

The specific tests are executed via the respective backend's testKernelLaunch function. The results are evaluated to determine whether the GPU kernels can be launched successfully. The launched kernels are dummy (hello world) kernels whose only purpose is to check whether the launch fails.

Exceptions
std::runtime_error If the offload capability tests fail.
See also
tarch::accelerator::cpp::offloadCapabilityTest
tarch::accelerator::cuda::offloadCapabilityTest
tarch::accelerator::sycl::offloadCapabilityTest
tarch::accelerator::omp::offloadCapabilityTest

Definition at line 160 of file DeviceInfo.cpp.

References logInfo, tarch::accelerator::cpp::testKernelLaunch(), tarch::accelerator::cuda::testKernelLaunch(), tarch::accelerator::omp::testKernelLaunch(), and tarch::accelerator::sycl::testKernelLaunch().

Referenced by peano4::runTestsAndBenchmarks().
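
A minimal sketch of the kind of dummy launch such a test boils down to, written here with OpenMP target offloading; the function names are placeholders and the real backend testKernelLaunch implementations may differ:

#include <stdexcept>

// Placeholder for a backend's testKernelLaunch: run a trivial kernel and only
// report whether the offloaded region executed at all.
bool dummyKernelLaunchSketch() {
  int executed = 0;
  #pragma omp target map(tofrom: executed)
  {
    executed = 1;
  }
  return executed == 1;
}

void offloadCapabilityCheckSketch() {
  if (not dummyKernelLaunchSketch()) {
    throw std::runtime_error("GPU kernel launch failed");
  }
}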


◆ parseBenchmarkResults()

void tarch::accelerator::parseBenchmarkResults ( std::vector< BenchmarkResultInNanoSeconds > & benchmarkResults,
std::vector< std::vector< std::string > > & propertyMatrix )

Parses benchmark results and updates the string representations in the property matrix.

This function takes a vector of benchmark results (one result struct per GPU) and updates the corresponding string representations in the already initialized propertyMatrix.

Parameters
benchmarkResults The vector of benchmark results to parse.
propertyMatrix The property matrix to update with the string representations.

Definition at line 234 of file DeviceInfo.cpp.

References getPropertyOffset().

Referenced by runBenchmarks().
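
The gist of the update step as a hedged sketch; the bandwidth values, the unit string, and the function name are assumptions for illustration:

#include <cstddef>
#include <string>
#include <vector>
#include "tarch/accelerator/DeviceInfo.h" // assumed include path

// One bandwidth value per GPU is turned into a string and written into the
// row selected by getPropertyOffset("Bandwidth"); columns are devices.
void writeBandwidthRowSketch(
  const std::vector<double>&             bandwidthInGBPerSecond,
  std::vector<std::vector<std::string>>& propertyMatrix
) {
  const std::size_t row = tarch::accelerator::getPropertyOffset("Bandwidth");
  for (std::size_t device = 0; device < bandwidthInGBPerSecond.size(); device++) {
    propertyMatrix[row][device] = std::to_string(bandwidthInGBPerSecond[device]) + " GB/s";
  }
}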


◆ parseMatrix()

std::string tarch::accelerator::parseMatrix ( const std::vector< std::vector< std::string > > & propertyMatrix,
int numberOfDevices,
const std::vector< std::string > & identifiers = properties )

Parses a matrix of properties into a formatted string representation.

This function takes a matrix of properties, the number of devices, and a vector of identifiers. It formats the properties into a string representation, including headers and proper padding. In the matrix, each column holds the properties of one device, in the same order as tarch::accelerator::properties.

Note
The function is intended to be used with queryDeviceInformation() and parseBenchmarkResults(...).
Parameters
propertyMatrix The matrix of properties to parse.
numberOfDevices The number of devices.
identifiers The vector of identifiers containing the property names.
Returns
The formatted string representation of the property matrix.

Definition at line 77 of file DeviceInfo.cpp.

References s.
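
A small usage sketch with made-up values; the matrix follows the row-per-property, column-per-device layout, and the identifiers argument falls back to the default properties vector:

#include <iostream>
#include <string>
#include <vector>
#include "tarch/accelerator/DeviceInfo.h" // assumed include path

void printExampleTableSketch() {
  // Two devices, six rows in the same order as tarch::accelerator::properties.
  const std::vector<std::vector<std::string>> propertyMatrix = {
    {"Device A", "Device B"}, // Device Name
    {"16 GB",    "24 GB"},    // Global Memory
    {"15 GB",    "23 GB"},    // Free Memory
    {"1.4 GHz",  "1.7 GHz"},  // Clock Frequency
    {"60",       "108"},      // Compute Units
    {"n/a",      "n/a"}       // Bandwidth (filled in later by the benchmarks)
  };
  std::cout << tarch::accelerator::parseMatrix(propertyMatrix, 2) << std::endl;
}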

◆ printDeviceInformation()

void tarch::accelerator::printDeviceInformation ( )

Prints device information based on the available accelerator backend.

This function retrieves the device information via the queryDeviceInformation and runBenchmarks functions and prints it using the appropriate backend, selected through preprocessor directives (the same backend as in queryDeviceInformation). Individual implementations are necessary because some backends hide the number of accelerators from the user; this renders a table output useless, so each backend makes its own output decisions.

See also
queryDeviceInformation
tarch::accelerator::cpp::printDeviceInformation
tarch::accelerator::cuda::printDeviceInformation
tarch::accelerator::sycl::printDeviceInformation
tarch::accelerator::omp::printDeviceInformation

Definition at line 49 of file DeviceInfo.cpp.

References tarch::accelerator::cpp::printDeviceInformation(), tarch::accelerator::cuda::printDeviceInformation(), tarch::accelerator::omp::printDeviceInformation(), tarch::accelerator::sycl::printDeviceInformation(), and queryDeviceInformation().

Referenced by peano4::runTestsAndBenchmarks().
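
The dispatch idea as a compile-time skeleton; the macro names below are placeholders rather than Peano's actual build flags, and the backend calls are indicated as comments only:

// Placeholder macros: substitute whatever the build system actually defines.
#if defined(BACKEND_CUDA)
  // delegate to tarch::accelerator::cuda::printDeviceInformation
#elif defined(BACKEND_SYCL)
  // delegate to tarch::accelerator::sycl::printDeviceInformation
#elif defined(BACKEND_OMP)
  // delegate to tarch::accelerator::omp::printDeviceInformation
#else
  // fall back to tarch::accelerator::cpp::printDeviceInformation
#endif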


◆ queryDeviceInformation()

std::vector< std::vector< std::string > > tarch::accelerator::queryDeviceInformation ( )

Queries device information and runs benchmarks on the available accelerators.

This function retrieves information about the available accelerators and performs benchmarking using the appropriate backend based on the preprocessor directives. The device information is stored in a vector of vectors of strings, where each inner vector represents the information of a single property (a column in the matrix then belongs to one device, and one row belongs to one property).

If the CUDA API is available, it is always preferred, as it provides the most information (alongside SYCL). As of July 2023, C++ and OpenMP GPU offloading hide almost all device information from the user.

Returns
The device information as a Matrix (vector<vector>) of strings.
See also
tarch::accelerator::cpp::queryDeviceInformation
tarch::accelerator::cuda::queryDeviceInformation
tarch::accelerator::sycl::queryDeviceInformation
tarch::accelerator::omp::queryDeviceInformation

Definition at line 28 of file DeviceInfo.cpp.

References tarch::accelerator::cpp::queryDeviceInformation(), tarch::accelerator::cuda::queryDeviceInformation(), tarch::accelerator::omp::queryDeviceInformation(), tarch::accelerator::sycl::queryDeviceInformation(), and runBenchmarks().

Referenced by printDeviceInformation().
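
A usage sketch illustrating the row-per-property, column-per-device layout of the returned matrix (the include path and the wrapper function are assumptions):

#include <cstddef>
#include <iostream>
#include "tarch/accelerator/DeviceInfo.h" // assumed include path

void listDeviceNamesSketch() {
  const auto propertyMatrix = tarch::accelerator::queryDeviceInformation();

  const std::size_t nameRow = tarch::accelerator::getPropertyOffset("Device Name");
  const std::size_t devices = propertyMatrix.empty() ? 0 : propertyMatrix[nameRow].size();

  for (std::size_t device = 0; device < devices; device++) {
    std::cout << "GPU " << device << ": " << propertyMatrix[nameRow][device] << std::endl;
  }
}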


◆ runBenchmarks()

void tarch::accelerator::runBenchmarks ( std::vector< std::vector< std::string > > & propertyMatrix)

Runs benchmarks for the available GPU accelerator backend and populates the property matrix.

The propertyMatrix is assumed to be properly initialized already (this is done by queryDeviceInformation).

Every backend implements a memory-bound vector copy benchmark: vector B is assigned to vector A, where both are roughly 1 GB in size. The benchmark performs 10 repeats; the result of the first repeat is discarded and the remaining 9 results are averaged.

Parameters
propertyMatrix The matrix to store the benchmark results in.
See also
tarch::accelerator::cpp::runBenchmarks
tarch::accelerator::cuda::runBenchmarks
tarch::accelerator::sycl::runBenchmarks
tarch::accelerator::omp::runBenchmarks

Definition at line 217 of file DeviceInfo.cpp.

References parseBenchmarkResults(), tarch::accelerator::cpp::runBenchmarks(), tarch::accelerator::cuda::runBenchmarks(), tarch::accelerator::omp::runBenchmarks(), and tarch::accelerator::sycl::runBenchmarks().

Referenced by queryDeviceInformation().
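
A host-only sketch of the measurement logic described above (first repeat discarded as warm-up, remaining repeats averaged); the real backends run the copy on the device, and the function name here is hypothetical:

#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>
#include "tarch/accelerator/DeviceInfo.h" // assumed include path

uint64_t averageCopyTimeInNanoSecondsSketch() {
  using tarch::accelerator::N;
  using tarch::accelerator::repeats;

  std::vector<double> A(N), B(N, 1.0); // two vectors of roughly 1 GB each
  uint64_t sum = 0;

  for (uint64_t repeat = 0; repeat < repeats; repeat++) {
    const auto start = std::chrono::high_resolution_clock::now();
    for (std::size_t i = 0; i < N; i++) {
      A[i] = B[i]; // the backends offload this copy to the GPU
    }
    const auto end = std::chrono::high_resolution_clock::now();
    if (repeat > 0) { // discard the first (warm-up) repeat
      sum += std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
    }
  }
  return sum / (repeats - 1); // average of the remaining 9 repeats
}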


Variable Documentation

◆ alloc_size

constexpr size_t tarch::accelerator::alloc_size = 1024*1024*1024
constexpr

Definition at line 62 of file DeviceInfo.h.

◆ N

constexpr size_t tarch::accelerator::N = alloc_size / sizeof(double)
constexpr

Definition at line 63 of file DeviceInfo.h.
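
For reference, the arithmetic behind this constant (assuming the usual 8-byte double):

// 1 GiB worth of doubles: 1073741824 bytes / 8 bytes per double = 134217728 elements.
static_assert(tarch::accelerator::N == 134217728ull,
              "each benchmark vector holds 2^27 doubles on 8-byte-double platforms");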

◆ properties

const std::vector<std::string> tarch::accelerator::properties
Initial value:
= {
"Device Name",
"Global Memory",
"Free Memory",
"Clock Frequency",
"Compute Units",
"Bandwidth"}

Vector of strings representing various properties.

This variable is a constant vector of strings that stores the names of the GPU properties that are queried.

Using a struct per device could be cleaner, but this representation makes it easier to build the matrix with more freedom. Switching to a cleaner layout is a possible future todo, since only one table is printed (the initial design foresaw multiple tables, and this layout is a remnant of that design).

Note
This vector is intended to be used in conjunction with other functions or classes that require a list of properties.

Definition at line 22 of file DeviceInfo.h.

◆ repeats

constexpr uint64_t tarch::accelerator::repeats = 10
constexpr

Number of repeats of the streaming benchmarks; N is the number of elements per vector.

Definition at line 61 of file DeviceInfo.h.