libcudf  0.7
cudf Namespace Reference

Invokes an instance of a functor template with the appropriate type determined by a gdf_dtype enum value.

Classes

struct  cuda_error
 Exception thrown when a CUDA error is encountered.
 
struct  DeviceAnd
 
struct  DeviceMax
 
struct  DeviceMin
 
struct  DeviceOr
 
struct  DeviceProduct
 
struct  DeviceSum
 
struct  DeviceXor
 
struct  logic_error
 Exception thrown when a logical precondition is violated.
 
struct  table
 A wrapper for a set of gdf_columns with an equal number of rows.
 

Typedefs

using category = detail::wrapper< gdf_category, GDF_CATEGORY >
 
using nvstring_category = detail::wrapper< gdf_nvstring_category, GDF_STRING_CATEGORY >
 
using timestamp = detail::wrapper< gdf_timestamp, GDF_TIMESTAMP >
 
using date32 = detail::wrapper< gdf_date32, GDF_DATE32 >
 
using date64 = detail::wrapper< gdf_date64, GDF_DATE64 >
 
using bool8 = detail::wrapper< gdf_bool8, GDF_BOOL8 >
 

Functions

rmm::device_vector< bit_mask::bit_mask_t > row_bitmask (cudf::table const &table, cudaStream_t stream=0)
 Computes a bitmask indicating the presence of NULL values in rows of a table.
 
void gather (table const *source_table, gdf_index_type const gather_map[], table *destination_table)
 Gathers the rows (including null values) of a set of source columns into a set of destination columns.
 
void scatter (table const *source_table, gdf_index_type const scatter_map[], table *destination_table)
 Scatters the rows (including null values) of a set of source columns into a set of destination columns.
 
gdf_scalar reduction (const gdf_column *col, gdf_reduction_op op, gdf_dtype output_dtype)
 Computes the reduction of the values in all rows of a column. This function does not detect overflows in reductions. Using a higher precision dtype may prevent overflow. Only min and max ops are supported for reduction of non-arithmetic types (date32, timestamp, category...). The null values are skipped for the operation. If the column is empty, the member is_valid of the output gdf_scalar will contain false.
 
void scan (const gdf_column *input, gdf_column *output, gdf_scan_op op, bool inclusive)
 Computes the scan (a.k.a. prefix sum) of a column. The null values are skipped for the operation, and if an input element at i is null, then the output element at i will also be null.
 
gdf_column * rolling_window (const gdf_column &input_col, gdf_size_type window, gdf_size_type min_periods, gdf_size_type forward_window, gdf_agg_op agg_type, const gdf_size_type *window_col, const gdf_size_type *min_periods_col, const gdf_size_type *forward_window_col, cudaStream_t stream)
 
std::vector< gdf_dtype > column_dtypes (cudf::table const &table)
 Returns a vector of the dtypes of the columns in a table.
 
bool has_nulls (cudf::table const &table)
 Indicates if a table contains any null values.
 
template<typename T , typename BinaryOp >
__forceinline__ __device__ T genericAtomicOperation (T *address, T const &update_value, BinaryOp op)
 Reads the old value located at the address in global or shared memory, computes 'BinaryOp'('old', 'update_value'), and stores the result back to memory at the same address. These three operations are performed in one atomic transaction.
 
template<class functor_t , typename... Ts>
decltype(auto) CUDA_HOST_DEVICE_CALLABLE type_dispatcher (gdf_dtype dtype, functor_t f, Ts &&... args)
 
template<typename T >
constexpr gdf_dtype gdf_dtype_of ()
 Maps a C++ type to its corresponding gdf_dtype.
 
template<>
constexpr gdf_dtype gdf_dtype_of< int8_t > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< int16_t > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< int32_t > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< int64_t > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< float > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< double > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< cudf::bool8 > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< cudf::date32 > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< cudf::date64 > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< cudf::timestamp > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< cudf::category > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< cudf::nvstring_category > ()
 
template<>
constexpr gdf_dtype gdf_dtype_of< NVStrings > ()
 

Detailed Description

Invokes an instance of a functor template with the appropriate type determined by a gdf_dtype enum value.

This helper function accepts any object with an "operator()" template, e.g., a functor. It will invoke an instance of the template by passing in as the template argument an appropriate type determined by the value of the gdf_dtype argument.

The template may have 1 or more template parameters, but the first parameter must be the type dispatched from the gdf_dtype enum. The remaining template parameters must be able to be automatically deduced.

There is a 1-to-1 mapping of gdf_dtype enum values and dispatched types. However, different gdf_dtype values may have the same underlying type. Therefore, in order to provide the 1-to-1 mapping, a wrapper struct may be dispatched for certain gdf_dtype enum values in order to emulate a "strong typedef".

A strong typedef provides a new, concrete type unlike a normal C++ typedef which is simply a type alias. These "strong typedef" structs simply wrap a single member variable of a fundamental type called 'value'.

The standard arithmetic operators are defined for the wrapper structs and therefore the wrapper struct types can be used as if they were fundamental types.

See wrapper_types.hpp for more detail.

Example usage with a functor that returns the size of the dispatched type:

struct example_functor {
  template <typename T>
  int operator()() { return sizeof(T); }
};

// Pass an instance of the functor, not the type name
cudf::type_dispatcher(GDF_INT8, example_functor{});  // returns 1
cudf::type_dispatcher(GDF_INT64, example_functor{}); // returns 8

Example usage of a functor for checking if element "i" in column "lhs" is equal to element "j" in column "rhs":

struct elements_are_equal {
  template <typename ColumnType>
  bool operator()(void const * lhs, int i, void const * rhs, int j) {
    // Cast the void* data buffer to the dispatched type and retrieve elements
    // "i" and "j" from the respective columns
    ColumnType const i_elem = static_cast<ColumnType const*>(lhs)[i];
    ColumnType const j_elem = static_cast<ColumnType const*>(rhs)[j];

    // operator== is defined for wrapper structs such that it performs the
    // operator== on the underlying values. Therefore, the wrapper structs
    // can be used as if they were fundamental arithmetic types
    return i_elem == j_elem;
  }
};

The return type must be the same for all template instantiations of the functor's "operator()". Otherwise compilation fails, because the dispatcher would be returning different types from the same function.

NOTE: It is undefined behavior if an unsupported or invalid gdf_dtype is supplied.

Parameters
dtype  The gdf_dtype enum that determines which type will be dispatched
f  The functor with a templated "operator()" that will be invoked with the dispatched type
args  A parameter pack (i.e., an arbitrary number of arguments) that will be perfectly forwarded as the arguments of the functor's "operator()".
Returns
Whatever is returned by the functor's "operator()".
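
The dispatch mechanism described above can be illustrated without libcudf. The following is a minimal sketch, not the library's implementation: the names dtype_id, wrapper, date32, dispatch, size_of_functor and elements_are_equal are hypothetical stand-ins for gdf_dtype, detail::wrapper, cudf::date32 and cudf::type_dispatcher, and only four type ids are covered.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for gdf_dtype; not the libcudf enum.
enum class dtype_id { INT8, INT32, INT64, DATE32 };

// "Strong typedef": a distinct type that wraps one fundamental member named value.
template <typename T, dtype_id Id>
struct wrapper {
  T value;
};
using date32 = wrapper<std::int32_t, dtype_id::DATE32>;

// Operators forward to the underlying value so wrappers behave like fundamentals.
template <typename T, dtype_id Id>
bool operator==(wrapper<T, Id> const& a, wrapper<T, Id> const& b) {
  return a.value == b.value;
}

// Invokes f.operator()<Type>(args...) with Type selected by the runtime id.
// Every instantiation must return the same type, or return type deduction fails.
template <typename Functor, typename... Ts>
decltype(auto) dispatch(dtype_id id, Functor f, Ts&&... args) {
  switch (id) {
    case dtype_id::INT8:
      return f.template operator()<std::int8_t>(std::forward<Ts>(args)...);
    case dtype_id::INT32:
      return f.template operator()<std::int32_t>(std::forward<Ts>(args)...);
    case dtype_id::INT64:
      return f.template operator()<std::int64_t>(std::forward<Ts>(args)...);
    case dtype_id::DATE32:
      return f.template operator()<date32>(std::forward<Ts>(args)...);
  }
  // This sketch throws on an unknown id; the documented API leaves it undefined.
  throw std::logic_error("unsupported dtype_id");
}

// Returns the size in bytes of the dispatched type.
struct size_of_functor {
  template <typename T>
  std::size_t operator()() const { return sizeof(T); }
};

// Compares element i of lhs with element j of rhs after casting to the dispatched type.
struct elements_are_equal {
  template <typename ColumnType>
  bool operator()(void const* lhs, int i, void const* rhs, int j) const {
    return static_cast<ColumnType const*>(lhs)[i] == static_cast<ColumnType const*>(rhs)[j];
  }
};
```

Because date32 is a distinct type from std::int32_t, the DATE32 and INT32 branches instantiate the functor separately even though both store a 32-bit integer, which is what preserves the 1-to-1 mapping.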

Function Documentation

◆ column_dtypes()

std::vector< gdf_dtype > cudf::column_dtypes ( cudf::table const &  table)

Returns vector of the dtypes of the columns in a table.


Parameters
table  The table to get the column dtypes from
Returns
std::vector<gdf_dtype>

◆ gather()

void cudf::gather ( table const *  source_table,
gdf_index_type const  gather_map[],
table *  destination_table 
)

Gathers the rows (including null values) of a set of source columns into a set of destination columns.

The two sets of columns must have equal numbers of columns.

Gathers the rows of the source columns into the destination columns according to a gather map such that row "i" in the destination columns will contain row "gather_map[i]" from the source columns.

The datatypes of corresponding columns in the source and destination columns must be the same.

The number of elements in the gather_map must equal the number of rows in the destination columns.

If any index in the gather_map is outside the range [0, num rows in source_columns), the result is undefined.

If the same index appears more than once in gather_map, the result is undefined.

Parameters
[in]  source_table  The input columns whose rows will be gathered
[in]  gather_map  An array of indices that maps the rows in the source columns to rows in the destination columns.
[out]  destination_table  A preallocated set of columns with a number of rows equal in size to the number of elements in the gather_map that will contain the rearrangement of the source columns based on the mapping. Can be the same as source_table (in-place gather).

GDF_SUCCESS upon successful completion
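
The mapping rule (row "i" of the destination receives row "gather_map[i]" of the source) can be sketched on the host for a single column. This is an illustration of the semantics only, not libcudf code: gather_column is a hypothetical name, std::vector stands in for a device column, and null handling is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side illustration of gather for one column: destination[i] = source[gather_map[i]].
template <typename T>
void gather_column(std::vector<T> const& source,
                   std::vector<int> const& gather_map,
                   std::vector<T>& destination) {
  // The destination is preallocated with one row per gather_map entry.
  assert(destination.size() == gather_map.size());
  for (std::size_t i = 0; i < gather_map.size(); ++i) {
    int const src_row = gather_map[i];
    // The documented API leaves out-of-range indices undefined; this sketch asserts instead.
    assert(src_row >= 0 && static_cast<std::size_t>(src_row) < source.size());
    destination[i] = source[static_cast<std::size_t>(src_row)];
  }
}
```

For a source column {10, 20, 30, 40} and a gather map {3, 0, 2}, the destination becomes {40, 10, 30}.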

◆ gdf_dtype_of()

template<typename T >
constexpr gdf_dtype cudf::gdf_dtype_of ( )
inline

Maps a C++ type to its corresponding gdf_dtype.

When explicitly passed a template argument of a given type, returns the appropriate gdf_dtype for the specified C++ type.

For example:

return gdf_dtype_of<int32_t>(); // Returns GDF_INT32
return gdf_dtype_of<cudf::category>(); // Returns GDF_CATEGORY

Template Parameters
T  The type to map to a gdf_dtype
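
A compile-time mapping of this kind is typically built from a primary function template plus one explicit specialization per supported type, as the list of specializations above suggests. The sketch below uses a hypothetical enum dtype and function dtype_of; the behavior of the primary template (returning an invalid value) is an assumption made for the example, not something this page states about gdf_dtype_of.

```cpp
#include <cstdint>

// Hypothetical enum standing in for gdf_dtype.
enum dtype { DTYPE_INVALID, DTYPE_INT32, DTYPE_FLOAT64 };

// Primary template: in this sketch, unmapped types yield DTYPE_INVALID (an assumption).
template <typename T>
constexpr dtype dtype_of() { return DTYPE_INVALID; }

// One explicit specialization per supported C++ type.
template <>
constexpr dtype dtype_of<std::int32_t>() { return DTYPE_INT32; }

template <>
constexpr dtype dtype_of<double>() { return DTYPE_FLOAT64; }

// Because the functions are constexpr, the mapping can be checked at compile time.
static_assert(dtype_of<std::int32_t>() == DTYPE_INT32, "int32_t maps to DTYPE_INT32");
```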

◆ genericAtomicOperation()

template<typename T , typename BinaryOp >
__forceinline__ __device__ T cudf::genericAtomicOperation ( T *  address,
T const &  update_value,
BinaryOp  op 
)

Reads the old value located at the address in global or shared memory, computes 'BinaryOp'('old', 'update_value'), and stores the result back to memory at the same address. These three operations are performed in one atomic transaction.

The supported cudf types for genericAtomicOperation are: int8_t, int16_t, int32_t, int64_t, float, double, cudf::date32, cudf::date64, cudf::timestamp, cudf::category.

Parameters
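[in]  address  The address of the old value in global or shared memory
[in]  update_value  The value to combine with the old value
[in]  op  The binary operator applied to the old value and update_value

Returns
The old value at address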
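
A read-modify-write with an arbitrary binary operator is commonly made atomic with a compare-and-swap loop. This page does not say how libcudf implements the device function, so the following is only a host-side analogue built on std::atomic; generic_atomic_operation is a hypothetical name.

```cpp
#include <atomic>

// Host-side analogue of an atomic read-modify-write with a user-supplied operator.
// Returns the value that was stored before the update.
template <typename T, typename BinaryOp>
T generic_atomic_operation(std::atomic<T>& target, T const& update_value, BinaryOp op) {
  T old_value = target.load();
  // compare_exchange_weak stores op(old_value, update_value) only if target still
  // holds old_value. On failure it reloads old_value and the loop retries.
  while (!target.compare_exchange_weak(old_value, op(old_value, update_value))) {
  }
  return old_value;
}
```

With a target holding 5, an update value of 7 and a "max" operator, the call returns 5 and leaves 7 in the target.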

◆ has_nulls()

bool cudf::has_nulls ( cudf::table const &  table)

Indicates if a table contains any null values.


Parameters
table  The table to check for null values
Returns
true If the table contains one or more null values

false If the table contains zero null values

◆ reduction()

gdf_scalar cudf::reduction ( const gdf_column *  col,
gdf_reduction_op  op,
gdf_dtype  output_dtype 
)

Computes the reduction of the values in all rows of a column. This function does not detect overflows in reductions. Using a higher precision dtype may prevent overflow. Only min and max ops are supported for reduction of non-arithmetic types (date32, timestamp, category...). The null values are skipped for the operation. If the column is empty, the member is_valid of the output gdf_scalar will contain false.


Parameters
[in]  col  Input column
[in]  op  The operator applied by the reduction
[in]  output_dtype  The computation and output precision. output_dtype must be a data type that is convertible from the input dtype. If the input column has arithmetic type, any arithmetic type can be specified. If the input column has non-arithmetic type (date32, timestamp, category...), the same type must be specified.
Returns
gdf_scalar containing the result value. If the reduction fails, the member is_valid of the output gdf_scalar will contain false.
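
The interaction between null skipping, the output precision, and the is_valid flag can be sketched on the host for a sum. The names scalar_result and sum_reduction are hypothetical, a std::vector<bool> stands in for the column's validity mask, and the result for a non-empty column whose rows are all null is an assumption of this sketch (this page only specifies the empty-column case).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type mirroring the documented is_valid member.
template <typename Out>
struct scalar_result {
  Out value;
  bool is_valid;
};

// Sums the non-null rows of a column, accumulating in the wider type Out.
template <typename Out, typename In>
scalar_result<Out> sum_reduction(std::vector<In> const& data, std::vector<bool> const& valid) {
  // An empty column yields an invalid result.
  if (data.empty()) { return {Out{0}, false}; }
  Out total{0};
  for (std::size_t i = 0; i < data.size(); ++i) {
    // Null rows are skipped rather than treated as zero or propagated.
    if (valid[i]) { total += static_cast<Out>(data[i]); }
  }
  // Assumption: a non-empty column always produces a valid result in this sketch.
  return {total, true};
}
```

Summing the int8_t values {100, null, 100} with Out = int32_t gives 200, which would not fit in the input type. This is the overflow case that a higher precision output dtype avoids.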

◆ row_bitmask()

rmm::device_vector< bit_mask::bit_mask_t > cudf::row_bitmask ( cudf::table const &  table,
cudaStream_t  stream = 0 
)

Computes a bitmask indicating the presence of NULL values in rows of a table.

If a row i in table contains one or more NULL values, then bit i in the returned bitmask will be 0.

Otherwise, bit i will be 1.

Parameters
table  The table to compute the row bitmask of.
Returns
rmm::device_vector< bit_mask::bit_mask_t > The bitmask indicating the presence of NULLs in each row
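
Since bit i is 1 only when every column is non-null at row i, the row bitmask is the logical AND of the per-column validity masks. The host-side sketch below shows that rule; row_bitmask_sketch is a hypothetical name, and std::vector<bool> replaces the packed bit_mask_t words held in device memory.

```cpp
#include <cstddef>
#include <vector>

// columns_valid[c][r] is true when row r of column c is non-null (hypothetical layout).
// Returns a mask whose entry r is true only if row r is non-null in every column.
std::vector<bool> row_bitmask_sketch(std::vector<std::vector<bool>> const& columns_valid,
                                     std::size_t num_rows) {
  // Start with every row marked valid, then clear rows that contain a null.
  std::vector<bool> mask(num_rows, true);
  for (auto const& column : columns_valid) {
    for (std::size_t r = 0; r < num_rows; ++r) {
      mask[r] = mask[r] && column[r];
    }
  }
  return mask;
}
```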

◆ scan()

void cudf::scan ( const gdf_column *  input,
gdf_column *  output,
gdf_scan_op  op,
bool  inclusive 
)

Computes the scan (a.k.a. prefix sum) of a column. The null values are skipped for the operation, and if an input element at i is null, then the output element at i will also be null.


Parameters
[in]  input  The input column for the scan
[out]  output  The pre-allocated output column
[in]  op  The operation of the scan
[in]  inclusive  The flag for applying an inclusive scan if true, an exclusive scan if false.
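
The difference between the inclusive and exclusive variants, and the way null rows are skipped, can be sketched for a sum scan on the host. The name sum_scan is hypothetical, a std::vector<bool> stands in for the validity mask, and this sketch leaves the output untouched at null positions, since the output row is null there as well.

```cpp
#include <cstddef>
#include <vector>

// Prefix sum over the non-null rows of input.
// inclusive == true : output[i] includes input[i].
// inclusive == false: output[i] is the sum of the valid rows before i.
template <typename T>
void sum_scan(std::vector<T> const& input, std::vector<bool> const& valid,
              std::vector<T>& output, bool inclusive) {
  T running{0};
  for (std::size_t i = 0; i < input.size(); ++i) {
    // A null input row does not change the running total and is not written.
    if (!valid[i]) { continue; }
    if (inclusive) {
      running += input[i];
      output[i] = running;
    } else {
      output[i] = running;
      running += input[i];
    }
  }
}
```

For the input {1, null, 3, 4}, the inclusive scan writes 1, 4, 8 at the three valid positions and the exclusive scan writes 0, 1, 4.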

◆ scatter()

void cudf::scatter ( table const *  source_table,
gdf_index_type const  scatter_map[],
table *  destination_table 
)

Scatters the rows (including null values) of a set of source columns into a set of destination columns.

The two sets of columns must have equal numbers of columns.

Scatters the rows of the source columns into the destination columns according to a scatter map such that row "i" from the source columns will be scattered to row "scatter_map[i]" in the destination columns.

The datatypes of corresponding columns in the source and destination columns must be the same.

The number of elements in the scatter_map must equal the number of rows in the source columns.

If any index in scatter_map is outside the range of [0, num rows in destination_columns), the result is undefined.

If the same index appears more than once in scatter_map, the result is undefined.

Parameters
[in]  source_table  The columns whose rows will be scattered
[in]  scatter_map  An array that maps rows in the input columns to rows in the output columns.
[out]  destination_table  A preallocated set of columns with a number of rows equal in size to the maximum index contained in scatter_map

GDF_SUCCESS upon successful completion
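
Scatter is the mirror image of gather: row "i" of the source is written to row "scatter_map[i]" of the destination. The host-side sketch below illustrates that rule for a single column; scatter_column is a hypothetical name, std::vector stands in for a device column, and null handling is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side illustration of scatter for one column: destination[scatter_map[i]] = source[i].
template <typename T>
void scatter_column(std::vector<T> const& source,
                    std::vector<int> const& scatter_map,
                    std::vector<T>& destination) {
  // The scatter map has one entry per source row.
  assert(scatter_map.size() == source.size());
  for (std::size_t i = 0; i < source.size(); ++i) {
    int const dst_row = scatter_map[i];
    // The documented API leaves out-of-range indices undefined; this sketch asserts instead.
    assert(dst_row >= 0 && static_cast<std::size_t>(dst_row) < destination.size());
    destination[static_cast<std::size_t>(dst_row)] = source[i];
  }
}
```

For a source column {10, 20, 30} and a scatter map {2, 0, 1}, the destination becomes {20, 30, 10}. If two entries of the map named the same row, the surviving value would depend on write order, which is why the documented result is undefined in that case.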