Column functions

legate_dataframe.lib.unaryop.unary_operation(LogicalColumn col, unary_operator op) → LogicalColumn

Performs a unary operation on all values in the column.

Note: For decimal32 and decimal64, only ABS, CEIL and FLOOR are supported.

Parameters:
  • col – Logical column as input

  • op – Operation to perform, see unary_operator.

Returns:

Logical column of the same size as col containing the result of the operation.
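The element-wise behavior and the decimal restriction described for unary_operation can be modeled in plain Python. This is only a sketch of the documented semantics, not the library's implementation: columns are modeled as Python lists, operators as string labels, and the name unary_operation_sketch, the "SQRT" label, and the choice of ValueError for the unsupported case are all assumptions made for illustration.

```python
import math
from decimal import Decimal

# Illustrative operator table. ABS, CEIL and FLOOR are named in the text above;
# SQRT is included only to show an operator outside the decimal-safe set.
_OPS = {"ABS": abs, "CEIL": math.ceil, "FLOOR": math.floor, "SQRT": math.sqrt}
_DECIMAL_SAFE = {"ABS", "CEIL", "FLOOR"}


def unary_operation_sketch(values, op):
    """Apply `op` to every value and return a list of the same length."""
    has_decimal = any(isinstance(v, Decimal) for v in values)
    if has_decimal and op not in _DECIMAL_SAFE:
        # Assumption: the real error type is not stated in the documentation.
        raise ValueError(f"{op} is not supported for decimal values")
    fn = _OPS[op]
    return [fn(v) for v in values]


floats = [-1.5, 2.25]
print(unary_operation_sketch(floats, "ABS"))
print(unary_operation_sketch(floats, "CEIL"))
print(unary_operation_sketch([Decimal("1.25")], "FLOOR"))
```

The result always has one output per input value, which mirrors the "same size as col" guarantee.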

legate_dataframe.lib.binaryop.binary_operation(lhs: LogicalColumn | ScalarLike, rhs: LogicalColumn | ScalarLike, binary_operator op, output_type: DTypeLike) → LogicalColumn

Performs a binary operation between two columns or a column and a scalar.

The output contains the result of op(lhs[i], rhs[i]) for all 0 <= i < lhs.size() where lhs[i] or rhs[i] (but not both) can be replaced with a scalar value.

Regardless of the operator, the validity of the output value is the logical AND of the validity of the two operands except for NullMin and NullMax (logical OR).
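The scalar broadcasting and validity rules above can be illustrated with a small pure-Python model. It is a sketch under stated assumptions rather than the library's code: columns are lists, None stands for a null entry, and the operator labels ("Add", "Mul", "NullMin", "NullMax") and the name binary_operation_sketch are hypothetical. For NullMin and NullMax the model returns the non-null operand when exactly one side is null, which is the usual cudf behavior and is assumed to carry over here. Inputs are assumed to be valid (at least one column, matching sizes).

```python
import operator

# Illustrative operator labels, not the library's binary_operator enum.
_ORDINARY = {"Add": operator.add, "Mul": operator.mul}
_NULL_AWARE = {"NullMin": min, "NullMax": max}


def binary_operation_sketch(lhs, rhs, op):
    """Model columns as lists (None = null) and scalars as plain values."""
    lhs_is_col = isinstance(lhs, list)
    rhs_is_col = isinstance(rhs, list)
    size = len(lhs) if lhs_is_col else len(rhs)
    # Broadcast a scalar operand to the size of the column operand.
    left = lhs if lhs_is_col else [lhs] * size
    right = rhs if rhs_is_col else [rhs] * size

    out = []
    for a, b in zip(left, right):
        if op in _NULL_AWARE:
            # Validity is the logical OR: valid if either operand is valid.
            present = [v for v in (a, b) if v is not None]
            out.append(_NULL_AWARE[op](present) if present else None)
        else:
            # Validity is the logical AND: null if either operand is null.
            out.append(None if a is None or b is None else _ORDINARY[op](a, b))
    return out


print(binary_operation_sketch([1, None, 3], 10, "Add"))
print(binary_operation_sketch([1, None, None], [5, 2, None], "NullMax"))
```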

Parameters:
  • lhs – The left operand

  • rhs – The right operand

  • op – The binary operator, see binary_operator.

  • output_type – The desired data type of the output column

Returns:

Output column of output_type type containing the result of the binary operation

Raises:
  • ValueError – if lhs and rhs are both scalars

  • RuntimeError – if lhs and rhs are different sizes

  • RuntimeError – if output_type dtype isn’t boolean for comparison and logical operations.

  • RuntimeError – if output_type dtype isn’t fixed-width

  • RuntimeError – if the operation is not supported for the types of lhs and rhs
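The documented error conditions can be read as a set of preconditions on the arguments. The following sketch encodes some of them in plain Python to make the contract concrete. Everything in it is hypothetical: the function name check_binary_args, the operator labels, and the dtype strings are illustrative stand-ins, and the order in which the real library performs its checks is not stated in the documentation. The final condition (unsupported operand types) is omitted because it depends on the library's type support.

```python
# Illustrative labels only; these are not the library's enums or dtypes.
_COMPARISON_OR_LOGICAL = {"Equal", "Less", "LogicalAnd"}
_FIXED_WIDTH = {"bool", "int32", "int64", "float32", "float64"}


def check_binary_args(lhs, rhs, op, output_type):
    """Raise the documented error type if a precondition is violated."""
    lhs_is_col = isinstance(lhs, list)
    rhs_is_col = isinstance(rhs, list)
    if not (lhs_is_col or rhs_is_col):
        raise ValueError("lhs and rhs cannot both be scalars")
    if lhs_is_col and rhs_is_col and len(lhs) != len(rhs):
        raise RuntimeError("lhs and rhs must have the same size")
    if output_type not in _FIXED_WIDTH:
        raise RuntimeError("output_type must be a fixed-width type")
    if op in _COMPARISON_OR_LOGICAL and output_type != "bool":
        raise RuntimeError("comparison and logical operations need a boolean output_type")


# A column compared with a scalar, producing booleans: passes all checks.
check_binary_args([1, 2, 3], 2, "Less", "bool")
```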

legate_dataframe.lib.timestamps.to_timestamps(LogicalColumn col, timestamp_type: DTypeLike, unicode format_pattern: str) → LogicalColumn

Converts a strings column into timestamps using the provided format pattern.

The format pattern can include the following specifiers: %Y, %y, %m, %d, %H, %I, %p, %M, %S, %f, %z.

Please see to_timestamps() for details.

Warning

Invalid formats are not checked; the format pattern must be well defined as per the C++ API.

Parameters:
  • col – Strings column to convert

  • timestamp_type – The timestamp type used for creating the output column

  • format_pattern – String specifying the timestamp format of the input strings

Returns:

New datetime column

Raises:

RuntimeError – if timestamp_type is not a timestamp type.
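The specifiers accepted by to_timestamps follow the familiar strptime conventions, so Python's standard datetime.strptime can serve as a rough analogue for how a format pattern lines up with the input strings. This is an illustration only: the helper name to_timestamps_sketch is hypothetical, it operates on a list of Python strings rather than a LogicalColumn, and the library's parsing may differ in details (for example, how many fractional digits %f consumes). Unlike the library, strptime rejects strings that do not match the pattern.

```python
from datetime import datetime, timezone


def to_timestamps_sketch(strings, format_pattern):
    """Parse each string with the same pattern, one datetime per input."""
    return [datetime.strptime(s, format_pattern) for s in strings]


# 24-hour clock with fractional seconds and a UTC offset.
iso_like = to_timestamps_sketch(
    ["2023-07-14 21:30:05.250000 +0000"], "%Y-%m-%d %H:%M:%S.%f %z"
)
assert iso_like[0] == datetime(2023, 7, 14, 21, 30, 5, 250000, tzinfo=timezone.utc)

# 12-hour clock: %I pairs with %p, and %y is a two-digit year.
us_style = to_timestamps_sketch(["07/14/23 09:30 PM"], "%m/%d/%y %I:%M %p")
assert us_style[0] == datetime(2023, 7, 14, 21, 30)
```

Checking a pattern against a few sample strings this way can help catch mistakes before passing it to an API that does not validate formats.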

legate_dataframe.lib.replace.replace_nulls(LogicalColumn col, replacement: ScalarLike) → LogicalColumn

Returns a new column with NULL entries replaced by the replacement value.

Parameters:
  • col – Operand column

  • replacement – Value to replace NULLs with (currently limited to scalars).

Returns:

Output column without NULL entries.

Raises:

ValueError – if replacement is not of the correct scalar type.
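The behavior of replace_nulls can be sketched in plain Python, with a list standing in for the column and None for a NULL entry. The name replace_nulls_sketch is hypothetical, and the type check is a simplified stand-in for whatever the library considers "the correct scalar type": here the replacement must simply have the same Python type as the non-null values.

```python
def replace_nulls_sketch(col, replacement):
    """Return a new list with every None replaced by `replacement`."""
    value_types = {type(v) for v in col if v is not None}
    # Simplified assumption: the scalar must match the column's value type.
    if value_types and type(replacement) not in value_types:
        raise ValueError("replacement is not of the correct scalar type")
    return [replacement if v is None else v for v in col]


original = [1, None, 3, None]
filled = replace_nulls_sketch(original, 0)
print(filled)
```

The input list is left unchanged, matching the documented behavior of returning a new column.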