Create a new cudf.DataFrame
Create a new cudf.DataFrame
Create a new cudf.DataFrame
The names of columns in this DataFrame
The number of columns in this DataFrame
The number of rows in each column of this DataFrame
A map of this DataFrame's Series names to their DataTypes
Casts each selected Series in this DataFrame to a new dtype (similar to static_cast
in C++).
The map from column names to new dtypes.
The optional MemoryResource used to allocate the result Series's device memory.
DataFrame of Series cast to the new dtype
Casts all the Series in this DataFrame to a new dtype (similar to static_cast
in C++).
The new dtype.
The optional MemoryResource used to allocate the result Series's device memory.
DataFrame of Series cast to the new dtype
const df = new DataFrame({
  a: Series.new({type: new Int32, data: [0, 1, 1, 2, 2, 2]}),
  b: Series.new({type: new Int32, data: [0, 1, 2, 3, 4, 4]})
});
df.castAll(new Float32); // returns df with a and b as Float32Series
Concat DataFrame(s) to the end of the caller, returning a new DataFrame.
The DataFrame(s) to concat to the end of the caller.
Return a new DataFrame with specified columns removed.
Names of the columns to drop.
Drops duplicate rows from a DataFrame
Determines whether to keep the first, last, or none of the duplicate items.
Determines whether nulls are handled as equal values.
Determines whether null values are inserted before or after non-null values.
List of columns to consider when dropping rows (all columns are considered by default).
Memory resource used to allocate the result Column's device memory.
a DataFrame without duplicate rows
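The effect of the keep option can be illustrated with a small host-side model. This is a sketch on plain TypeScript arrays, not the cudf implementation; the function name `keptRowIndices`, the option strings 'first' | 'last' | 'none', and the idea of summarising each row as a string key are assumptions made for illustration.

```typescript
// Host-side model of the `keep` option; not the cudf implementation.
// The option strings below are assumed for illustration.
type KeepOption = 'first' | 'last' | 'none';

// Returns the indices of the rows that survive de-duplication.
// Each row is represented by a string key built from the considered columns.
function keptRowIndices(rowKeys: string[], keep: KeepOption): number[] {
  const first: {[key: string]: number} = {};
  const last: {[key: string]: number} = {};
  const count: {[key: string]: number} = {};
  rowKeys.forEach((key, i) => {
    if (first[key] === undefined) { first[key] = i; }
    last[key] = i;
    count[key] = (count[key] || 0) + 1;
  });
  const kept: number[] = [];
  rowKeys.forEach((key, i) => {
    if (keep === 'none') {
      // Drop every row whose key occurs more than once.
      if (count[key] === 1) { kept.push(i); }
    } else if (keep === 'first') {
      if (first[key] === i) { kept.push(i); }
    } else {
      if (last[key] === i) { kept.push(i); }
    }
  });
  return kept;
}

const duplicateKeys = ['a', 'b', 'a', 'c', 'b'];
const keepFirst = keptRowIndices(duplicateKeys, 'first'); // [0, 1, 3]
const keepLast = keptRowIndices(duplicateKeys, 'last');   // [2, 3, 4]
const keepNone = keptRowIndices(duplicateKeys, 'none');   // [3]
```

The model says nothing about null handling or about the row order of the real result; it only shows which occurrences each keep setting retains.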
Drops rows (or columns) containing NaN, provided the columns are of type float
Whether to drop rows (axis=0, default) or columns (axis=1) containing NaN
Drops every row (or column) containing fewer than thresh non-NaN values.
thresh=1 (default) drops rows (or columns) in which every value is NaN (non-NaN count < 1).
If axis=0, thresh=df.numColumns drops rows containing at least one NaN value (non-NaN values in a row < df.numColumns).
If axis=1, thresh=df.numRows drops columns containing at least one NaN value (non-NaN values in a column < df.numRows).
List of float columns to consider when dropping rows (all float columns are
considered by default).
Alternatively, when dropping columns, subset is a Series
DataFrame
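The thresh rule for axis=0 can be checked with a short model on plain number arrays. This is a hedged sketch of the rule as described above, not the library's code; `rowsKeptByThresh` is a hypothetical helper and the columns are assumed to have equal length.

```typescript
// Plain-array model of the documented `thresh` rule for axis = 0 (rows).
// A row is kept when it has at least `thresh` non-NaN values.
function rowsKeptByThresh(columns: number[][], thresh: number): number[] {
  const numRows = columns.length > 0 ? columns[0].length : 0;
  const kept: number[] = [];
  for (let row = 0; row < numRows; ++row) {
    let nonNaN = 0;
    for (const column of columns) {
      if (!isNaN(column[row])) { ++nonNaN; }
    }
    if (nonNaN >= thresh) { kept.push(row); }
  }
  return kept;
}

// Two float columns with four rows each.
const nanColumns = [
  [1, NaN, NaN, 4],
  [5, 6, NaN, NaN],
];
// thresh = 1 drops only the all-NaN row (row 2).
const keptDefault = rowsKeptByThresh(nanColumns, 1);                // [0, 1, 3]
// thresh = number of columns drops every row with at least one NaN.
const keptStrict = rowsKeptByThresh(nanColumns, nanColumns.length); // [0]
```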
Drops rows (or columns) containing nulls (Note: only null values are dropped, not NaNs)
Whether to drop rows (axis=0, default) or columns (axis=1) containing nulls
Drops every row (or column) containing fewer than thresh non-null values.
thresh=1 (default) drops rows (or columns) in which every value is null (non-null count < 1).
If axis=0, thresh=df.numColumns drops rows containing at least one null value (non-null values in a row < df.numColumns).
If axis=1, thresh=df.numRows drops columns containing at least one null value (non-null values in a column < df.numRows).
List of columns to consider when dropping rows (all columns are considered by
default).
Alternatively, when dropping columns, subset is a Series
DataFrame
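For axis=1 the same counting is done per column instead of per row. The sketch below models that on plain arrays, with `null` standing in for a cudf null; `columnsKeptByThresh` and the column names are hypothetical. It also shows the note above in action: NaN counts as a non-null value.

```typescript
type NullableColumn = Array<number | null>;

// Names of the columns that survive: those with at least `thresh` non-null values.
function columnsKeptByThresh(columns: {[name: string]: NullableColumn},
                             thresh: number): string[] {
  return Object.keys(columns).filter((name) => {
    const nonNull = columns[name].filter((value) => value !== null).length;
    return nonNull >= thresh;
  });
}

const nullableColumns: {[name: string]: NullableColumn} = {
  a: [1, 2, 3],
  b: [null, 5, null],
  c: [null, null, null],
  d: [NaN, NaN, NaN],  // NaN is not null, so this column is never dropped here
};
// thresh = 1 drops only the all-null column.
const keptAtLeastOne = columnsKeptByThresh(nullableColumns, 1); // ['a', 'b', 'd']
// thresh = number of rows drops every column with at least one null.
const keptFull = columnsKeptByThresh(nullableColumns, 3);       // ['a', 'd']
```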
Return a sub-selection of a DataFrame using the specified boolean mask.
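As a rough mental model, mask-based selection keeps row i exactly when mask[i] is true, applied to every column alike. The sketch below shows this for one column on plain arrays; `filterByMask` is a hypothetical helper, and the requirement that the mask and column have equal length is an assumption.

```typescript
// Keeps values[i] where mask[i] is true (host-side model of one column).
function filterByMask<T>(values: T[], mask: boolean[]): T[] {
  if (values.length !== mask.length) {
    // Assumption: mask and column must have the same length.
    throw new Error('mask length must match the number of rows');
  }
  return values.filter((_value, i) => mask[i]);
}

const maskedValues = filterByMask([10, 20, 30, 40], [true, false, false, true]); // [10, 40]
```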
Names of List Columns to flatten. Defaults to all list Columns.
An optional MemoryResource used to allocate the result's device memory.
Names of List Columns to flatten. Defaults to all list Columns.
An optional MemoryResource used to allocate the result's device memory.
A Series of 8/16/32-bit signed or unsigned integer indices to gather.
If true, coerce rows that correspond to out-of-bounds indices in the selection to null. If false, skip all bounds checking for selection values. Pass false for better performance if you are certain that the selection contains only valid indices. If false and there are out-of-bounds indices in the selection, the behavior is undefined. Defaults to false.
An optional MemoryResource used to allocate the result's device memory.
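The nullifying behavior can be pictured with a host-side model of a single column. This is a sketch only: `gatherWithNullify` is a hypothetical name, `null` stands in for a cudf null, and treating negative indices as out of bounds is an assumption, since the text above does not say how they are handled. The unchecked mode (false) is not modeled because its out-of-bounds behavior is undefined.

```typescript
// Models gathering with out-of-bounds indices coerced to null.
function gatherWithNullify<T>(values: T[], indices: number[]): Array<T | null> {
  return indices.map((i) => (i >= 0 && i < values.length ? values[i] : null));
}

// Index 5 is past the end of a 3-element column, so that row becomes null.
const gathered = gatherWithNullify(['a', 'b', 'c'], [2, 0, 5]); // ['c', 'a', null]
```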
Return a series by name.
Name of the Series to return.
Return a group-by on a single column.
configuration for the groupby
Return a group-by on multiple columns.
configuration for the groupby
Return whether the DataFrame has a Series.
Name of the Series to look for.
Returns the first n rows as a new DataFrame.
The number of rows to return.
The dtype of the result Series (required if the DataFrame has mixed dtypes).
An optional MemoryResource used to allocate the result's device memory.
Series representing a packed row-major matrix of all the source DataFrame's Series.
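"Packed row-major" means the values of row 0 come first (one per column, in column order), then the values of row 1, and so on. A minimal sketch of that layout on plain arrays, assuming equal-length numeric columns; `interleaveRowMajor` is a hypothetical helper, not the library call.

```typescript
// Packs equal-length columns into one row-major array.
function interleaveRowMajor(columns: number[][]): number[] {
  const numRows = columns.length > 0 ? columns[0].length : 0;
  const packed: number[] = [];
  for (let row = 0; row < numRows; ++row) {
    for (const column of columns) {
      packed.push(column[row]);
    }
  }
  return packed;
}

// Two columns of three rows become a 3x2 matrix stored row by row.
const packedMatrix = interleaveRowMajor([[1, 2, 3], [10, 20, 30]]); // [1, 10, 2, 20, 3, 30]
```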
Creates a DataFrame replacing any FloatSeries with a Bool8Series where true indicates the value is NaN and false indicates the value is valid.
a DataFrame replacing instances of FloatSeries with a Bool8Series where true indicates the value is NaN
Creates a DataFrame replacing any FloatSeries with a Bool8Series where false indicates the value is NaN and true indicates the value is valid.
a DataFrame replacing instances of FloatSeries with a Bool8Series where false indicates the value is NaN
Creates a DataFrame of BOOL8 Series where true indicates the value is null and false indicates the value is valid.
a DataFrame containing Series of 'BOOL8' where 'true' indicates the value is null
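The three masks described above differ only in what they flag. The sketch below models them on plain arrays, with `null` standing in for a cudf null; the helper names are hypothetical. The NaN masks are applied to null-free input here because the text does not state how they treat nulls.

```typescript
// true where the value is NaN.
function nanMask(values: number[]): boolean[] {
  return values.map((value) => isNaN(value));
}

// true where the value is not NaN (the inverse of nanMask).
function notNanMask(values: number[]): boolean[] {
  return values.map((value) => !isNaN(value));
}

// true where the value is null; NaN is a valid (non-null) value.
function nullMask(values: Array<number | null>): boolean[] {
  return values.map((value) => value === null);
}

const nanFlags = nanMask([1, NaN, 3]);       // [false, true, false]
const notNanFlags = notNanMask([1, NaN, 3]); // [true, false, true]
const nullFlags = nullMask([1, NaN, null]);  // [false, false, true]
```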
Join columns with another DataFrame.
the configuration for the join
the joined DataFrame
Join columns with another DataFrame.
the configuration for the join
the joined DataFrame
Return a Series containing the unbiased kurtosis result for each Series in the DataFrame.
Exclude NA/null values. If an entire row/column is NA, the result will be NA.
A Series containing the unbiased kurtosis result for all Series in the DataFrame
Convert NaNs (if any) to nulls.
List of float columns to consider to replace NaNs with nulls.
DataFrame
Generate an ordering that sorts DataFrame columns in a specified way
mapping of column names to sort order specifications
An optional MemoryResource used to allocate the result's device memory.
Series containing the permutation indices for the desired sort order
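The result is a permutation of row indices, not sorted data: gathering the rows in that order yields the sorted frame. A hedged sketch of the idea on plain arrays follows; the `SortKey` shape and its `ascending` field are assumptions for illustration, and null ordering is not modeled.

```typescript
// One sort key per column, in priority order (hypothetical shape).
interface SortKey {
  values: number[];
  ascending: boolean;
}

// Returns row indices ordered by the first key, ties broken by later keys.
function sortPermutation(keys: SortKey[]): number[] {
  const numRows = keys.length > 0 ? keys[0].values.length : 0;
  const indices: number[] = [];
  for (let i = 0; i < numRows; ++i) { indices.push(i); }
  return indices.sort((lhs, rhs) => {
    for (const key of keys) {
      const diff = key.values[lhs] - key.values[rhs];
      if (diff !== 0) { return key.ascending ? diff : -diff; }
    }
    return lhs - rhs; // fully tied rows keep their original order
  });
}

// Sort by the first column ascending, then by the second column descending.
const sortOrder = sortPermutation([
  {values: [2, 1, 2, 1], ascending: true},
  {values: [1, 2, 3, 4], ascending: false},
]); // [3, 1, 2, 0]
```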
Return a new DataFrame with specified columns renamed.
Object mapping old to new Column names.
Replace null values with a value.
The scalar value to use in place of nulls.
The optional MemoryResource used to allocate the result Column's device memory.
Replace null values with the corresponding elements from another Map of Series.
The map of Series to use in place of nulls.
The optional MemoryResource used to allocate the result Column's device memory.
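The two replacement forms differ in where the fill value comes from: a single scalar for every null, or the element at the same position in another Series. A minimal sketch for one column on plain arrays, with `null` standing in for a cudf null; both helper names are hypothetical, and the replacement array is assumed to be null-free and of equal length.

```typescript
// Every null is replaced by the same scalar.
function fillNullsWithScalar(values: Array<number | null>, fill: number): number[] {
  return values.map((value) => (value === null ? fill : value));
}

// Each null is replaced by the element at the same position in `replacements`.
function fillNullsFrom(values: Array<number | null>, replacements: number[]): number[] {
  return values.map((value, i) => (value === null ? replacements[i] : value));
}

const filledScalar = fillNullsWithScalar([1, null, 3, null], 0);                // [1, 0, 3, 0]
const filledElementwise = fillNullsFrom([1, null, 3, null], [10, 20, 30, 40]);  // [1, 20, 3, 40]
```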
Return a new DataFrame containing only specified columns.
Return a Series containing the unbiased skew result for each Series in the DataFrame.
Exclude NA/null values. If an entire row/column is NA, the result will be NA.
A Series containing the unbiased skew result for all Series in the DataFrame
Compute the sum for all Series in the DataFrame.
List of columns to select (all columns are considered by default).
If skipNulls is true, NA and null values are dropped before computing the reduction; if false, the reduction is computed directly.
Memory resource used to allocate the result Column's device memory.
A Series containing the sum of all values for each Series
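With skipNulls set to true, each column is summed after its missing values are removed, giving one result per column. The sketch below models that case only, on plain arrays; `sumSkippingMissing` is a hypothetical helper, and reading "NA" as NaN is an interpretation of the text above. The skipNulls=false case is left out because the text does not state its result when missing values are present.

```typescript
// Sums one column, ignoring null and (by assumption) NaN entries.
function sumSkippingMissing(values: Array<number | null>): number {
  let total = 0;
  for (const value of values) {
    if (value !== null && !isNaN(value)) { total += value; }
  }
  return total;
}

// One sum per column.
const sumInput: Array<Array<number | null>> = [
  [1, null, 3],
  [NaN, 2, 2],
];
const columnSums = sumInput.map((column) => sumSkippingMissing(column)); // [4, 4]
```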
Returns the last n rows as a new DataFrame.
The number of rows to return.
Copy a Series to an Arrow vector in host memory
Serialize this DataFrame to CSV format.
Options controlling CSV writing behavior.
A node ReadableStream of the CSV data.
Write a DataFrame to ORC format.
File path or root directory path.
Options controlling ORC writing behavior.
Write a DataFrame to Parquet format.
File path or root directory path.
Options controlling Parquet writing behavior.
Return a string with a tabular representation of the DataFrame, pretty-printed according to the options given.
Read a CSV file from disk and create a cudf.DataFrame
Read a CSV file from disk and create a cudf.DataFrame
Read Apache ORC files from disk and create a cudf.DataFrame
Read Apache ORC files from disk and create a cudf.DataFrame
Read Apache Parquet files from disk and create a cudf.DataFrame
Read Apache Parquet files from disk and create a cudf.DataFrame
A GPU Dataframe object.