Class StringSeries

A Series of utf8-string values in GPU memory.

Accessors

data

  • Series containing the utf8 characters of each string

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(["foo", "bar"]);

    a.data // Uint8Array(6) [ 102, 111, 111, 98, 97, 114 ]

    Returns Uint8Series

hasNulls

  • get hasNulls(): boolean

length

  • get length(): number

mask

  • get mask(): DeviceBuffer

nullCount

  • get nullCount(): number

nullable

  • get nullable(): boolean

numChildren

  • get numChildren(): number

offset

  • get offset(): number

offsets

type

  • get type(): T

Methods

[iterator]

  • [iterator](): IterableIterator<null | string>
  • Copy the underlying device memory to host, and return an Iterator of the values.

    Returns IterableIterator<null | string>

_castAsBool8

  • _castAsBool8(_memoryResource?: MemoryResource): Bool8Series

_castAsCategorical

  • _castAsCategorical<R>(type: R, memoryResource?: MemoryResource): Series<R>

byteCount

  • byteCount(memoryResource?: MemoryResource): Int32Series
  • Returns an Int32 series containing the number of bytes of each string in the Series.

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(['Hello', 'Bye', 'Thanks 😊', null]);

    a.byteCount() // [5, 3, 11, null]

    Parameters

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series's device memory.

    Returns Int32Series
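
    As a host-side illustration of why a byte count can differ from a character count, the sketch below measures plain JavaScript strings. It does not touch the GPU or @rapidsai/cudf; utf8ByteCount and codePointCount are hypothetical helper names, and counting characters as Unicode code points is an assumption.

```typescript
// Host-side sketch only: plain strings, no device memory involved.
function utf8ByteCount(value: string | null): number | null {
  // TextEncoder always encodes to UTF-8.
  return value === null ? null : new TextEncoder().encode(value).length;
}

function codePointCount(value: string | null): number | null {
  // Array.from iterates a string by Unicode code point, not UTF-16 code unit.
  return value === null ? null : Array.from(value).length;
}

const sample: (string | null)[] = ['Hello', 'Bye', 'Thanks 😊', null];
const byteCounts = sample.map(utf8ByteCount);  // [5, 3, 11, null]
const charCounts = sample.map(codePointCount); // [5, 3, 8, null]
```

    The emoji occupies four bytes in UTF-8 but is a single code point, which is why the third entry differs.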

cast

  • cast<R>(dataType: R, memoryResource?: MemoryResource): Series<R>
  • Casts the values to a new dtype (similar to static_cast in C++).

    example
    import {Series, Bool8, Int32} from '@rapidsai/cudf';

    const a = Series.new({type:new Int32, data: [1,0,1,0]});

    a.cast(new Bool8); // Bool8Series [true, false, true, false];

    Type parameters

    Parameters

    • dataType: R

      The new dtype.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series's device memory.

    Returns Series<R>

    Series of same size as the current Series containing result of the cast operation.

concat

  • Concat a Series to the end of the caller, returning a new Series of a common dtype.

    example
    import {Series} from '@rapidsai/cudf';

    Series.new([1, 2, 3]).concat(Series.new([4, 5, 6])) // [1, 2, 3, 4, 5, 6]

    Type parameters

    Parameters

    • other: R

      The Series to concat to the end of the caller.

    • Optional memoryResource: MemoryResource

    Returns Series<CommonType<Utf8String, R["type"]>>

containsRe

  • containsRe(pattern: string | RegExp, memoryResource?: MemoryResource): Bool8Series
  • Returns a boolean series identifying rows which match the given regex pattern.

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(['Finland','Colombia','Florida', 'Russia','france']);

    // items starting with F (only upper case)
    a.containsRe(/^F/) // [true, false, true, false, false]
    // items starting with F or f
    a.containsRe(/^[Ff]/) // [true, false, true, false, true]
    // items ending with a
    a.containsRe("a$") // [false, true, true, true, false]
    // items containing "us"
    a.containsRe("us") // [false, false, false, true, false]

    Parameters

    • pattern: string | RegExp

      Regex pattern to match to each string.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series's device memory.

      The regex pattern strings accepted are described here:

      https://docs.rapids.ai/api/libcudf/stable/md_regex.html

      A RegExp may also be passed, however all flags are ignored (only pattern.source is used)

    Returns Bool8Series

copy

  • Return a copy of this Series.

    example
    import {Series} from '@rapidsai/cudf';

    const a = Series.new(["foo", "bar", "test"]);

    a.copy() // StringSeries ["foo", "bar", "test"]

    Parameters

    • Optional memoryResource: MemoryResource

    Returns StringSeries

countNonNulls

  • countNonNulls(): number
  • Return the number of non-null elements in the Series.

    example
    import {Series} from '@rapidsai/cudf';

    Series.new([1, 2, 3]).countNonNulls(); // 3
    Series.new([1, null, 3]).countNonNulls(); // 2

    Returns number

    The number of non-null elements

countRe

  • countRe(pattern: string | RegExp, memoryResource?: MemoryResource): Int32Series
  • Returns an Int32 series containing the number of times the given regex pattern matches in each string.

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(['Finland','Colombia','Florida', 'Russia','france']);

    // count occurrences of "o"
    a.countRe(/o/) // [0, 2, 1, 0, 0]
    // count occurrences of "an"
    a.countRe('an') // [1, 0, 0, 0, 1]

    // get number of countries starting with F or f
    a.countRe(/^[fF]/).count() // 3

    Parameters

    • pattern: string | RegExp

      Regex pattern to match to each string.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series's device memory.

      The regex pattern strings accepted are described here:

      https://docs.rapids.ai/api/libcudf/stable/md_regex.html

      A RegExp may also be passed, however all flags are ignored (only pattern.source is used)

    Returns Int32Series

dispose

  • dispose(): void
  • summary

    Explicitly free the device memory associated with this Series.

    Returns void

dropDuplicates

  • dropDuplicates(keep?: boolean, nullsEqual?: boolean, nullsFirst?: boolean, memoryResource?: MemoryResource): StringSeries
  • Returns a new Series with duplicate values from the original removed

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([4, null, 1, 2, null, 3, 4]).dropDuplicates(
    true,
    true,
    true
    ) // [null, 1, 2, 3, 4]

    Series.new([4, null, 1, 2, null, 3, 4]).dropDuplicates(
    false,
    true,
    true
    ) // [1, 2, 3]

    Parameters

    • keep: boolean = true

      Determines whether or not to keep the duplicate items.

    • nullsEqual: boolean = true

      Determines whether nulls are handled as equal values.

    • nullsFirst: boolean = true

      Determines whether null values are inserted before or after non-null values.

    • Optional memoryResource: MemoryResource

      Memory resource used to allocate the result Column's device memory.

    Returns StringSeries

    series without duplicate values

dropNulls

  • Drops null values from the Series.

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([1, undefined, 3]).dropNulls() // [1, 3]
    Series.new([1, null, 3]).dropNulls() // [1, 3]
    Series.new([1, , 3]).dropNulls() // [1, 3]

    // StringSeries
    Series.new(["foo", "bar", undefined]).dropNulls() // ["foo", "bar"]
    Series.new(["foo", "bar", null]).dropNulls() // ["foo", "bar"]
    Series.new(["foo", "bar", ,]).dropNulls() // ["foo", "bar"]

    // Bool8Series
    Series.new([true, true, undefined]).dropNulls() // [true, true]
    Series.new([true, true, null]).dropNulls() // [true, true]
    Series.new([true, true, ,]).dropNulls() // [true, true]

    Parameters

    • Optional memoryResource: MemoryResource

      Memory resource used to allocate the result Column's device memory.

    Returns StringSeries

    series without Null values

encodeLabels

  • encodeLabels<R>(categories?: StringSeries, type?: R, nullSentinel?: R["scalarType"], memoryResource?: MemoryResource): Series<R>
  • Encode the Series values into integer labels.

    Type parameters

    Parameters

    • categories: StringSeries = ...

      The optional Series of values to encode into integers. Defaults to the unique elements in this Series.

    • type: R = ...

      The optional integer DataType to use for the returned Series. Defaults to Uint32.

    • nullSentinel: R["scalarType"] = -1

      The optional value used to indicate missing category. Defaults to -1.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns Series<R>

    A sequence of encoded integer labels with values between 0 and n-1 categories, and nullSentinel for any null values
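
    The encoding described above can be pictured with a host-side sketch on plain arrays. This is not the library's implementation: encodeLabelsSketch is a hypothetical helper, and deriving the default categories as the sorted unique non-null values is an assumption made for illustration.

```typescript
// Host-side sketch of label encoding; not the GPU implementation.
function encodeLabelsSketch(
  values: (string | null)[],
  categories?: string[],
  nullSentinel = -1,
): number[] {
  // Assumption: default categories are the sorted unique non-null values.
  const cats = categories ??
    Array.from(new Set(values.filter((v): v is string => v !== null))).sort();
  const codes = new Map<string, number>();
  cats.forEach((c, i) => codes.set(c, i));
  // Nulls, and values missing from the categories, map to the sentinel.
  return values.map((v) => (v === null ? nullSentinel : (codes.get(v) ?? nullSentinel)));
}

const labels = encodeLabelsSketch(['b', 'a', null, 'b', 'c']); // [1, 0, -1, 1, 2]
const custom = encodeLabelsSketch(['b', 'a', null], ['c', 'b']); // [1, -1, -1]
```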

fill

  • fill(value: string, begin?: number, end?: number, memoryResource?: MemoryResource): StringSeries
  • Fills a range of elements in a column out-of-place with a scalar value.

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([1, 2, 3]).fill(0) // [0, 0, 0]
    // StringSeries
    Series.new(["foo", "bar", "test"]).fill("rplc", 0, 1) // ["rplc", "bar", "test"]
    // Bool8Series
    Series.new([true, true, true]).fill(false, 1) // [true, false, false]

    Parameters

    • value: string

      The scalar value to fill.

    • begin: number = 0

      The starting index of the fill range (inclusive).

    • end: number = ...

      The index of the last element in the fill range (exclusive). Defaults to this.length.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns StringSeries

fillInPlace

  • fillInPlace(value: string, begin?: number, end?: number): StringSeries
  • Fills a range of elements in-place in a column with a scalar value.

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([1, 2, 3]).fillInPlace(0) // [0, 0, 0]
    // StringSeries
    Series.new(["foo", "bar", "test"]).fillInPlace("rplc", 0, 1) // ["rplc", "bar", "test"]
    // Bool8Series
    Series.new([true, true, true]).fillInPlace(false, 1) // [true, false, false]

    Parameters

    • value: string

      The scalar value to fill

    • begin: number = 0

      The starting index of the fill range (inclusive)

    • end: number = ...

      The index of the last element in the fill range (exclusive)

    Returns StringSeries

filter

  • Return a sub-selection of this Series using the specified boolean mask.

    example
    import {Series} from "@rapidsai/cudf";
    const mask = Series.new([true, false, true]);

    // Float64Series
    Series.new([1, 2, 3]).filter(mask) // [1, 3]
    // StringSeries
    Series.new(["foo", "bar", "test"]).filter(mask) // ["foo", "test"]
    // Bool8Series
    Series.new([false, true, true]).filter(mask) // [false, true]

    Parameters

    • mask: Bool8Series

      A Series of boolean values indicating which corresponding elements in this Series will be selected (true) or ignored (false).

    • Optional memoryResource: MemoryResource

      An optional MemoryResource used to allocate the result's device memory.

    Returns StringSeries

gather

  • summary

    Return sub-selection from a Series using the specified integral indices.

    description

    Gathers the rows of the source columns according to selection, such that row "i" in the resulting Series's columns will contain row selection[i] from the source columns. The number of rows in the result series will be equal to the number of elements in selection. A negative value i in the selection is interpreted as i+n, where n is the number of rows in the source series.

    For dictionary columns, the keys column component is copied and not trimmed if the gather results in abandoned key elements.

    example
    import {Series, Int32} from '@rapidsai/cudf';

    const a = Series.new([1,2,3]);
    const b = Series.new(["foo", "bar", "test"]);
    const c = Series.new([true, false, true]);
    const selection = Series.new({type: new Int32, data: [0,2]});

    a.gather(selection) // Float64Series [1,3]
    b.gather(selection) // StringSeries ["foo", "test"]
    c.gather(selection) // Bool8Series [true, true]

    Parameters

    • indices: number[] | Series<IndexType>

      A Series of 8/16/32-bit signed or unsigned integer indices to gather.

    • nullify_out_of_bounds: boolean = false

      If true, coerce rows that correspond to out-of-bounds indices in the selection to null. If false, skips all bounds checking for selection values. Pass false if you are certain that the selection contains only valid indices for better performance. If false and there are out-of-bounds indices in the selection, the behavior is undefined. Defaults to false.

    • Optional memoryResource: MemoryResource

      An optional MemoryResource used to allocate the result's device memory.

    Returns StringSeries
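
    The index rules in the description (negative values wrap as i+n, and out-of-bounds rows become null when requested) can be sketched on a plain array. gatherSketch is a hypothetical host-side helper; where the real call leaves out-of-bounds access undefined, the sketch throws instead.

```typescript
// Host-side sketch of gather semantics on a plain array; not the GPU implementation.
function gatherSketch<T>(
  source: (T | null)[],
  selection: number[],
  nullifyOutOfBounds = false,
): (T | null)[] {
  const n = source.length;
  return selection.map((i) => {
    // A negative index i is interpreted as i + n.
    const row = i < 0 ? i + n : i;
    const inBounds = row >= 0 && row < n;
    if (!inBounds) {
      if (nullifyOutOfBounds) { return null; }
      // The documented behavior here is undefined; the sketch fails loudly.
      throw new RangeError(`index ${i} is out of bounds`);
    }
    return source[row];
  });
}

const gathered = gatherSketch(['foo', 'bar', 'test'], [0, -1]);       // ['foo', 'test']
const nullified = gatherSketch(['foo', 'bar', 'test'], [1, 7], true); // ['bar', null]
```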

getJSONObject

  • getJSONObject(jsonPath: string, options?: GetJSONObjectOptions, memoryResource?: MemoryResource): StringSeries
  • Applies a JSONPath string to each row of the Series, where each row is a valid JSON string. Returns a new StringSeries containing the retrieved JSON object strings.

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new([
    {foo: {bar: "baz"}},
    {foo: {baz: "bar"}},
    ].map(JSON.stringify)); // StringSeries ['{"foo":{"bar":"baz"}}', '{"foo":{"baz":"bar"}}']

    a.getJSONObject("$.foo") // StringSeries ['{"bar":"baz"}', '{"baz":"bar"}']
    a.getJSONObject("$.foo.bar") // StringSeries ["baz", null]

    // parse the resulting strings using JSON.parse
    [...a.getJSONObject("$.foo")].map((s) => JSON.parse(s)) // object [{ bar: 'baz' }, { baz: 'bar' }]

    Parameters

    • jsonPath: string

      The JSONPath string to be applied to each row of the input column

    • options: GetJSONObjectOptions = ...
    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series's device memory.

    Returns StringSeries

getValue

  • getValue(index: number): null | string
  • Returns the value at the specified index, copied to host memory.

    example
    import {Series} from "@rapidsai/cudf";

    // StringSeries
    Series.new(["foo", "bar", "test"]).getValue(0) // "foo"
    Series.new(["foo", "bar", "test"]).getValue(2) // "test"
    Series.new(["foo", "bar", "test"]).getValue(3) // throws index out of bounds error

    Parameters

    • index: number

      the index in this Series to return a value for

    Returns null | string

head

  • Returns the first n rows.

    example
    import {Series} from '@rapidsai/cudf';

    const a = Series.new([4, 6, 8, 10, 12, 1, 2]);
    const b = Series.new(["foo", "bar", "test"]);

    a.head(); // [4, 6, 8, 10, 12]
    b.head(1); // ["foo"]
    a.head(-1); // throws index out of bounds error

    Parameters

    • n: number = 5

      The number of rows to return.

    Returns StringSeries

hexToIntegers

  • hexToIntegers<R>(dataType: R, memoryResource?: MemoryResource): Series<R>
  • Returns a new integer numeric series parsing hexadecimal values.

    Any null entries will result in corresponding null entries in the output series.

    Only characters [0-9] and [A-F] are recognized. When any other character is encountered, the parsing ends for that string. No interpretation is made on the sign of the integer.

    Overflow of the resulting integer type is not checked. Each string is converted using an int64 type and then cast to the target integer type before storing it into the output series. If the resulting integer type is too small to hold the value, the stored value will be undefined.

    example
    import {Series, Int32} from '@rapidsai/cudf';
    const a = Series.new(['04D2', 'FFFFFFFF', '00', '1B', '146D7719', null]);

    a.hexToIntegers(new Int32) // [1234, -1, 0, 27, 342718233, null]

    Type parameters

    Parameters

    • dataType: R

      Type of integer numeric series to return.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series' device memory.

    Returns Series<R>
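
    The parsing rules above (stop at the first unrecognized character, no sign interpretation, unchecked overflow) can be sketched on the host for a signed 32-bit target. hexToInt32Sketch is a hypothetical helper, not part of @rapidsai/cudf, and keeping only the low 32 bits is an assumption about how the final cast behaves.

```typescript
// Host-side sketch of the parsing rules, targeting a signed 32-bit result.
function hexToInt32Sketch(value: string | null): number | null {
  if (value === null) { return null; } // nulls stay null
  const digits = '0123456789ABCDEF';
  let acc = 0;
  for (let i = 0; i < value.length; i++) {
    const digit = digits.indexOf(value.charAt(i));
    if (digit < 0) { break; } // parsing ends at the first unrecognized character
    // Keep only the low 32 bits so the accumulator stays exact.
    acc = (acc * 16 + digit) >>> 0;
  }
  return acc | 0; // reinterpret the low 32 bits as a signed integer
}

const parsedHex = ['04D2', 'FFFFFFFF', '00', '1B', '146D7719', null].map(hexToInt32Sketch);
// [1234, -1, 0, 27, 342718233, null]
const truncated = hexToInt32Sketch('1Bxyz'); // 27
```

    'FFFFFFFF' becomes -1 because all 32 bits are set and the result is read as a signed value.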

ipv4ToIntegers

  • ipv4ToIntegers(memoryResource?: MemoryResource): Int64Series
  • Converts IPv4 addresses into integers.

    The IPv4 format is 1-3 character digits [0-9] between 3 dots (e.g. 123.45.67.89). Each section can have a value between [0-255].

    The four sets of digits are converted to integers and placed in 8-bit fields inside the resulting integer.

    i0.i1.i2.i3 -> (i0 << 24) | (i1 << 16) | (i2 << 8) | (i3)

    No checking is done on the format. If a string is not in IPv4 format, the resulting integer is undefined.

    The resulting 32-bit integer is placed in an int64_t to avoid setting the sign-bit in an int32_t type. This could be changed if cudf supported a UINT32 type in the future.

    Any null entries will result in corresponding null entries in the output column. Returns a new Int64 numeric series of integers parsed from the IPv4 addresses in the provided string series.

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(['123.255.0.7', '127.0.0.1', null]);

    a.ipv4ToIntegers() // [2080309255n, 2130706433n, null]

    Parameters

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series' device memory.

    Returns Int64Series
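
    The packing formula above can be checked with a small host-side sketch. ipv4ToIntegerSketch is a hypothetical helper that returns a plain number rather than a bigint, and like the documented method it performs no format validation.

```typescript
// Host-side sketch of the packing formula; not the GPU implementation.
function ipv4ToIntegerSketch(address: string | null): number | null {
  if (address === null) { return null; } // nulls stay null
  const [i0, i1, i2, i3] = address.split('.').map(Number);
  // `<<` and `|` produce a signed 32-bit value in JavaScript, so `>>> 0`
  // reinterprets the result as unsigned. This is the same sign-bit concern
  // that the description gives for storing the result in a 64-bit integer.
  return ((i0 << 24) | (i1 << 16) | (i2 << 8) | i3) >>> 0;
}

const packed = ['123.255.0.7', '127.0.0.1', null].map(ipv4ToIntegerSketch);
// [2080309255, 2130706433, null]
const highBit = ipv4ToIntegerSketch('255.0.0.1'); // 4278190081
```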

isHex

  • Returns a boolean column identifying strings in which all characters are valid for conversion to integers from hex.

    The output row entry will be set to true if the corresponding string element has at least one character and every character is in [0-9A-Fa-f]. The string may also start with '0x'.

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(['123', '-456', '', 'AGE', '0x9EF']);

    a.isHex() // [true, false, false, false, true]

    Parameters

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series's device memory.

    Returns Bool8Series
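
    One way to read the rule above is as a regular expression. The host-side sketch below is an interpretation inferred from the example (where 'AGE' and '' are rejected), not the library's implementation; isHexSketch is a hypothetical helper.

```typescript
// Host-side sketch: an optional '0x' prefix followed by one or more hex digits.
function isHexSketch(value: string | null): boolean | null {
  if (value === null) { return null; }
  return /^(?:0x)?[0-9A-Fa-f]+$/.test(value);
}

const hexFlags = ['123', '-456', '', 'AGE', '0x9EF'].map(isHexSketch);
// [true, false, false, false, true]
```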

isIpv4

  • Returns a boolean column identifying strings in which all characters are valid for conversion to integers from IPv4 format.

    The output row entry will be set to true if the corresponding string element has the following format xxx.xxx.xxx.xxx where xxx is integer digits between 0-255.

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(['123.255.0.7', '127.0.0.1', '', '1.2.34', '123.456.789.10', null]);

    a.isIpv4() // [true, true, false, false, false, null]

    Parameters

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series's device memory.

    Returns Bool8Series
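
    The format check can be sketched on the host as four dot-separated groups of 1-3 digits, each no greater than 255. isIpv4Sketch is a hypothetical helper written to reproduce the example above, not the library's implementation.

```typescript
// Host-side sketch of the xxx.xxx.xxx.xxx check described above.
function isIpv4Sketch(value: string | null): boolean | null {
  if (value === null) { return null; } // nulls stay null
  const parts = value.split('.');
  if (parts.length !== 4) { return false; }
  // Each section must be 1-3 digits with a value in [0, 255].
  return parts.every((p) => /^[0-9]{1,3}$/.test(p) && Number(p) <= 255);
}

const ipFlags = ['123.255.0.7', '127.0.0.1', '', '1.2.34', '123.456.789.10', null]
  .map(isIpv4Sketch);
// [true, true, false, false, false, null]
```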

isNotNull

  • isNotNull(memoryResource?: MemoryResource): Bool8Series
  • Creates a Series of BOOL8 elements where true indicates the value is valid and false indicates the value is null.

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([1, null, 3]).isNotNull() // [true, false, true]
    // StringSeries
    Series.new(["foo", "bar", null]).isNotNull() // [true, true, false]
    // Bool8Series
    Series.new([true, true, null]).isNotNull() // [true, true, false]

    Parameters

    • Optional memoryResource: MemoryResource

      Memory resource used to allocate the result Column's device memory.

    Returns Bool8Series

    A non-nullable Series of BOOL8 elements with false representing null values.

isNull

  • Creates a Series of BOOL8 elements where true indicates the value is null and false indicates the value is valid.

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([1, null, 3]).isNull() // [false, true, false]
    // StringSeries
    Series.new(["foo", "bar", null]).isNull() // [false, false, true]
    // Bool8Series
    Series.new([true, true, null]).isNull() // [false, false, true]

    Parameters

    • Optional memoryResource: MemoryResource

      Memory resource used to allocate the result Column's device memory.

    Returns Bool8Series

    A non-nullable Series of BOOL8 elements with true representing null values.

len

  • Returns an Int32 series containing the length of each string in the Series.

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(['dog', '', '\n', null]);

    a.len() // [3, 0, 1, null]

    Parameters

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series's device memory.

    Returns Int32Series

matchesRe

  • matchesRe(pattern: string | RegExp, memoryResource?: MemoryResource): Bool8Series
  • Returns a boolean series identifying rows which match the given regex pattern only at the beginning of the string

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(['Finland','Colombia','Florida', 'Russia','france']);

    // start of item contains "C"
    a.matchesRe(/C/) // [false, true, false, false, false]
    // start of item contains "us", returns false since none of the items start with "us"
    a.matchesRe('us') // [false, false, false, false, false]

    Parameters

    • pattern: string | RegExp

      Regex pattern to match to each string.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Series's device memory.

      The regex pattern strings accepted are described here:

      https://docs.rapids.ai/api/libcudf/stable/md_regex.html

      A RegExp may also be passed, however all flags are ignored (only pattern.source is used)

    Returns Bool8Series
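
    The difference between matching anywhere (containsRe) and matching only at the beginning (matchesRe) can be sketched with JavaScript's own RegExp on the host. The GPU regex engine may differ in details; containsSketch and matchesSketch are hypothetical helpers.

```typescript
// Host-side sketch using JavaScript's RegExp; not the GPU regex engine.
function containsSketch(values: string[], pattern: RegExp): boolean[] {
  // Only pattern.source is used, mirroring the note that flags are ignored.
  const re = new RegExp(pattern.source);
  return values.map((v) => re.test(v));
}

function matchesSketch(values: string[], pattern: RegExp): boolean[] {
  // Anchor the pattern so it can only match at the start of each string.
  const re = new RegExp(`^(?:${pattern.source})`);
  return values.map((v) => re.test(v));
}

const countries = ['Finland', 'Colombia', 'Florida', 'Russia', 'france'];
const containsUs = containsSketch(countries, /us/); // [false, false, false, true, false]
const matchesUs = matchesSketch(countries, /us/);   // [false, false, false, false, false]
const matchesC = matchesSketch(countries, /C/);     // [false, true, false, false, false]
```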

nLargest

  • nLargest(n?: number, keep?: "any" | "first" | "last" | "none"): StringSeries
  • Returns the n largest element(s).

    example
    import {Series} from '@rapidsai/cudf';

    const a = Series.new([4, 6, 8, 10, 12, 1, 2]);
    const b = Series.new(["foo", "bar", "test"]);

    a.nLargest(); // [12, 10, 8, 6, 4]
    b.nLargest(1); // ["test"]
    a.nLargest(-1); // []

    Parameters

    • n: number = 5

      The number of values to retrieve.

    • keep: "any" | "first" | "last" | "none" = 'first'

      Determines whether to keep the first or last of any duplicate values.

    Returns StringSeries

nSmallest

  • nSmallest(n?: number, keep?: "any" | "first" | "last" | "none"): StringSeries
  • Returns the n smallest element(s).

    example
    import {Series} from '@rapidsai/cudf';

    const a = Series.new([4, 6, 8, 10, 12, 1, 2]);
    const b = Series.new(["foo", "bar", "test"]);

    a.nSmallest(); // [1, 2, 4, 6, 8]
    b.nSmallest(1); // ["bar"]
    a.nSmallest(-1); // []

    Parameters

    • n: number = 5

      The number of values to retrieve.

    • keep: "any" | "first" | "last" | "none" = 'first'

      Determines whether to keep the first or last of any duplicate values.

    Returns StringSeries

orderBy

  • orderBy(ascending?: boolean, null_order?: "after" | "before", memoryResource?: MemoryResource): Int32Series
  • Generate an ordering that sorts the Series in a specified way.

    example
    import {Series, NullOrder} from '@rapidsai/cudf';

    // Float64Series
    Series.new([50, 40, 30, 20, 10, 0]).orderBy(false) // [0, 1, 2, 3, 4, 5]
    Series.new([50, 40, 30, 20, 10, 0]).orderBy(true) // [5, 4, 3, 2, 1, 0]

    // StringSeries
    Series.new(["a", "b", "c", "d", "e"]).orderBy(false) // [4, 3, 2, 1, 0]
    Series.new(["a", "b", "c", "d", "e"]).orderBy(true) // [0, 1, 2, 3, 4]

    // Bool8Series
    Series.new([true, false, true, true, false]).orderBy(true) // [1, 4, 0, 2, 3]
    Series.new([true, false, true, true, false]).orderBy(false) // [0, 2, 3, 1, 4]

    // NullOrder usage
    Series.new([50, 40, 30, 20, 10, null]).orderBy(false, 'before') // [0, 1, 2, 3, 4, 5]
    Series.new([50, 40, 30, 20, 10, null]).orderBy(false, 'after') // [5, 0, 1, 2, 3, 4]

    Parameters

    • ascending: boolean = true

      whether to sort ascending (true) or descending (false)

    • null_order: "after" | "before" = 'after'

      whether nulls should sort before or after other values

    • Optional memoryResource: MemoryResource

      An optional MemoryResource used to allocate the result's device memory.

    Returns Int32Series

    Series containing the permutation indices for the desired sort order
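
    Because the result is a permutation of indices rather than the sorted values, it is typically applied afterwards with a gather. The host-side sketch below shows that two-step idea on a plain array; orderBySketch is a hypothetical helper and null ordering is left out.

```typescript
// Host-side sketch: compute a sort permutation, then apply it like a gather.
function orderBySketch(values: number[], ascending = true): number[] {
  const indices = values.map((_, i) => i);
  // Sort the indices by the values they point at.
  indices.sort((a, b) => (ascending ? values[a] - values[b] : values[b] - values[a]));
  return indices;
}

const data = [50, 40, 30, 20, 10, 0];
const order = orderBySketch(data, true);      // [5, 4, 3, 2, 1, 0]
const sorted = order.map((i) => data[i]);     // [0, 10, 20, 30, 40, 50]
const descending = orderBySketch(data, false); // [0, 1, 2, 3, 4, 5]
```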

pad

  • Add padding to each string using a provided character.

    If the string is already width or more characters, no padding is performed. No strings are truncated.

    Null string entries result in null entries in the output column.

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(['aa','bbb','cccc','ddddd', null]);

    a.pad(4) // ['aa  ','bbb ','cccc','ddddd', null]

    Parameters

    • width: number

      The minimum number of characters for each string.

    • side: PadSideType = 'right'

      Where to place the padding characters. Default is pad right (left justify).

    • fill_char: string = ' '

      Single UTF-8 character to use for padding. Default is the space character.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns StringSeries
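
    The padding rules (pad up to a minimum width, never truncate, keep nulls) can be sketched on the host. padSketch is a hypothetical helper; the 'left' side value and counting width in code points are assumptions, since only the 'right' default is shown above.

```typescript
// Host-side sketch of the padding rules; not the GPU implementation.
function padSketch(
  values: (string | null)[],
  width: number,
  side: 'left' | 'right' = 'right',
  fillChar = ' ',
): (string | null)[] {
  return values.map((v) => {
    if (v === null) { return null; } // nulls stay null
    // Strings already at or beyond the width get no padding.
    const missing = Math.max(0, width - Array.from(v).length);
    const padding = fillChar.repeat(missing);
    return side === 'right' ? v + padding : padding + v;
  });
}

const padded = padSketch(['aa', 'bbb', 'cccc', 'ddddd', null], 4);
// ['aa  ', 'bbb ', 'cccc', 'ddddd', null]
const zeroPadded = padSketch(['7', '42'], 3, 'left', '0'); // ['007', '042']
```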

partition

  • Returns a set of 3 columns by splitting each string using the specified delimiter.

    The number of rows in the output columns will be the same as the input column. The first column will contain the first tokens of each string as a result of the split. The second column will contain the delimiter. The third column will contain the remaining characters of each string after the delimiter.

    Any null string entries return corresponding null output columns.

    note

    If delimiter is omitted, the default is ''.

    example
    import {DataFrame, Series} from '@rapidsai/cudf';

    const strs = Series.new(["a_b", "c_d"]);
    const [before, delim, after] = strs.partition('_');

    new DataFrame({ before, delim, after }).toString();
    // before delim after
    // a _ b
    // c _ d

    Parameters

    • delimiter: string = ''

      UTF-8 encoded string indicating where to split each string. Default of empty string indicates split on whitespace.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns [StringSeries, StringSeries, StringSeries]

    3 new string columns representing before the delimiter, the delimiter, and after the delimiter.
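
    For a single string, the three-way split can be sketched on the host as follows. partitionSketch is a hypothetical helper that requires a non-empty delimiter (the whitespace default is not modeled), and placing the whole string in the first part when the delimiter is absent is an assumption.

```typescript
// Host-side sketch of a three-way split on the first delimiter occurrence.
type Parts = [string | null, string | null, string | null];

function partitionSketch(value: string | null, delimiter: string): Parts {
  if (value === null) { return [null, null, null]; } // nulls stay null in every part
  const at = value.indexOf(delimiter);
  // Assumption: with no delimiter present, the string lands in the first part.
  if (at < 0) { return [value, '', '']; }
  return [value.slice(0, at), delimiter, value.slice(at + delimiter.length)];
}

const simple = partitionSketch('a_b', '_');     // ['a', '_', 'b']
const repeated = partitionSketch('a_b_c', '_'); // ['a', '_', 'b_c']
const missing = partitionSketch(null, '_');     // [null, null, null]
```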

replaceNulls

  • Replace null values with a scalar value.

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([1, null, 3]).replaceNulls(-1) // [1, -1, 3]
    // StringSeries
    Series.new(["foo", "bar", null]).replaceNulls("rplc") // ["foo", "bar", "rplc"]
    // Bool8Series
    Series.new([null, true, true]).replaceNulls(false) // [false, true, true]

    Parameters

    • value: any

      The scalar value to use in place of nulls.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns StringSeries

  • Replace null values with the corresponding elements from another Series.

    example
    import {Series} from '@rapidsai/cudf';
    const replace = Series.new([10, 10, 10]);
    const replaceBool = Series.new([false, false, false]);
    const replaceStr = Series.new(["a", "b", "c"]);

    // Float64Series
    Series.new([1, null, 3]).replaceNulls(replace) // [1, 10, 3]
    // StringSeries
    Series.new(["foo", "bar", null]).replaceNulls(replaceStr) // ["foo", "bar", "c"]
    // Bool8Series
    Series.new([null, true, true]).replaceNulls(replaceBool) // [false, true, true]

    Parameters

    • value: StringSeries

      The Series to use in place of nulls.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns StringSeries

replaceNullsFollowing

  • replaceNullsFollowing(memoryResource?: MemoryResource): StringSeries
  • Replace null values with the non-null value following the null value in the same series.

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([1, null, 3]).replaceNullsFollowing() // [1, 3, 3]
    // StringSeries
    Series.new(["foo", "bar", null]).replaceNullsFollowing() // ["foo", "bar", null]
    Series.new(["foo", null, "bar"]).replaceNullsFollowing() // ["foo", "bar", "bar"]
    // Bool8Series
    Series.new([null, true, true]).replaceNullsFollowing() // [true, true, true]

    Parameters

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns StringSeries

replaceNullsPreceding

  • replaceNullsPreceding(memoryResource?: MemoryResource): StringSeries
  • Replace null values with the non-null value preceding the null value in the same series.

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([1, null, 3]).replaceNullsPreceding() // [1, 1, 3]
    // StringSeries
    Series.new([null, "foo", "bar"]).replaceNullsPreceding() // [null, "foo", "bar"]
    Series.new(["foo", null, "bar"]).replaceNullsPreceding() // ["foo", "foo", "bar"]
    // Bool8Series
    Series.new([true, null, false]).replaceNullsPreceding() // [true, true, false]

    Parameters

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns StringSeries
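
    Filling from the preceding or the following non-null value amounts to a forward or backward fill. The host-side sketch below covers both directions on a plain array; fillNullsSketch is a hypothetical helper, and nulls with no neighbor in the chosen direction stay null, as in the examples.

```typescript
// Host-side sketch of forward/backward null filling; not the GPU implementation.
function fillNullsSketch<T>(
  values: (T | null)[],
  direction: 'preceding' | 'following',
): (T | null)[] {
  const out = values.slice();
  const n = out.length;
  let last: T | null = null; // most recent non-null value seen in scan order
  for (let k = 0; k < n; k++) {
    // Scan left-to-right for 'preceding', right-to-left for 'following'.
    const i = direction === 'preceding' ? k : n - 1 - k;
    if (out[i] === null) { out[i] = last; } else { last = out[i]; }
  }
  return out;
}

const forward = fillNullsSketch(['foo', null, 'bar'], 'preceding');  // ['foo', 'foo', 'bar']
const backward = fillNullsSketch(['foo', null, 'bar'], 'following'); // ['foo', 'bar', 'bar']
const leading = fillNullsSketch([null, 'foo', 'bar'], 'preceding');  // [null, 'foo', 'bar']
```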

replaceRe

  • replaceRe(pattern: RegExp, replacement?: string, maxReplaceCount?: number, memoryResource?: MemoryResource): StringSeries
  • For each string in the column, replaces any character sequence matching the given pattern with the provided replacement string.

    Null string entries will return null output string entries.

    Parameters

    • pattern: RegExp

      The regular expression pattern to search within each string.

    • replacement: string = ''

      The string used to replace the matched sequence in each string. Default is an empty string.

    • maxReplaceCount: number = -1

      The maximum number of times to replace the matched pattern within each string. Default replaces every substring that is matched.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns StringSeries

    New strings column with matching elements replaced.
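
    The effect of maxReplaceCount can be sketched on the host with JavaScript's own regex engine, which may differ in details from the GPU one. replaceReSketch is a hypothetical helper, and treating the replacement as a literal string is an assumption.

```typescript
// Host-side sketch of a capped regex replacement; not the GPU implementation.
function replaceReSketch(
  value: string | null,
  pattern: RegExp,
  replacement = '',
  maxReplaceCount = -1,
): string | null {
  if (value === null) { return null; } // nulls stay null
  const re = new RegExp(pattern.source, 'g');
  let remaining = maxReplaceCount; // -1 never reaches 0, so every match is replaced
  return value.replace(re, (match) => {
    if (remaining === 0) { return match; } // cap reached: keep the original text
    if (remaining > 0) { remaining -= 1; }
    return replacement;
  });
}

const all = replaceReSketch('a1b22c333', /\d+/, '#');       // 'a#b#c#'
const firstOnly = replaceReSketch('a1b22c333', /\d+/, '#', 1); // 'a#b22c333'
const removed = replaceReSketch('a1b22c333', /\d+/);        // 'abc'
```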

replaceSlice

  • replaceSlice(repl: string, start: number, stop: number, memoryResource?: MemoryResource): StringSeries
  • Replaces each string in the column with the provided repl string within the [start,stop) character position range.

    Null string entries will return null output string entries.

    Position values are 0-based meaning position 0 is the first character of each string.

    This function can be used to insert a string into specific position by specifying the same position value for start and stop. The repl string can be appended to each string by specifying -1 for both start and stop.

    Parameters

    • repl: string

      Replacement string for specified positions found.

    • start: number

      Start position where repl will be added. Default is 0, first character position.

    • stop: number

      End position (exclusive) to use for replacement. Default of -1 specifies the end of each string.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns StringSeries
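
    The position rules above (replace within [start, stop), insert when start equals stop, append with -1 for both) can be sketched on the host. replaceSliceSketch is a hypothetical helper; reading -1 as "end of string" for either position and counting positions in code points are assumptions.

```typescript
// Host-side sketch of the [start, stop) replacement rules; not the GPU implementation.
function replaceSliceSketch(
  value: string | null,
  repl: string,
  start: number,
  stop: number,
): string | null {
  if (value === null) { return null; } // nulls stay null
  const chars = Array.from(value);
  // Assumption: a negative position stands for the end of the string.
  const begin = start < 0 ? chars.length : Math.min(start, chars.length);
  const end = stop < 0 ? chars.length : Math.min(Math.max(stop, begin), chars.length);
  return chars.slice(0, begin).join('') + repl + chars.slice(end).join('');
}

const replaced = replaceSliceSketch('abcdef', 'XY', 1, 3);   // 'aXYdef'
const inserted = replaceSliceSketch('abcdef', 'XY', 2, 2);   // 'abXYcdef'
const appended = replaceSliceSketch('abcdef', 'XY', -1, -1); // 'abcdefXY'
```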

reverse

  • Returns a new series with reversed elements.

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([1, 2, 3]).reverse() // [3, 2, 1]
    // StringSeries
    Series.new(["foo", "bar"]).reverse() // ["bar", "foo"]
    // Bool8Series
    Series.new([false, true]).reverse() // [true, false]

    Parameters

    • Optional memoryResource: MemoryResource

      An optional MemoryResource used to allocate the result's device memory.

    Returns StringSeries

scatter

  • Scatters a single value into this Series according to provided indices.

    example
    import {Series, Int32} from '@rapidsai/cudf';
    const a = Series.new({type: new Int32, data: [0, 1, 2, 3, 4]});
    const indices = Series.new({type: new Int32, data: [2, 4]});
    const indices_out_of_bounds = Series.new({type: new Int32, data: [5, 6]});

    a.scatter(-1, indices); // returns [0, 1, -1, 3, -1];
    a.scatter(-1, indices_out_of_bounds, true) // throws index out of bounds error

    Parameters

    • value: string

      The value to be scattered into this Series.

    • indices: number[] | Series<IndexType>

      A column of integral indices that indicate the rows in this Series to be replaced by value.

    • Optional memoryResource: MemoryResource

      An optional MemoryResource used to allocate the result's device memory.

    Returns StringSeries

  • Scatters a column of values into this Series according to provided indices.

    example
    import {Series, Int32} from '@rapidsai/cudf';
    const a = Series.new({type: new Int32, data: [0, 1, 2, 3, 4]});
    const b = Series.new({type: new Int32, data: [200, 400]});
    const indices = Series.new({type: new Int32, data: [2, 4]});
    const indices_out_of_bounds = Series.new({type: new Int32, data: [5, 6]});

    a.scatter(b, indices); // returns [0, 1, 200, 3, 400];
    a.scatter(b, indices_out_of_bounds, true) // throws index out of bounds error

    Parameters

    • values: StringSeries
    • indices: number[] | Series<IndexType>

      A column of integral indices that indicate the rows in this Series to be replaced by values.

    • Optional memoryResource: MemoryResource

      An optional MemoryResource used to allocate the result's device memory.

    Returns StringSeries

setNullMask

  • setNullMask(mask: MemoryData | ArrayLike<number> | ArrayLike<bigint>, nullCount?: number): void
  • Parameters

    • mask: MemoryData | ArrayLike<number> | ArrayLike<bigint>

      The null-mask. Valid values are marked as 1; otherwise 0. The mask bit given the data index idx is computed as:

      (mask[idx // 8] >> (idx % 8)) & 1
      
    • Optional nullCount: number

      The number of null values. If omitted, it is calculated automatically.

    Returns void
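
The bit formula above uses `//` for integer division, which is a comment in JavaScript. A minimal host-side sketch of the same lookup follows. `isValid` and `countNulls` are hypothetical helpers, and the mask is assumed to be a byte array (`Uint8Array`) with one bit per row.

```typescript
// Returns true when the row at `idx` is valid (its mask bit is 1).
function isValid(mask: Uint8Array, idx: number): boolean {
  // idx >> 3 is the integer division "idx // 8" from the formula.
  return ((mask[idx >> 3] >> (idx % 8)) & 1) === 1;
}

// Counts the null rows among the first `length` rows of the mask.
function countNulls(mask: Uint8Array, length: number): number {
  let nulls = 0;
  for (let idx = 0; idx < length; ++idx) {
    if (!isValid(mask, idx)) nulls++;
  }
  return nulls;
}

// Bits are read from the least significant end: rows 0 and 2 valid, row 1 null.
const exampleMask = new Uint8Array([0b00000101]);
console.log(isValid(exampleMask, 0), isValid(exampleMask, 1), isValid(exampleMask, 2));
console.log(countNulls(exampleMask, 3));
```

A count computed this way is the kind of value the optional nullCount argument expects.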

setValue

  • setValue(index: number, value: string): void
  • Sets the value at the specified index.

    example
    import {Series} from "@rapidsai/cudf";

    // StringSeries
    const a = Series.new(["foo", "bar", "test"])
    a.setValue(2, "test1") // inplace update -> Series(["foo", "bar", "test1"])

    Parameters

    • index: number

      the index in this Series to set a value for

    • value: string

      the value to set at index

    Returns void

setValues

  • set values at the specified indices

    example
    import {Series, Int32} from '@rapidsai/cudf';
    const a = Series.new({type: new Int32, data: [0, 1, 2, 3, 4]});
    const values = Series.new({type: new Int32, data: [200, 400]});
    const indices = Series.new({type: new Int32, data: [2, 4]});

    a.setValues(indices, values); // inplace update [0, 1, 200, 3, 400];
    a.setValues(indices, -1); // inplace update [0, 1, -1, 3, -1];

    Parameters

    • indices: number[] | Int32Series

      the indices in this Series to set values for

    • values: string | StringSeries

      the value or values to set at the given indices

    Returns void

sortValues

  • sortValues(ascending?: boolean, null_order?: "after" | "before", memoryResource?: MemoryResource): StringSeries
  • Generate a new Series that is sorted in a specified way.

    example
    import {Series, NullOrder} from '@rapidsai/cudf';

    // Float64Series
    Series.new([50, 40, 30, 20, 10, 0]).sortValues(false) // [50, 40, 30, 20, 10, 0]
    Series.new([50, 40, 30, 20, 10, 0]).sortValues(true) // [0, 10, 20, 30, 40, 50]

    // StringSeries
    Series.new(["a", "b", "c", "d", "e"]).sortValues(false) // ["e", "d", "c", "b", "a"]
    Series.new(["a", "b", "c", "d", "e"]).sortValues(true) // ["a", "b", "c", "d", "e"]

    // Bool8Series
    Series.new([true, false, true, true, false]).sortValues(true) // [false, false, true, true, true]
    Series.new([true, false, true, true, false]).sortValues(false) // [true, true, true, false, false]

    // NullOrder usage
    Series.new([50, 40, 30, 20, 10, null]).sortValues(false, 'before') // [50, 40, 30, 20, 10, null]

    Series.new([50, 40, 30, 20, 10, null]).sortValues(false, 'after') // [null, 50, 40, 30, 20, 10]

    Parameters

    • ascending: boolean = true

      whether to sort ascending (true) or descending (false). Default: true

    • null_order: "after" | "before" = 'after'

      whether nulls should sort before or after other values. Default: 'after'

    • Optional memoryResource: MemoryResource

      An optional MemoryResource used to allocate the result's device memory.

    Returns StringSeries

    Sorted values

split

  • split(delimiter?: string, memoryResource?: MemoryResource): StringSeries
  • Splits a StringSeries along the delimiter.

    note

    If delimiter is omitted, the default is ''.

    Parameters

    • delimiter: string = ''

      Optional delimiter.

    • Optional memoryResource: MemoryResource

    Returns StringSeries

    Series with new splits determined by the delimiter.

tail

  • Returns the last n rows.

    example
    import {Series} from '@rapidsai/cudf';

    const a = Series.new([4, 6, 8, 10, 12, 1, 2]);
    const b = Series.new(["foo", "bar", "test"]);

    a.tail(); // [8, 10, 12, 1, 2]
    b.tail(1); // ["test"]
    a.tail(-1); // throws index out of bounds error

    Parameters

    • n: number = 5

      The number of rows to return.

    Returns StringSeries

toArray

  • toArray(): Uint8Array
  • Copy the underlying device memory to host and return an Array (or TypedArray) of the values.

    Returns Uint8Array

toArrow

toString

  • toString(options?: DisplayOptions & { name?: string }): string
  • Return a string with a tabular representation of the Series, pretty-printed according to the options given.

    Parameters

    • options: DisplayOptions & { name?: string } = {}

    Returns string

unique

  • unique(nullsEqual?: boolean, memoryResource?: MemoryResource): StringSeries
  • Returns a new Series with only the unique values that were found in the original Series.

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series
    Series.new([null, null, 1, 2, 3, 3]).unique(true) // [null, 1, 2, 3]
    Series.new([null, null, 1, 2, 3, 3]).unique(false) // [null, null, 1, 2, 3]

    Parameters

    • nullsEqual: boolean = true

      Determines whether nulls are handled as equal values.

    • Optional memoryResource: MemoryResource

      Memory resource used to allocate the result Column's device memory.

    Returns StringSeries

    series without duplicate values
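
The effect of nullsEqual can be modeled on the host with a plain array. This is a minimal sketch under stated assumptions: `uniqueHost` is a hypothetical helper, and keeping values in first-occurrence order is an assumption chosen to match the example above, not a documented guarantee.

```typescript
// Host-side model of unique(): nulls collapse to one entry only when nullsEqual is true.
function uniqueHost<T>(values: (T | null)[], nullsEqual = true): (T | null)[] {
  const seen = new Set<T>();
  let seenNull = false;
  const out: (T | null)[] = [];
  for (const v of values) {
    if (v === null) {
      // With nullsEqual, only the first null is kept; otherwise every null is distinct.
      if (nullsEqual && seenNull) continue;
      seenNull = true;
      out.push(null);
    } else if (!seen.has(v)) {
      seen.add(v);
      out.push(v);
    }
  }
  return out;
}

console.log(uniqueHost([null, null, 1, 2, 3, 3], true));
console.log(uniqueHost([null, null, 1, 2, 3, 3], false));
```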

valueCounts

  • Returns an object with keys "value" and "count" whose respective values are new Series containing the unique values in the original series and the number of times they occur in the original series.

    Returns { count: Int32Series; value: StringSeries }

    object with keys "value" and "count"
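
The shape of the returned object can be illustrated with a host-side model. This is a sketch only: `valueCountsHost` is a hypothetical helper, and both the output order (first occurrence) and the skipping of nulls are assumptions that this page does not confirm.

```typescript
// Host-side model of valueCounts(): parallel "value" and "count" arrays.
function valueCountsHost(values: (string | null)[]): {value: string[]; count: number[]} {
  const counts = new Map<string, number>();
  for (const v of values) {
    if (v === null) continue;  // assumption: nulls are not counted
    counts.set(v, (counts.get(v) ?? 0) + 1);
  }
  // A Map iterates in insertion order, so value[i] pairs with count[i].
  return {value: Array.from(counts.keys()), count: Array.from(counts.values())};
}

const counted = valueCountsHost(['foo', 'bar', 'foo', null, 'foo']);
console.log(counted.value, counted.count);
```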

zfill

  • zfill(width: number, memoryResource?: MemoryResource): StringSeries
  • Add '0' as padding to the left of each string.

    If the string is already width or more characters, no padding is performed. No strings are truncated.

    This is equivalent to pad(width, 'left', '0') but is more optimized for this special case.

    Null string entries result in null entries in the output column.

    example
    import {Series} from '@rapidsai/cudf';
    const a = Series.new(['1234','-9876','+0.34','-342567', null]);

    a.zfill(6) // ['001234','0-9876','0+0.34','-342567', null]

    Parameters

    • width: number

      The minimum number of characters for each string.

    • Optional memoryResource: MemoryResource

      The optional MemoryResource used to allocate the result Column's device memory.

    Returns StringSeries
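
The padding rule can be checked on the host with `String.prototype.padStart`. This is a minimal sketch: `zfillHost` is a hypothetical helper, and it counts UTF-16 code units, which matches character counts only for strings like the ASCII ones in the example above.

```typescript
// Host-side model of zfill(): left-pad with '0' up to `width`, never truncate.
function zfillHost(values: (string | null)[], width: number): (string | null)[] {
  // padStart leaves strings that are already `width` or longer unchanged.
  return values.map((v) => (v === null ? null : v.padStart(width, '0')));
}

console.log(zfillHost(['1234', '-9876', '+0.34', '-342567', null], 6));
```

As in the example above, the zeros are placed before any sign character.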

Static concatenate

  • Row-wise concatenates the given list of strings series and returns a single string series result.

    example
    import {StringSeries} from '@rapidsai/cudf';
    const s = StringSeries.new(['a', 'b', null]);
    const t = StringSeries.new(['foo', null, 'bar']);
    [...StringSeries.concatenate([s, t])]; // ["afoo", null, null]

    Parameters

    • series: StringSeries[]

      List of string series to concatenate.

    • opts: ConcatenateOptions = {}

      Options for the concatenation

    Returns StringSeries

    New series with concatenated results.
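
The null behavior shown in the example (any null in a row makes the whole row null) can be modeled on the host. This is a sketch under assumptions: `concatenateHost` is a hypothetical helper, it ignores ConcatenateOptions entirely, and it takes the row count from the first series.

```typescript
// Host-side model of row-wise concatenation with null propagation.
function concatenateHost(series: (string | null)[][]): (string | null)[] {
  if (series.length === 0) return [];
  const length = series[0].length;  // assumption: all inputs have the same length
  const out: (string | null)[] = [];
  for (let row = 0; row < length; ++row) {
    const parts = series.map((s) => s[row]);
    // A null (or missing) entry in any input makes the output row null.
    const hasNull = parts.some((p) => p === null || p === undefined);
    out.push(hasNull ? null : parts.join(''));
  }
  return out;
}

console.log(concatenateHost([['a', 'b', null], ['foo', null, 'bar']]));
```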

Static new

  • Create a new cudf.Series from an Apache Arrow vector

    example
    import {Series, Int32} from '@rapidsai/cudf';
    import * as arrow from 'apache-arrow';

    const arrow_vec = arrow.vectorFromArray(new Int32Array([1,2,3,4]));
    const a = Series.new(arrow_vec); // Int32Series [1, 2, 3, 4]

    const arrow_vec_list = arrow.vectorFromArray(
      [[0, 1, 2], [3, 4, 5], [6, 7, 8]],
      new arrow.List(arrow.Field.new({ name: 'ints', type: new arrow.Int32 })),
    );

    const b = Series.new(arrow_vec_list) // ListSeries [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

    const arrow_vec_struct = arrow.vectorFromArray(
      [{ x: 0, y: 3 }, { x: 1, y: 4 }, { x: 2, y: 5 }],
      new arrow.Struct([
        arrow.Field.new({ name: 'x', type: new arrow.Int32 }),
        arrow.Field.new({ name: 'y', type: new arrow.Int32 })
      ]),
    );

    const c = Series.new(arrow_vec_struct);
    // StructSeries [{ x: 0, y: 3 }, { x: 1, y: 4 }, { x: 2, y: 5 }]

    Type parameters

    • T: Vector<any, T>

    Parameters

    • input: T

    Returns Series<ArrowToCUDFType<T["type"]>>

  • Create a new cudf.Series from SeriesProps or a cudf.Column

    example
    import {Series, Int32} from '@rapidsai/cudf';

    //using SeriesProps
    const a = Series.new({type: new Int32, data: [1, 2, 3, 4]}); // Int32Series [1, 2, 3, 4]

    //using underlying cudf.Column
    const b = Series.new(a._col); // Int32Series [1, 2, 3, 4]

    Type parameters

    Parameters

    • input: T

    Returns T

  • Type parameters

    Parameters

    Returns Series<T>

  • Type parameters

    Parameters

    Returns Series<T>

  • Create a new cudf.Int8Series

    example
    import {
      Series,
      Int8Series,
      Int8
    } from '@rapidsai/cudf';
    import {Int8Buffer} from '@rapidsai/cuda'; // assumed source of Int8Buffer

    // Int8Series [1, 2, 3]
    const a = Series.new(new Int8Array([1, 2, 3]));
    const b = Series.new(new Int8Buffer([1, 2, 3]));

    Parameters

    Returns Int8Series

  • Create a new cudf.Int16Series

    example
    import {
      Series,
      Int16Series,
      Int16
    } from '@rapidsai/cudf';
    import {Int16Buffer} from '@rapidsai/cuda'; // assumed source of Int16Buffer

    // Int16Series [1, 2, 3]
    const a = Series.new(new Int16Array([1, 2, 3]));
    const b = Series.new(new Int16Buffer([1, 2, 3]));

    Parameters

    Returns Int16Series

  • Create a new cudf.Int32Series

    example
    import {
      Series,
      Int32Series,
      Int32
    } from '@rapidsai/cudf';
    import {Int32Buffer} from '@rapidsai/cuda'; // assumed source of Int32Buffer

    // Int32Series [1, 2, 3]
    const a = Series.new(new Int32Array([1, 2, 3]));
    const b = Series.new(new Int32Buffer([1, 2, 3]));

    Parameters

    Returns Int32Series

  • Create a new cudf.Uint8Series

    example
    import {
      Series,
      Uint8Series,
      Uint8
    } from '@rapidsai/cudf';
    import {Uint8Buffer, Uint8ClampedBuffer} from '@rapidsai/cuda'; // assumed source of the buffer classes

    // Uint8Series [1, 2, 3]
    const a = Series.new(new Uint8Array([1, 2, 3]));
    const b = Series.new(new Uint8Buffer([1, 2, 3]));
    const c = Series.new(new Uint8ClampedArray([1, 2, 3]));
    const d = Series.new(new Uint8ClampedBuffer([1, 2, 3]));

    Parameters

    Returns Uint8Series

  • Create a new cudf.Uint16Series

    example
    import {
      Series,
      Uint16Series,
      Uint16
    } from '@rapidsai/cudf';
    import {Uint16Buffer} from '@rapidsai/cuda'; // assumed source of Uint16Buffer

    // Uint16Series [1, 2, 3]
    const a = Series.new(new Uint16Array([1, 2, 3]));
    const b = Series.new(new Uint16Buffer([1, 2, 3]));

    Parameters

    Returns Uint16Series

  • Create a new cudf.Uint32Series

    example
    import {
      Series,
      Uint32Series,
      Uint32
    } from '@rapidsai/cudf';
    import {Uint32Buffer} from '@rapidsai/cuda'; // assumed source of Uint32Buffer

    // Uint32Series [1, 2, 3]
    const a = Series.new(new Uint32Array([1, 2, 3]));
    const b = Series.new(new Uint32Buffer([1, 2, 3]));

    Parameters

    Returns Uint32Series

  • Create a new cudf.Uint64Series

    example
    import {
      Series,
      Uint64Series,
      Uint64
    } from '@rapidsai/cudf';
    import {Uint64Buffer} from '@rapidsai/cuda'; // assumed source of Uint64Buffer

    // Uint64Series [1n, 2n, 3n]
    const a = Series.new(new BigUint64Array([1n, 2n, 3n]));
    const b = Series.new(new Uint64Buffer([1n, 2n, 3n]));

    Parameters

    Returns Uint64Series

  • Create a new cudf.Float32Series

    example
    import {
      Series,
      Float32Series,
      Float32
    } from '@rapidsai/cudf';
    import {Float32Buffer} from '@rapidsai/cuda'; // assumed source of Float32Buffer

    // Float32Series [1, 2, 3]
    const a = Series.new(new Float32Array([1, 2, 3]));
    const b = Series.new(new Float32Buffer([1, 2, 3]));

    Parameters

    Returns Float32Series

  • Create a new cudf.StringSeries

    example
    import {Series} from '@rapidsai/cudf';

    // StringSeries ["foo", "bar", "test", null]
    const a = Series.new(["foo", "bar", "test", null]);

    Parameters

    • input: (undefined | null | string)[]

    Returns StringSeries

  • Create a new cudf.Float64Series

    example
    import {Series} from '@rapidsai/cudf';

    // Float64Series [1, 2, 3, null, 4]
    const a = Series.new([1, 2, 3, undefined, 4]);

    Parameters

    • input: Float64Buffer | Float64Array | (undefined | null | number)[]

    Returns Float64Series

  • Create a new cudf.Int64Series

    example
    import {Series} from '@rapidsai/cudf';

    // Int64Series [1n, 2n, 3n, null, 4n]
    const a = Series.new([1n, 2n, 3n, undefined, 4n]);

    Parameters

    • input: Int64Buffer | BigInt64Array | (undefined | null | bigint)[]

    Returns Int64Series

  • Create a new cudf.Bool8Series

    example
    import {Series} from '@rapidsai/cudf';

    // Bool8Series [true, false, null, false]
    const a = Series.new([true, false, undefined, false]);

    Parameters

    • input: (undefined | null | boolean)[]

    Returns Bool8Series

  • Create a new cudf.TimestampMillisecondSeries

    example
    import {Series} from '@rapidsai/cudf';

    // TimestampMillisecondSeries [2021-05-13T00:00:00.000Z, null, 2021-05-13T00:00:00.000Z, null]
    const a = Series.new([new Date(), undefined, new Date(), undefined]);

    Parameters

    • input: (undefined | null | Date)[]

    Returns TimestampMillisecondSeries

  • Create a new cudf.ListSeries that contains cudf.StringSeries elements.

    example
    import {Series} from '@rapidsai/cudf';

    // ListSeries [["foo", "bar"], ["test", null]]
    const a = Series.new([["foo", "bar"], ["test",null]]);
    a.getValue(0) // StringSeries ["foo", "bar"]
    a.getValue(1) // StringSeries ["test", null]

    Parameters

    • input: (undefined | null | string)[][]

    Returns ListSeries<Utf8String>

  • Create a new cudf.ListSeries that contains cudf.Float64Series elements.

    example
    import {Series} from '@rapidsai/cudf';

    // ListSeries [[1, 2], [3, null, 4]]
    const a = Series.new([[1, 2], [3, undefined, 4]]);
    a.getValue(0) // Float64Series [1, 2]
    a.getValue(1) // Float64Series [3, null, 4]

    Parameters

    • input: (undefined | null | number)[][]

    Returns ListSeries<Float64>

  • Create a new cudf.ListSeries that contains cudf.Int64Series elements.

    example
    import {Series} from '@rapidsai/cudf';

    // ListSeries [[1n, 2n], [3n, null, 4n]]
    const a = Series.new([[1n, 2n], [3n, undefined, 4n]]);
    a.getValue(0) // Int64Series [1n, 2n]
    a.getValue(1) // Int64Series [3n, null, 4n]

    Parameters

    • input: (undefined | null | bigint)[][]

    Returns ListSeries<Int64>

  • Create a new cudf.ListSeries that contains cudf.Bool8Series elements.

    example
    import {Series} from '@rapidsai/cudf';

    // ListSeries [[true, false], [null, false]]
    const a = Series.new([[true, false], [undefined, false]]);
    a.getValue(0) // Bool8Series [true, false]
    a.getValue(1) // Bool8Series [null, false]

    Parameters

    • input: (undefined | null | boolean)[][]

    Returns ListSeries<Bool8>

  • Create a new cudf.ListSeries that contains cudf.TimestampMillisecondSeries elements.

    example
    import {Series} from '@rapidsai/cudf';

    // ListSeries [[2021-05-13T00:00:00.000Z, null], [null, 2021-05-13T00:00:00.000Z]]
    const a = Series.new([[new Date(), undefined], [undefined, new Date()]]);
    a.getValue(0) // TimestampMillisecondSeries [2021-05-13T00:00:00.000Z, null]
    a.getValue(1) // TimestampMillisecondSeries [null, 2021-05-13T00:00:00.000Z]

    Parameters

    • input: (undefined | null | Date)[][]

    Returns ListSeries<TimestampMillisecond>

  • Type parameters

    • T: readonly unknown[]

    Parameters

    • input: T

    Returns Series<ArrowToCUDFType<JavaScriptArrayDataType<T>>>

  • Type parameters

    Parameters

    • input: AbstractSeries<T> | Column<T> | SeriesProps<T> | Vector<T> | (undefined | null | string)[] | (undefined | null | number)[] | (undefined | null | bigint)[] | (undefined | null | boolean)[] | (undefined | null | Date)[] | (undefined | null | string)[][] | (undefined | null | number)[][] | (undefined | null | bigint)[][] | (undefined | null | boolean)[][] | (undefined | null | Date)[][]

    Returns Series<T>

Static readText

  • Constructs a Series from a text file path.

    note

    If delimiter is omitted, the default is ''.

    example
    import {Series} from '@rapidsai/cudf';

    const infile = Series.readText('./inputAsciiFile.txt')

    Parameters

    • filepath: string

      Path of the input file.

    • delimiter: string

      Optional delimiter.

    Returns StringSeries

    StringSeries from the file, split by delimiter.

Static sequence

  • sequence<U>(opts: { init?: U["scalarType"]; memoryResource?: MemoryResource; size: number; step?: U["scalarType"]; type?: U }): Series<U>
  • Constructs a Series with a sequence of values.

    note

    If init is omitted, the default is 0.

    note

    If step is omitted, the default is 1.

    note

    If type is omitted, the default is Int32.

    example
    import {Series, Int64, Float32} from '@rapidsai/cudf';

    Series.sequence({size: 5}).toArray() // Int32Array[0, 1, 2, 3, 4]
    Series.sequence({size: 5, init: 5}).toArray() // Int32Array[5, 6, 7, 8, 9]
    Series
      .sequence({ size: 5, init: 0, type: new Int64 })
      .toArray() // BigInt64Array[0n, 1n, 2n, 3n, 4n]
    Series
      .sequence({ size: 5, step: 2, init: 1, type: new Float32 })
      .toArray() // Float32Array[1, 3, 5, 7, 9]

    Type parameters

    Parameters

    • opts: { init?: U["scalarType"]; memoryResource?: MemoryResource; size: number; step?: U["scalarType"]; type?: U }

      Options for creating the sequence

      • Optional init?: U["scalarType"]
      • Optional memoryResource?: MemoryResource
      • size: number
      • Optional step?: U["scalarType"]
      • Optional type?: U

    Returns Series<U>

    Series with the sequence
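
The defaults in the notes above (init 0, step 1) amount to computing init + i * step for each row i. A minimal host-side sketch follows. `sequenceHost` is a hypothetical helper that returns a plain number array and ignores the type and memoryResource options.

```typescript
// Host-side model of Series.sequence(): element i is init + i * step.
function sequenceHost(opts: {size: number; init?: number; step?: number}): number[] {
  const {size, init = 0, step = 1} = opts;  // defaults taken from the notes above
  return Array.from({length: size}, (_, i) => init + i * step);
}

console.log(sequenceHost({size: 5}));
console.log(sequenceHost({size: 5, init: 5}));
console.log(sequenceHost({size: 5, step: 2, init: 1}));
```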