Polars support

Legate-dataframe supports integration with polars. To enable it, you currently need to import the following module:

import legate_dataframe.ldf_polars  # noqa: F401

This enables the following call:

polars_lazy_frame.legate.collect()

A minimal example may be:

import polars as pl
import legate_dataframe.ldf_polars  # noqa: F401  (enables q.legate.collect())
q = pl.scan_csv("mydata.csv")
q = q.with_columns(...)  # work with q
legate_result = q.legate.collect()

which executes the polars query and returns a legate-dataframe legate_dataframe.lib.core.table.LogicalTable.

As of now, we do not hook into polars’ collect, nor does legate-dataframe’s collect fall back to polars when an operation is unsupported. A query will either execute fully in a distributed manner within legate-dataframe, or an error will be raised.
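Because there is no automatic fallback, a caller who wants one has to write it explicitly. The following is only a sketch: collect_with_fallback is a hypothetical helper, not part of legate-dataframe. The exception type raised for unsupported operations is not stated here, so the sketch catches Exception broadly, which is an assumption and can also hide unrelated errors. It also assumes legate_dataframe.ldf_polars has already been imported so that query.legate exists.

```python
import warnings


def collect_with_fallback(query):
    """Try legate-dataframe first, then fall back to plain polars.

    Hypothetical helper. Returns (result, engine), where engine is
    "legate" or "polars". The result type differs between the two:
    a LogicalTable from legate-dataframe, a polars DataFrame otherwise.
    """
    try:
        return query.legate.collect(), "legate"
    except Exception as exc:  # assumption: the exact exception type is unknown
        warnings.warn(f"legate collect failed ({exc!r}); falling back to polars")
        return query.collect(), "polars"
```

Note that the fallback runs on a single node in polars itself, so it is only reasonable when the data fits there. A LazyFrame obtained from LogicalTable.lazy() cannot use this fallback, since it can only be collected via .legate.collect().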

If you wish to convert a LogicalTable to a polars.DataFrame, you may do so via polars.from_arrow(legate_table.to_arrow()). A LogicalTable may also be converted to a polars LazyFrame via LogicalTable.lazy(). However, such a LazyFrame can only be collected via .legate.collect().
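The LazyFrame route can be sketched as a small helper that continues a query from an existing LogicalTable. Here extend_and_collect and its transform argument are hypothetical names used for illustration only. The helper relies solely on the lazy() and .legate.collect() calls described above, and assumes legate_dataframe.ldf_polars has been imported.

```python
def extend_and_collect(legate_table, transform):
    """Apply further polars operations to a LogicalTable.

    `transform` is a hypothetical callable that takes a polars
    LazyFrame and returns a polars LazyFrame.
    """
    lazy = legate_table.lazy()    # LazyFrame backed by the LogicalTable
    lazy = transform(lazy)        # add more lazy operations
    return lazy.legate.collect()  # plain .collect() is not supported here
```

For instance, extend_and_collect(legate_result, lambda lf: lf.select("a")) would return a new LogicalTable, provided a column named "a" exists and the operation is supported by legate-dataframe.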

Note

The exact integration path may change in the future to use the polars engine more like cudf-polars. The current approach exists mainly to allow the return of a LogicalTable object.