Polars support#
Legate-dataframe has support for integrating with polars. To enable this support it is currently necessary to include:
import legate_dataframe.ldf_polars # noqa: F401
This will enable the ability call:
polars_lazy_frame.legate.collect()
A minimal example may be:
q = pl.scan_csv("mydata.csv")
q = q.with_columns(...) # work with q
legate_result = q.legate.collect()
which executes the polars query to return a legate-dataframe ~legate_dataframe.lib.core.table.LogicalTable.
As of now, we do not hook into polars’ collect
, nor does
legate-dataframe’s collect
currently fall back to polars when an
operation is unsupported.
The solution will either fully execute in a distributed manner within
legate-dataframe or an error will be raised.
If you wish to convert a LogicalTable
to a polars.Dataframe
you may do so via polars.from_arrow(legate_table.to_arrow())
.
A LogicalTable
may also be converted to a polars LazyFrame
via
LogicalTable.lazy()
, however, such a LazyFrame
can only be
collected via .legate.collect()
.
Note
The exact integration path may change in the future to use
the polars engine more like cudf-polars
.
The current approach exists mainly to allow the return of a
LogicalTable
object.