Sometimes upgrading the Python Polars library breaks existing code. The following are some codebase migrations I've used.
- Find+Replace: `pl.Utf8` -> `pl.String` (sketch below).
- Find+Replace `how='outer_coalesce'` on joins (sketch below).
  - Find: `how\s*=\s*['"]outer_coalesce['"]\s*,`
  - Replace: `how='full', coalesce=True,`
- Locate `.replace(..., return_dtype=something)` (sketch below): `rg -U -t py '\.replace\([^)]+return_dtype'`
- Non-breaking change with `infer_schema_length` (from `0` to `False` to read all as String):
  - Find: `infer_schema_length\s*=\s*0`
  - Replace: `infer_schema_length=False`
- Find `pl.read_excel` and pin the `engine='xlsx2csv'` arg (or migrate manually).
- Find and Replace all `.frame_equal(` to `.equals(`.
- Note that `df.drop(['column_that_does_not_exist'])` now raises an exception (sketch below).
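Quick before/after sketch for the dtype rename above, on a throwaway frame with made-up column names; the `infer_schema_length` swap is noted in the comments since it's the same kind of mechanical change:

```python
import polars as pl

df = pl.DataFrame({"id": ["a", "b"], "n": [1, 2]})

# Old spelling: df.select(pl.col(pl.Utf8))
# New spelling (pl.Utf8 still works as an alias, but the rename keeps the
# codebase on one name):
strings = df.select(pl.col(pl.String))
print(strings.columns)  # ['id']

# The infer_schema_length item is the same kind of swap, and since
# 0 == False in Python it really is non-breaking:
#   old: pl.read_csv("data.csv", infer_schema_length=0)
#   new: pl.read_csv("data.csv", infer_schema_length=False)
```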
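Before/after sketch of the join change, with throwaway frames:

```python
import polars as pl

left = pl.DataFrame({"key": [1, 2, 3], "l": ["a", "b", "c"]})
right = pl.DataFrame({"key": [2, 3, 4], "r": ["x", "y", "z"]})

# Old: how='outer_coalesce' produced a full join with a single, coalesced
# key column.
# joined = left.join(right, on="key", how="outer_coalesce")

# New: ask for a 'full' join and coalesce the key explicitly.
joined = left.join(right, on="key", how="full", coalesce=True)
print(joined.sort("key"))
```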
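The rg pattern above only locates the call sites. A sketch of the manual migration that usually follows, assuming the call relied on `default`/`return_dtype` (those moved to `replace_strict` in the 1.x line, as far as I can tell):

```python
import polars as pl

df = pl.DataFrame({"code": [1, 2, 3]})

# Old: .replace() accepted default= and return_dtype= directly.
# out = df.with_columns(
#     label=pl.col("code").replace({1: "a", 2: "b"}, default="?", return_dtype=pl.String)
# )

# New: .replace() only swaps values and keeps the dtype; reach for
# .replace_strict() when you want a default and an explicit return dtype.
out = df.with_columns(
    label=pl.col("code").replace_strict(
        {1: "a", 2: "b"}, default="?", return_dtype=pl.String
    )
)
print(out)
```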
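Sketch of the `.equals()` rename and the stricter `drop` behaviour; the `strict=False` escape hatch exists in the 1.x versions I've used, but treat that as an assumption for yours:

```python
import polars as pl

df = pl.DataFrame({"a": [1, 2]})

# .frame_equal() is gone; .equals() is the replacement.
assert df.equals(df.clone())

# Dropping a column that doesn't exist now raises.
# df.drop(["column_that_does_not_exist"])  # raises

# Either drop only columns you know exist, or opt out of the check:
df2 = df.drop(["column_that_does_not_exist"], strict=False)
print(df2.columns)  # ['a'] -- nothing was dropped
```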
Not sure which version each of these broke in or became required, but:
- `\w+\.fill_null\((pl\.lit\()?0` -> `df.with_columns(pl.selectors.numeric().fill_null(0))` (sketch below).
  - Note: requires regex rework.
- Find and replace `join_nulls=` to `nulls_equal=` (sketch below).
- Find and replace `.collect(streaming=True` to `.collect(engine="streaming"`.
- Selectively find and replace `.str.concat()` to `.str.join(delimiter="|")` (sketch below).
- Fix `lf.melt` (renamed to `lf.unpivot`).
- Find and replace `.str.concat(` to `.str.join(`.
- Find `\.is_in.+(unique|[a-zA-Z]+\[)`. Add `.implode()` to the argument of `.is_in(arg)` if `arg` is a `pl.Series` or `pl.Expr` (sketch below).
- Any time `.map_elements` is used inside a `group_by`'s `.agg(...)`, you must now implode (see the upstream issue; sketch below). Ripgrep command to very-generously find potential instances:
  `rg --multiline --multiline-dotall --glob '*.py' '\.group_by\s*\([^)]*\)[\s\S]*?\.agg\s*\([\s\S]{0,200}\.map_elements\s*\(' . -l`
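Sketch of what the `fill_null` rewrite above is aiming for, assuming the intent is to fill nulls only in the numeric columns (that's how I read the replacement pattern; the frame is made up):

```python
import polars as pl
import polars.selectors as cs  # same module as pl.selectors in the pattern above

df = pl.DataFrame({"n": [1, None, 3], "s": ["a", None, "c"]})

# Old habit: df.fill_null(0) across the whole frame.
# Targeted version: only numeric columns get the 0, string nulls are left alone.
out = df.with_columns(cs.numeric().fill_null(0))
print(out)
```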
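Before/after sketch for the `nulls_equal` and streaming-engine renames, assuming a recent 1.x release where both have landed:

```python
import polars as pl

left = pl.LazyFrame({"key": [1, None], "l": ["a", "b"]})
right = pl.LazyFrame({"key": [1, None], "r": ["x", "y"]})

# Old: .join(..., join_nulls=True) and .collect(streaming=True)
# New: the join parameter is nulls_equal, and streaming is picked via engine=.
out = left.join(right, on="key", how="inner", nulls_equal=True).collect(
    engine="streaming"
)
print(out)
```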
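Sketch of the string-join rename and the `melt` fix on a throwaway frame; as far as I can tell `melt` became `unpivot`, with `index`/`on` taking over from `id_vars`/`value_vars`:

```python
import polars as pl

df = pl.DataFrame({"g": ["a", "a", "b"], "x": [1, 2, 3], "y": [4, 5, 6]})

# Old: pl.col("g").str.concat()  and  df.melt(id_vars="g")
# New:
joined = df.select(pl.col("g").str.join(delimiter="|"))
long = df.unpivot(index="g", on=["x", "y"])
print(joined)
print(long)
```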
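Sketch of the `.implode()` fix for `is_in`, assuming a version where the Series/Expr-argument behaviour change has landed (data is made up):

```python
import polars as pl

df = pl.DataFrame({"x": [1, 2, 3, 4]})
allowed = pl.Series([2, 4])

# Old: pl.col("x").is_in(allowed) treated `allowed` as one big membership set.
# New: wrap the Series/Expr argument in .implode() to keep that set semantics.
out = df.with_columns(ok=pl.col("x").is_in(allowed.implode()))
print(out)
```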
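Sketch of the `group_by`/`agg` implode fix as I understand it; the lambda is a stand-in for a Python function that genuinely needs the whole group (prefer native expressions when they exist):

```python
import polars as pl

df = pl.DataFrame({"g": ["a", "a", "b"], "x": [1, 2, 3]})

# Old: inside .agg(), map_elements handed the whole group to the function.
# res = df.group_by("g").agg(pl.col("x").map_elements(lambda s: s.max() - s.min()))

# New: implode the group first so the Python function still receives the
# group's values as one Series.
res = df.group_by("g").agg(
    pl.col("x")
    .implode()
    .map_elements(lambda s: s.max() - s.min(), return_dtype=pl.Int64)
)
print(res.sort("g"))
```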