DataFrame Expressions
The DataFrame.Expr module provides composable, type-safe column expressions that compile directly to Polars operations, so they benefit from Polars' SIMD-optimized, parallel execution.
Getting Started
Import the Expr module with an alias for concise usage:
import DataFrame
import DataFrame.Expr as Expr
Column References and Literals
Build expressions from column references and literal values:
-- DataFrame.Expr for composable column operations
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { name = "Alice", revenue = 100 }
    , { name = "Bob", revenue = 200 }
    ]
    |> DataFrame.withColumns
        [ Expr.col "revenue"
            |> Expr.mul (Expr.lit 2)
            |> Expr.named "double_revenue"
        ]
    |> DataFrame.columns
- Expr.col "name": references a column by name
- Expr.lit value: creates a constant expression from an Int, Float, or String
- Expr.named "alias" expr: renames the output column
Arithmetic
Combine expressions with arithmetic operators:
-- Arithmetic expressions: column * column
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { price = 10, quantity = 3 }
    , { price = 20, quantity = 2 }
    ]
    |> DataFrame.withColumns
        [ Expr.col "price" |> Expr.mul (Expr.col "quantity") |> Expr.named "total"
        ]
    |> DataFrame.columns
Available: add, sub, mul, div, mod, pow.
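Arithmetic functions can be chained in a single pipeline. The sketch below is illustrative only: it assumes sub uses the same argument order as mul in the example above (the piped expression is the left operand), and the discount column is hypothetical sample data.

```keel
-- Chained arithmetic: price * quantity - discount
-- Assumption: Expr.sub takes the right operand first, like Expr.mul,
-- so `a |> Expr.sub b` means a - b.
import DataFrame
import DataFrame.Expr as Expr

DataFrame.fromRecords
    [ { price = 10, quantity = 3, discount = 5 }
    , { price = 20, quantity = 2, discount = 0 }
    ]
    |> DataFrame.withColumns
        [ Expr.col "price"
            |> Expr.mul (Expr.col "quantity")
            |> Expr.sub (Expr.col "discount")
            |> Expr.named "net_total"
        ]
    |> DataFrame.columns
```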
Comparison and Boolean Logic
-- Comparison expressions: column >= literal
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { name = "Alice", age = 25 }
    , { name = "Bob", age = 15 }
    ]
    |> DataFrame.withColumns
        [ Expr.col "age" |> Expr.gte (Expr.lit 18) |> Expr.named "is_adult"
        ]
    |> DataFrame.columns
Comparison: eq, neq, gt, gte, lt, lte.
Boolean: and, or, not.
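The boolean functions combine the results of comparisons. As a hedged sketch, assuming and uses the same pipe-friendly argument order as the comparison functions (this page does not show its signature), an age-range check might look like this:

```keel
-- Boolean logic: 18 <= age < 65
-- Assumption: Expr.and takes the right-hand expression first, like Expr.gte.
import DataFrame
import DataFrame.Expr as Expr

DataFrame.fromRecords
    [ { name = "Alice", age = 25 }
    , { name = "Bob", age = 15 }
    ]
    |> DataFrame.withColumns
        [ Expr.col "age"
            |> Expr.gte (Expr.lit 18)
            |> Expr.and (Expr.col "age" |> Expr.lt (Expr.lit 65))
            |> Expr.named "is_working_age"
        ]
    |> DataFrame.columns
```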
Conditional Expressions
Use cond for if-then-else logic:
-- Conditional expressions with DataFrame.Expr
import DataFrame
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { name = "Alice", score = 95 }
    , { name = "Bob", score = 72 }
    , { name = "Carol", score = 88 }
    ]
    |> DataFrame.withColumns
        [ Expr.cond
            [ (col "score" |> Expr.gte (lit 90), lit "A") ]
            (lit "B")
            |> Expr.named "grade"
        ]
    |> DataFrame.columns
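Because cond takes a list of (condition, value) pairs, it presumably accepts more than one branch. The following sketch assumes that branches are checked in order and the first match wins, with the final argument as the fallback; this page does not confirm that ordering.

```keel
-- Multi-branch conditional
-- Assumption: branches are evaluated top to bottom and the first match wins.
import DataFrame
import DataFrame.Expr exposing (col, lit)
import DataFrame.Expr as Expr

DataFrame.fromRecords
    [ { name = "Alice", score = 95 }
    , { name = "Bob", score = 72 }
    , { name = "Carol", score = 88 }
    ]
    |> DataFrame.withColumns
        [ Expr.cond
            [ ( col "score" |> Expr.gte (lit 90), lit "A" )
            , ( col "score" |> Expr.gte (lit 80), lit "B" )
            ]
            (lit "C")
            |> Expr.named "grade"
        ]
    |> DataFrame.columns
```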
Aggregations
Reduce columns to summary values:
-- Aggregation with DataFrame.Expr
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { group = "A", value = 10 }
    , { group = "A", value = 20 }
    , { group = "B", value = 30 }
    ]
    |> DataFrame.groupBy ["group"]
    |> DataFrame.aggExprs
        [ Expr.col "value" |> Expr.sum |> Expr.named "total"
        ]
    |> DataFrame.columns
Available: sum, mean, min, max, count, first, last, std, var, median.
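Since aggExprs takes a list, several aggregations can presumably be requested in one call. A minimal sketch, assuming mean and count are applied in a pipeline the same way as sum:

```keel
-- Several aggregations per group
-- Assumption: Expr.mean and Expr.count are piped exactly like Expr.sum.
import DataFrame
import DataFrame.Expr as Expr

DataFrame.fromRecords
    [ { group = "A", value = 10 }
    , { group = "A", value = 20 }
    , { group = "B", value = 30 }
    ]
    |> DataFrame.groupBy ["group"]
    |> DataFrame.aggExprs
        [ Expr.col "value" |> Expr.sum |> Expr.named "total"
        , Expr.col "value" |> Expr.mean |> Expr.named "average"
        , Expr.col "value" |> Expr.count |> Expr.named "n"
        ]
    |> DataFrame.columns
```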
String Operations
Transform string columns:
-- String operations with DataFrame.Expr
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { name = "alice" }
    , { name = "bob" }
    ]
    |> DataFrame.withColumns
        [ Expr.col "name"
            |> Expr.strUpper
            |> Expr.named "upper_name"
        ]
    |> DataFrame.columns
Available: strLength, strUpper, strLower, strTrim, strContains, strStartsWith, strEndsWith, strReplace.
Math Functions
-- Math functions: abs
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { value = -3.7 }
    , { value = 4.2 }
    ]
    |> DataFrame.withColumns
        [ Expr.col "value" |> Expr.abs |> Expr.named "abs_value"
        ]
    |> DataFrame.columns
Available: abs, sqrt, floor, ceil, round.
Null Handling
-- Null handling: fillNull replaces null values with a default
-- (this sample contains no nulls, so score_filled should match score)
import DataFrame
import DataFrame.Expr as Expr
DataFrame.fromRecords
    [ { score = 85 }
    , { score = 90 }
    ]
    |> DataFrame.withColumns
        [ Expr.col "score" |> Expr.fillNull (Expr.lit 0) |> Expr.named "score_filled"
        ]
    |> DataFrame.columns
Window Functions
Apply expressions over partitions (SQL-style window functions):
-- Window functions: per-region total attached to every row
import DataFrame
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { region = "East", revenue = 100 }
    , { region = "East", revenue = 200 }
    , { region = "West", revenue = 150 }
    ]
df
    |> DataFrame.withColumns
        [ Expr.col "revenue" |> Expr.sum |> Expr.over ["region"] |> Expr.named "region_total"
        ]
    |> DataFrame.columns
When to Use Expressions vs Closures
| Use Expressions When | Use Closures When |
|---|---|
| Column arithmetic and comparisons | Complex logic needing full language features |
| Aggregations and window functions | Pattern matching on values |
| String transformations on columns | Calling other Keel functions |
| Performance is critical | Prototyping or one-off transforms |
Expressions compile to native Polars operations and benefit from SIMD vectorization, parallel execution, and query optimization. Use them for performance-critical data pipelines.
Next Steps
See the DataFrame stdlib page for the complete function reference, including how expressions integrate with selectExpr, filterExpr, and other DataFrame operations.