The Data Language That Compiles to Confidence
Build fast. Ship confidently. Compile-time data contracts, automatic lineage, strong types, and column safety — because finding bugs in production isn't a feature.
What Makes Keel Different
Data Contracts at Compile Time
CSV, Parquet, and JSON schemas validated when you compile, not when you run. Data contracts enforced by the type system, not by documentation.
Compile-Time Column Safety
Reference a column that doesn't exist? The compiler tells you before deployment. Schema drift caught immediately with helpful 'Did you mean?' suggestions.
Automatic Data Lineage
Every transformation tracked automatically. Full data provenance from source to result without writing extra code or using external tools.
Strong Typing
Keel's type system extends to your data. Column types, nullability, and schema shapes are tracked statically — mismatches are compiler errors, not runtime surprises.
Functional Style
Build pipelines from composable, pure transformations. Chain operations that are easy to reason about, test in isolation, and refactor without fear.
Speed
Compiled to bytecode for a register-based VM. Faster than Python without the complexity of native toolchains.
View benchmarks →
Code Examples
-- Type-safe column operations validated at compile time
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
import Result
let sales =
    case DataFrame.readParquet "public/data/sales.parquet" of
        Ok df -> df
        Err _ -> DataFrame.fromRecords []
let taxExpr = col @amount * 0.08
let totalExpr = col @amount + col @tax
-- Expr-based column additions — compiler validates column names and types
sales
    |> DataFrame.applyExprs [(@tax, taxExpr)]
    |> Result.withDefault (DataFrame.fromRecords [])
    |> DataFrame.applyExprs [(@total, totalExpr)]
    |> Result.withDefault (DataFrame.fromRecords [])
    |> DataFrame.select
        [ @transaction_id
        , @amount
        , @tax
        , @total
        ]
    |> Result.withDefault (DataFrame.fromRecords [])
-- Schema drift caught at compile time, not in production
Who Uses Keel
Built for teams that need data contracts, lineage, and compliance.
By Role
Data Engineers
Build data pipelines with data contracts enforced at compile time. Schema changes detected immediately, not in production.
Data Governance Teams
Enforce data quality with compile-time validation. Data contracts prevent schema drift and ensure downstream reliability.
Data Scientists
Spend less time debugging schema mismatches and more time on analysis. Strong types and compile-time column safety catch mistakes before they reach your results.
Academics & Researchers
Automatic lineage gives you built-in provenance for every transformation. Reproducible pipelines that document themselves — no extra tooling required.
By Industry
Financial Services
SOC2 and regulatory compliance with automatic audit trails. Full data lineage for transaction processing and risk calculations.
Healthcare & Life Sciences
HIPAA and GDPR compliance with provenance tracking. Every transformation documented for regulatory audits.
Insurance
Actuarial models and claims pipelines where data errors are costly. Compile-time validation and full lineage for regulatory and internal audit requirements.
Research Institutes
Reproducible data pipelines with automatic provenance. Meet open-data and grant compliance requirements without bolting on extra tooling.
Data Contracts: Compile Time vs Runtime
Keel validates data contracts at compile time, catching issues before deployment rather than in production.
| Feature | Keel | Python/Pandas | dbt | Great Expectations |
|---|---|---|---|---|
| Data Contracts | Compile-time (enforced by compiler) | Runtime (manual checks) | Test-time (separate layer) | Runtime (validation library) |
| Schema Validation | Automatic at compile | Manual (try/except) | YAML definitions + tests | Python schemas + validation |
| Column Safety | Type-checked before run | KeyError at runtime | SQL compilation | Runtime assertions |
| Data Lineage | Built-in automatic | Manual tracking | Metadata in dbt docs | Separate expectation history |
| When Errors Found | Before deployment | In production | CI/CD pipeline | In production |
| Performance | Optimized execution | Interpreted overhead | SQL engine dependent | Python runtime |
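To make the "Runtime (manual checks)" and "KeyError at runtime" cells concrete, here is a minimal Python sketch of the kind of hand-written column check that the table contrasts with compile-time validation. It uses plain dictionaries rather than pandas so it runs without dependencies. The helper `require_columns` and the sample rows are hypothetical names invented for illustration, and the hint built with the standard library's `difflib` only approximates a 'Did you mean?' suggestion; it is not Keel's implementation.

```python
import difflib


def require_columns(rows, required):
    """Raise KeyError at run time if any required column is missing.

    Hypothetical helper: this is the manual check a Python pipeline has to
    perform itself, and it only fires once the pipeline is already running.
    """
    available = sorted(rows[0].keys()) if rows else []
    for name in required:
        if name not in available:
            # Suggest the closest existing column name, if one is similar.
            close = difflib.get_close_matches(name, available, n=1)
            hint = f" Did you mean '{close[0]}'?" if close else ""
            raise KeyError(f"Unknown column '{name}'.{hint}")


# Sample data invented for this sketch.
sales = [
    {"transaction_id": 1, "amount": 100.0},
    {"transaction_id": 2, "amount": 250.0},
]

require_columns(sales, ["transaction_id", "amount"])  # passes silently

try:
    require_columns(sales, ["transaction_id", "amout"])  # typo in column name
except KeyError as err:
    message = err.args[0]
    print(message)
```

The check works, but only when this code path actually executes, which is the gap the "When Errors Found" row describes.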
Installation
# Download the latest release for your platform:
# https://codeberg.org/Keel/keel-cli/releases
#
# Linux (x86_64):
curl -L https://codeberg.org/Keel/keel-cli/releases/download/v0.0.20/keel-v0.0.20-linux-x86_64.tar.gz | tar xz
sudo mv keel /usr/local/bin/
# Windows (x86_64): download keel-v0.0.20-windows-x86_64.zip
Prerequisites: Rust 1.85+ and Cargo (for building from source), or Nix with flakes enabled (for Nix installation).
Roadmap
What's shipped, what's next, and what's on the horizon.
Core Language
Completed: Lexer, parser, type inference (HM-style), pattern matching, opaque newtypes, REPL, LSP, tree-sitter grammar
DataFrame & Matrix Library
Completed: Polars-backed columnar data with compile-time schema validation, window functions, and a first-class Matrix type
Time-Machine Debugger
Completed: Step forward and backward through execution with full environment snapshots via keel trace
Execution Kernel & Server
Completed: Stateless HTTP/WebSocket execution service for the playground and remote runs
Audit Trail
Planned: Machine-readable run manifests recording inputs, outputs, SHA-256 hashes, and status, for regulated workflows (FDA, SOX, SR 11-7)
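Since the Audit Trail item is still planned, no manifest format has been published. As a rough illustration of what a run manifest with inputs, outputs, SHA-256 hashes, and status could contain, here is a small Python sketch. Every field name and the `build_manifest` helper are assumptions made for this example, not Keel's actual output.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(inputs, outputs, status):
    """Build a run manifest from name -> bytes mappings.

    The field names ("inputs", "outputs", "status", "sha256") are
    assumptions made for this sketch, not a published Keel format.
    """
    def describe(files):
        # Sort by name so the manifest is stable across runs.
        return [
            {"name": name, "sha256": sha256_hex(content)}
            for name, content in sorted(files.items())
        ]

    return {
        "inputs": describe(inputs),
        "outputs": describe(outputs),
        "status": status,
    }


# In-memory stand-ins for real files, invented for illustration.
manifest = build_manifest(
    inputs={"sales.parquet": b"raw bytes of the input file"},
    outputs={"report.csv": b"transaction_id,total\n1,108.0\n"},
    status="success",
)

print(json.dumps(manifest, indent=2))
```

Hashing both inputs and outputs lets an auditor confirm later that a stored file is byte-for-byte the one the run read or produced.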
Type Classes
Planned: User-definable polymorphic interfaces enabling generic programming beyond built-in constraints
DuckDB Connector
Planned: Load DataFrames from DuckDB files and execute SQL queries with compile-time schema inference
Parallel Execution
Planned: Automatic parallel scheduling of independent let-bindings via compile-time wave graph and Rayon
WASM Extension System
Planned: Third-party Rust crates compiled to WASM and loaded as first-class Keel modules with full type checking and LSP support
Python & R Interop
Planned: Call Python and R libraries from Keel with automatic type bridging, so pandas, scikit-learn, ggplot2, and other ecosystem packages can be used directly
GPU Acceleration
Planned: Transparent GPU dispatch for DataFrame and Matrix operations via wgpu, with CPU fallback
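The Parallel Execution item in the roadmap mentions a "wave graph" of independent let-bindings. One plausible reading is level-by-level topological scheduling: bindings whose dependencies are already computed form a wave and could run concurrently. The Python sketch below shows only that grouping step, under that assumption. The function `schedule_waves`, the `rates` binding, and the dependency graph are hypothetical, and the planned feature would do this at compile time and dispatch work through Rayon rather than in Python.

```python
def schedule_waves(deps):
    """Group bindings into waves; each wave depends only on earlier waves.

    `deps` maps a binding name to the set of bindings it reads.
    """
    remaining = {name: set(needs) for name, needs in deps.items()}
    done = set()
    waves = []
    while remaining:
        # Every binding whose dependencies are all computed can run together.
        ready = sorted(n for n, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("unresolvable dependency (cycle or unknown binding)")
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return waves


# Hypothetical dependency graph, invented for illustration.
deps = {
    "sales": set(),
    "rates": set(),
    "tax": {"sales", "rates"},
    "total": {"sales", "tax"},
}

waves = schedule_waves(deps)
print(waves)  # [['rates', 'sales'], ['tax'], ['total']]
```

Here `sales` and `rates` share the first wave because neither reads the other, while `tax` and `total` must wait for the waves before them.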
Frequently Asked Questions
Keel is currently in active development (v0.0.20). The core language works well for data pipelines and analysis, but the ecosystem is still growing. It's great for experimentation and projects where type safety matters, but we recommend waiting for 1.0 for mission-critical production workloads.
Keel catches data errors at compile time. The type system validates column types, function signatures, and data flow before your pipeline runs. Pattern matching forces you to handle missing values (Maybe) and errors (Result) explicitly — no silent nulls or unhandled exceptions. DataFrame operations are type-checked against Polars, so schema mismatches are caught early.
Keel catches errors at compile time that Python only finds at runtime. The type system ensures column names, data types, and function signatures are correct before your pipeline runs. The Polars-backed DataFrame library provides native-speed operations. However, Python's ecosystem is vastly larger.
Keel compiles to bytecode on a register-based VM. For data-intensive workloads, the Polars-backed DataFrame library provides native-speed operations with SIMD and parallel execution for query plans. General Keel code runs on the VM with garbage collection optimised for functional patterns.
Beyond standard types (Int, Float, String, Bool, List, Record, Tuple), Keel has first-class Date, Time, DateTime, Duration, and Decimal (28-digit precision) types. The DataFrame library supports Polars-backed tabular data with CSV/JSON/Parquet I/O. Distribution, Table, and ValueLabelSet modules cover statistical workflows.
Check out our Codeberg repository! We welcome contributions of all kinds: code, documentation, examples, and bug reports. See the Contributing Guide in the docs for setup instructions.