v0.0.20

The Data Language That Compiles to Confidence

Build fast. Ship confidently. Compile-time data contracts, automatic lineage, strong types, and column safety — because finding bugs in production isn't a feature.


What Makes Keel Different

Data Contracts at Compile Time

CSV, Parquet, and JSON schemas validated when you compile, not when you run. Data contracts enforced by the type system, not by documentation.

Compile-Time Column Safety

Reference a column that doesn't exist? The compiler tells you before deployment. Schema drift caught immediately with helpful 'Did you mean?' suggestions.

Automatic Data Lineage

Every transformation tracked automatically. Full data provenance from source to result without writing extra code or using external tools.

Strong Typing

Keel's type system extends to your data. Column types, nullability, and schema shapes are tracked statically — mismatches are compiler errors, not runtime surprises.

Functional Style

Build pipelines from composable, pure transformations. Chain operations that are easy to reason about, test in isolation, and refactor without fear.

Speed

Compiled to bytecode by a register-based compiler. Faster than Python without the complexity of native toolchains.

View benchmarks →

Code Examples

-- Type-safe column operations validated at compile time
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
import Result

let sales =
    case DataFrame.readParquet "public/data/sales.parquet" of
        Ok df -> df
        Err _ -> DataFrame.fromRecords []

let taxExpr = col @amount * 0.08

let totalExpr = col @amount + col @tax

-- Expr-based column additions — compiler validates column names and types
sales
    |> DataFrame.applyExprs [(@tax, taxExpr)]
    |> Result.withDefault (DataFrame.fromRecords [])
    |> DataFrame.applyExprs [(@total, totalExpr)]
    |> Result.withDefault (DataFrame.fromRecords [])
    |> DataFrame.select
        [ @transaction_id
        , @amount
        , @tax
        , @total
        ]
    |> Result.withDefault (DataFrame.fromRecords [])

-- Schema drift caught at compile time, not in production

Who Uses Keel

Built for teams that need data contracts, lineage, and compliance.

By Role

Data Engineers

Build data pipelines with data contracts enforced at compile time. Schema changes detected immediately, not in production.

Data Governance Teams

Enforce data quality with compile-time validation. Data contracts prevent schema drift and ensure downstream reliability.

Data Scientists

Spend less time debugging schema mismatches and more time on analysis. Strong types and compile-time column safety catch mistakes before they reach your results.

Academics & Researchers

Automatic lineage gives you built-in provenance for every transformation. Reproducible pipelines that document themselves — no extra tooling required.

By Industry

Financial Services

SOC 2 and regulatory compliance with automatic audit trails. Full data lineage for transaction processing and risk calculations.

Healthcare & Life Sciences

HIPAA and GDPR compliance with provenance tracking. Every transformation documented for regulatory audits.

Insurance

Actuarial models and claims pipelines where data errors are costly. Compile-time validation and full lineage for regulatory and internal audit requirements.

Research Institutes

Reproducible data pipelines with automatic provenance. Meet open-data and grant compliance requirements without bolting on extra tooling.

Data Contracts: Compile Time vs Runtime

Keel validates data contracts at compile time, catching issues before deployment rather than in production.

| Feature | Keel | Python/Pandas | dbt | Great Expectations |
|---|---|---|---|---|
| Data Contracts | Compile-time (enforced by compiler) | Runtime (manual checks) | Test-time (separate layer) | Runtime (validation library) |
| Schema Validation | Automatic at compile | Manual (try/except) | YAML definitions + tests | Python schemas + validation |
| Column Safety | Type-checked before run | KeyError at runtime | SQL compilation | Runtime assertions |
| Data Lineage | Built-in automatic | Manual tracking | Metadata in dbt docs | Separate expectation history |
| When Errors Found | Before deployment | In production | CI/CD pipeline | In production |
| Performance | Optimized execution | Interpreted overhead | SQL engine dependent | Python runtime |
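To make the Python/Pandas column of the comparison concrete, here is a minimal sketch of the runtime-checking style it describes. It uses only the Python standard library, with plain dict rows rather than a pandas DataFrame; the sample data and the helper names (`load_rows`, `check_contract`, `add_tax`) are hypothetical and exist only for illustration. The point is that both the manual contract check and the `KeyError` from a misspelled column surface only when the code runs.

```python
import csv
import io

# Hypothetical sample data standing in for a sales file.
SALES_CSV = "transaction_id,amount\n1,100.0\n2,250.0\n"

# The "contract" lives in ordinary runtime data, not in the type system.
REQUIRED_COLUMNS = {"transaction_id", "amount"}


def load_rows(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def check_contract(rows, required):
    """Manual runtime check: return the required column names that are missing."""
    present = set(rows[0].keys()) if rows else set()
    return sorted(required - present)


def add_tax(rows, column="amount", rate=0.08):
    """Add a tax field; a misspelled column raises KeyError only when this runs."""
    return [{**row, "tax": round(float(row[column]) * rate, 2)} for row in rows]


rows = load_rows(SALES_CSV)
missing = check_contract(rows, REQUIRED_COLUMNS)
taxed = add_tax(rows)

try:
    add_tax(rows, column="amout")  # deliberate typo for "amount"
    typo_error = None
except KeyError as exc:
    typo_error = exc.args[0]
```

Nothing in this script flags the typo until `add_tax` is actually called with it, which is the failure mode the "KeyError at runtime" cell refers to.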

Installation

bash
# Download the latest release for your platform:
# https://codeberg.org/Keel/keel-cli/releases
#
# Linux (x86_64):
curl -L https://codeberg.org/Keel/keel-cli/releases/download/v0.0.20/keel-v0.0.20-linux-x86_64.tar.gz | tar xz
sudo mv keel /usr/local/bin/

# Windows (x86_64): download keel-v0.0.20-windows-x86_64.zip

Prerequisites for other install methods: Rust 1.85+ and Cargo (to build from source), or Nix with flakes enabled (to install via Nix).

Roadmap

What's shipped, what's next, and what's on the horizon.

Core Language

Completed

Lexer, parser, type inference (HM-style), pattern matching, opaque newtypes, REPL, LSP, tree-sitter grammar

DataFrame & Matrix Library

Completed

Polars-backed columnar data with compile-time schema validation, window functions, and a first-class Matrix type

Time-Machine Debugger

Completed

Step forward and backward through execution with full environment snapshots via keel trace

Execution Kernel & Server

Completed

Stateless HTTP/WebSocket execution service for the playground and remote runs

Audit Trail

Planned

Machine-readable run manifests recording inputs, outputs, SHA-256 hashes, and status — for regulated workflows (FDA, SOX, SR 11-7)

Type Classes

Planned

User-definable polymorphic interfaces enabling generic programming beyond built-in constraints

DuckDB Connector

Planned

Load DataFrames from DuckDB files and execute SQL queries with compile-time schema inference

Parallel Execution

Planned

Automatic parallel scheduling of independent let-bindings via compile-time wave graph and Rayon

WASM Extension System

Planned

Third-party Rust crates compiled to WASM and loaded as first-class keel modules with full type checking and LSP support

Python & R Interop

Planned

Call Python and R libraries from keel with automatic type bridging — use pandas, scikit-learn, ggplot2, and other ecosystem packages directly

GPU Acceleration

Planned

Transparent GPU dispatch for DataFrame and Matrix operations via wgpu, with CPU fallback
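As a rough illustration of what the planned Audit Trail item describes (a machine-readable run manifest recording inputs, outputs, SHA-256 hashes, and status), the following Python sketch builds such a manifest with the standard library. The field names, the `build_manifest` helper, and the sample file names are assumptions made for this example; the roadmap does not specify Keel's actual manifest format.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(inputs, outputs, status):
    """Build a manifest dict from name -> bytes mappings (hypothetical layout)."""
    return {
        "status": status,
        "inputs": {name: sha256_hex(blob) for name, blob in inputs.items()},
        "outputs": {name: sha256_hex(blob) for name, blob in outputs.items()},
    }


# Hypothetical run: one input file and one output file, held in memory.
manifest = build_manifest(
    inputs={"sales.csv": b"transaction_id,amount\n1,100.0\n"},
    outputs={"report.csv": b"transaction_id,amount,tax\n1,100.0,8.0\n"},
    status="success",
)

# sort_keys makes the serialized form stable across runs.
manifest_json = json.dumps(manifest, indent=2, sort_keys=True)
```

Hashing the file contents rather than the file names means a later audit can confirm that a given output was produced from exactly the recorded inputs.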

Frequently Asked Questions