dbt (data build tool): a tool for transforming data inside a warehouse with SQL. Paradigm: define models as SQL SELECT statements; dbt infers the dependency DAG from ref() calls, compiles the SQL, materialises models as tables or views, runs tests, and generates docs. A core component of the "modern data stack". Available as open-source dbt Core (dbt-core) and the SaaS product dbt Cloud. Used by Airbnb, Monzo, HelloFresh, and thousands of startup data teams.
Below: details, example, related terms, FAQ.
-- models/orders_summary.sql
{{ config(materialized='table') }}

SELECT
    DATE_TRUNC('day', order_date) AS day,
    COUNT(*) AS orders,
    SUM(amount) AS revenue
FROM {{ ref('orders') }}
WHERE status = 'completed'
GROUP BY 1
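
The ref('orders') call above assumes another model named orders exists in the project; that reference is what gives dbt the edge orders -> orders_summary in the DAG. A minimal sketch of such an upstream model follows. The source name shop, the table raw_orders, and the raw column names id and created_at are hypothetical, and the source would also need to be declared in a YAML file:

```sql
-- models/orders.sql (hypothetical upstream model; all names are illustrative)
{{ config(materialized='view') }}

SELECT
    id AS order_id,
    created_at AS order_date,   -- renamed to the column orders_summary expects
    status,
    amount
-- source() points at a raw table loaded outside dbt; 'shop' and 'raw_orders'
-- are assumed names that must match a sources: entry in the project's YAML
FROM {{ source('shop', 'raw_orders') }}
```

With both files in place, dbt builds orders before orders_summary without any explicit ordering configuration.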
# models/schema.yml
version: 2
models:
  - name: orders_summary
    columns:
      - name: day
        tests: [not_null, unique]

dbt Core: free, CLI, self-hosted. dbt Cloud: SaaS with a web IDE, scheduling, docs hosting, and CI/CD, $100-200/dev/mo. For small teams: Core + Airflow for scheduling; for enterprise: Cloud.
Related tools and alternatives: SQLMesh (newer, Python-based), Apache Airflow tasks, Dataform (Google). For non-SQL ELT: Fivetran or Airbyte for extract and load, plus Python for the transformations.
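Incremental models: materialized="incremental" + unique_key. dbt does not detect changed rows by itself: on runs after the first, a filter in the model (wrapped in is_incremental()) selects only new or changed rows, and dbt upserts them into the existing table on unique_key (MERGE or DELETE + INSERT, depending on the adapter) instead of rebuilding it. On large tables this saves substantial cost and run time versus a full refresh.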
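
A minimal sketch of an incremental version of the orders_summary model, assuming day is a suitable unique_key and that reprocessing from the latest loaded day onward is enough to pick up new orders. The file name is hypothetical, and the filter is one common pattern rather than the only one:

```sql
-- models/orders_summary_incremental.sql (hypothetical file name)
{{ config(
    materialized='incremental',
    unique_key='day'
) }}

SELECT
    DATE_TRUNC('day', order_date) AS day,
    COUNT(*) AS orders,
    SUM(amount) AS revenue
FROM {{ ref('orders') }}
WHERE status = 'completed'
{% if is_incremental() %}
  -- Incremental runs only: recompute the latest loaded day and anything newer.
  -- {{ this }} resolves to the table this model has already built.
  AND order_date >= (SELECT MAX(day) FROM {{ this }})
{% endif %}
GROUP BY 1
```

On the first run, or with dbt run --full-refresh, is_incremental() is false and the whole table is built. On later runs only the filtered days are aggregated, and rows whose day already exists are replaced rather than duplicated. Orders that arrive late for an earlier day would be missed by this filter; a wider lookback window is the usual remedy.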