dbt is a tool for transforming data inside a warehouse using SQL. The paradigm: you define models as SQL SELECT statements, and dbt compiles them into a DAG, materializes them as tables or views, runs tests, and generates docs. It is a core part of what is called the "modern data stack". It comes as open-source dbt Core and the SaaS product dbt Cloud. Used by Airbnb, Monzo, HelloFresh, and thousands of startup data teams.
Below: details, an example, related terms, FAQ.
-- models/orders_summary.sql
{{ config(materialized='table') }}

SELECT
    DATE_TRUNC('day', order_date) AS day,
    COUNT(*) AS orders,
    SUM(amount) AS revenue
FROM {{ ref('orders') }}
WHERE status = 'completed'
GROUP BY 1

# schema.yml
models:
  - name: orders_summary
    columns:
      - name: day
        tests: [not_null, unique]

dbt Core: free, CLI, self-hosted. dbt Cloud: SaaS with a web IDE, scheduling, docs hosting, and CI/CD, at $100-200 per developer per month. Small teams typically run Core with Airflow; enterprises typically use Cloud.
Related tools and alternatives: SQLMesh (newer, Python-based), Apache Airflow tasks, Dataform (Google). For non-SQL ELT: Fivetran or Airbyte plus Python.
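Incremental models: with materialized='incremental' plus unique_key, dbt does not rebuild the whole table. On each run it processes only the rows the model selects as new or changed (usually filtered inside an is_incremental() block) and upserts them into the existing table, matching on unique_key. On large tables this can cost far less than a full refresh.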
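
A minimal sketch of an incremental model. The model name orders_incremental and the columns order_id and updated_at are assumptions for illustration; they do not come from the example above. The exact SQL dbt generates (MERGE, or DELETE plus INSERT) depends on the warehouse adapter.

```sql
-- models/orders_incremental.sql (hypothetical model name)
{{ config(
    materialized='incremental',
    unique_key='order_id'
) }}

SELECT
    order_id,      -- assumed primary key of the upstream orders model
    order_date,
    status,
    amount,
    updated_at     -- assumed last-modified timestamp
FROM {{ ref('orders') }}

{% if is_incremental() %}
-- Applied only when the target table already exists and the run is not a
-- full refresh: keep rows modified since the latest row already loaded.
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
```

On the first run, or with dbt run --full-refresh, is_incremental() is false, so the filter is skipped and the table is built from scratch. On later runs only the filtered rows are selected; any whose order_id already exists in the target replace the old version, and the rest are inserted.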