CDC (Change Data Capture) is a pattern for streaming database changes in real time to downstream systems. Instead of periodically running a SELECT over the whole table, a CDC tool reads the database's transaction log (Postgres WAL, MySQL binlog). Tools: Debezium (the most popular, built on Kafka Connect), AWS DMS, Maxwell, Airbyte. Use cases: syncing DB → search index (Elasticsearch), DB → cache (Redis), DB → data warehouse (Snowflake), and event-driven architecture.
Below: details, an example, related terms, FAQ.
# Debezium Kafka Connect config for Postgres
{
  "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
  "database.hostname": "pg.internal",
  "database.dbname": "mydb",
  "slot.name": "debezium_slot",
  "publication.name": "debezium_pub",
  "topic.prefix": "mydb"
}
# Output: Kafka topics mydb.public.users, mydb.public.orders, ...
CDC vs. Event Sourcing (ES): CDC captures changes from an existing DB and is transparent to applications. With ES, the DB itself is the event log, because the app writes events. The two are complementary, not synonyms.
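To illustrate the DB → cache use case, here is a minimal sketch of a consumer applying change events from a topic such as mydb.public.users to a downstream store. It assumes the Debezium event envelope with `op`, `before`, and `after` fields. The function name `apply_change_event`, the `id` key column, and the sample rows are hypothetical, and a plain dict stands in for Redis. Reading from Kafka is left out, so the events are hard-coded strings.

```python
import json
from typing import Optional


def apply_change_event(cache: dict, raw_event: Optional[str], key_field: str = "id") -> None:
    """Apply one Debezium-style change event to an in-memory cache (stand-in for Redis)."""
    if raw_event is None:
        # Tombstone record (null value) that can follow a delete: nothing to apply.
        return
    event = json.loads(raw_event)
    # With schemas enabled in the JSON converter, the envelope sits under "payload".
    payload = event.get("payload", event)
    op = payload["op"]  # "c" = create, "u" = update, "d" = delete, "r" = snapshot read
    if op in ("c", "u", "r"):
        row = payload["after"]
        cache[row[key_field]] = row
    elif op == "d":
        # For deletes, "after" is null and the old row (at least its key) is in "before".
        row = payload["before"]
        cache.pop(row[key_field], None)


# Hypothetical sample events, in the order they would arrive on the topic.
events = [
    '{"op": "c", "before": null, "after": {"id": 1, "name": "Ann"}}',
    '{"op": "u", "before": {"id": 1, "name": "Ann"}, "after": {"id": 1, "name": "Anna"}}',
    '{"op": "c", "before": null, "after": {"id": 2, "name": "Bob"}}',
    '{"op": "d", "before": {"id": 2, "name": "Bob"}, "after": null}',
]

cache = {}
for raw in events:
    apply_change_event(cache, raw)

print(cache)  # {1: {'id': 1, 'name': 'Anna'}}
```

Because the delete arrives as an explicit event, the cache drops row 2 without ever querying the source table.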
Debezium is production-ready: it is backed by Red Hat, and Netflix, WePay, and Shopify run it in production. The main gotcha is that schema changes require careful handling.
Query-based (polling) CDC is simpler to set up, but it misses deletes, puts high load on the DB, and adds latency. Debezium is log-based, so it does not poll the source with SELECT queries.
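The missed-deletes problem can be shown with a small sketch of query-based polling. It assumes a hypothetical `users` table with an `updated_at` watermark column, a hypothetical `poll_changes` helper, and SQLite (from the Python standard library) standing in for the source database. Real polling setups differ in detail, but the blind spot is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)")


def poll_changes(conn, since):
    """Query-based CDC: return rows modified after the `since` watermark."""
    return conn.execute(
        "SELECT id, name, updated_at FROM users WHERE updated_at > ? "
        "ORDER BY updated_at, id",
        (since,),
    ).fetchall()


# Tick 1: two inserts, both picked up by the first poll.
conn.execute("INSERT INTO users VALUES (1, 'Ann', 1)")
conn.execute("INSERT INTO users VALUES (2, 'Bob', 1)")
first_poll = poll_changes(conn, since=0)
watermark = max(row[2] for row in first_poll)

# Tick 2: one update and one delete happen on the source.
conn.execute("UPDATE users SET name = 'Anna', updated_at = 2 WHERE id = 1")
conn.execute("DELETE FROM users WHERE id = 2")
second_poll = poll_changes(conn, since=watermark)

print(first_poll)   # [(1, 'Ann', 1), (2, 'Bob', 1)]
print(second_poll)  # [(1, 'Anna', 2)]
```

The second poll reports the update but nothing about row 2: a deleted row no longer exists, so no SELECT can return it. A log-based reader would see the delete as its own entry in the transaction log.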