CDC (Change Data Capture): a pattern for streaming database changes to downstream systems in near real time. Instead of running a periodic SELECT over the whole table, a CDC tool reads the database's transaction log (Postgres WAL, MySQL binlog). Tools: Debezium (the most popular, built on Kafka Connect), AWS DMS, Maxwell, Airbyte. Use cases: sync DB → search index (Elasticsearch), DB → cache (Redis), DB → data warehouse (Snowflake), event-driven architecture.
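As an illustration of the DB → cache use case, here is a minimal sketch of a consumer applying Debezium-style change events to an in-memory dict that stands in for Redis. It assumes the standard Debezium envelope (`op`, `before`, `after`), a primary key column named `id`, and hand-written sample events; the function name `apply_change` is hypothetical, and a real consumer would read the events from Kafka.

```python
import json
from typing import Optional


def apply_change(cache: dict, raw_value: Optional[str], key_field: str = "id") -> None:
    """Apply one change event (a JSON string) to an in-memory cache."""
    if raw_value is None:
        return  # tombstone record that can follow a delete; nothing to apply
    event = json.loads(raw_value)
    # With schemas enabled in the JSON converter, the envelope sits under "payload".
    payload = event.get("payload", event)
    op = payload["op"]
    if op in ("c", "u", "r"):  # create, update, snapshot read
        row = payload["after"]
        cache[row[key_field]] = row
    elif op == "d":  # delete: only "before" carries the row (at least its key)
        row = payload["before"]
        cache.pop(row[key_field], None)


# Hand-written sample events (assumed shapes, not captured from a real connector).
events = [
    json.dumps({"op": "c", "before": None, "after": {"id": 1, "name": "ada"}}),
    json.dumps({"op": "c", "before": None, "after": {"id": 2, "name": "bob"}}),
    json.dumps({"op": "u", "before": {"id": 1, "name": "ada"},
                "after": {"id": 1, "name": "Ada L."}}),
    json.dumps({"op": "d", "before": {"id": 2, "name": "bob"}, "after": None}),
    None,  # tombstone
]

cache: dict = {}
for value in events:
    apply_change(cache, value)
```

After the loop, the cache holds only row 1 with its updated name; the delete of row 2 is applied because the log-based event carries it explicitly.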
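# Debezium Kafka Connect config for Postgres
# (port, user, and password values are placeholders)
{
  "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
  "plugin.name": "pgoutput",
  "database.hostname": "pg.internal",
  "database.port": "5432",
  "database.user": "<user>",
  "database.password": "<password>",
  "database.dbname": "mydb",
  "slot.name": "debezium_slot",
  "publication.name": "debezium_pub",
  "topic.prefix": "mydb"
}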
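# Output: Kafka topics mydb.public.users, mydb.public.orders, ...

CDC vs. event sourcing (ES): CDC captures changes from an existing database and is transparent to the applications writing to it. In ES the database itself is an event log, and the application writes events explicitly. The two are complementary, not synonyms.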
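To make the event sourcing side of that contrast concrete, here is a minimal sketch in which the application appends events to a log and derives current state by replaying them. The account domain, the event names, and the `replay` function are all hypothetical, and a Python list stands in for a durable event store.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    kind: str    # hypothetical event names: "deposited", "withdrawn"
    amount: int


def replay(events: Iterable[Event]) -> int:
    """Derive the current balance by folding over the full event log."""
    balance = 0
    for event in events:
        if event.kind == "deposited":
            balance += event.amount
        elif event.kind == "withdrawn":
            balance -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return balance


# The application writes events explicitly; the log is the source of truth.
log: List[Event] = []
log.append(Event("deposited", 100))
log.append(Event("withdrawn", 30))

balance = replay(log)
```

With CDC the application would instead update a balance column as usual, and the capture tool would derive the change events from the transaction log without the application being aware of it.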
Debezium is production-ready: it is backed by Red Hat, and Netflix, WePay, and Shopify are cited as running it in production. The main gotcha is schema changes, which require careful handling.
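Polling with periodic SELECT queries is simpler to set up, but it misses deletes, puts a high load on the database, and adds latency. Debezium is log-based, so streaming changes requires no polling SELECT against the source tables.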
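The missed-delete problem can be shown with a small sketch of query-based capture. It uses an in-memory SQLite database as a stand-in for the source, and it assumes a hypothetical `users` table with an integer `updated_at` column used as the polling watermark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)", [(1, "ada", 1), (2, "bob", 1)]
)


def poll(connection: sqlite3.Connection, since: int) -> list:
    """Query-based capture: return rows modified after the `since` watermark."""
    return connection.execute(
        "SELECT id, name, updated_at FROM users WHERE updated_at > ? ORDER BY id",
        (since,),
    ).fetchall()


first = poll(conn, 0)                    # initial poll sees both rows
watermark = max(row[2] for row in first)

# Between polls: one row is updated, another is deleted.
conn.execute("UPDATE users SET name = 'Ada L.', updated_at = 2 WHERE id = 1")
conn.execute("DELETE FROM users WHERE id = 2")

second = poll(conn, watermark)           # sees the update, not the delete
```

The second poll returns only the updated row 1. Nothing in the result indicates that row 2 was deleted, so a downstream copy would keep it indefinitely, whereas a log-based tool would receive an explicit delete event.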