
Apache Iceberg

Key idea:

Apache Iceberg — open table format for huge analytic tables. Adds ACID transactions, schema evolution, time travel, and flexible partitioning to Parquet/ORC files on S3. Started at Netflix (2018), now ASF top-level project. 2024 adoption: Snowflake Iceberg tables, BigQuery, Databricks, AWS S3 Tables native support. Competitor to Delta Lake (Databricks).

Below: details, example, FAQ.


Details

  • Metadata layer: tracks data files + partitions + statistics
  • ACID: snapshot isolation, write-audit-publish pattern
  • Schema evolution: add/drop columns without rewriting data
  • Time travel: query as of specific snapshot / timestamp
  • Hidden partitioning: partition by a transform (e.g. month(ts)); queries filter on the source column, not on partition values
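The metadata layer is itself queryable. A sketch of inspecting snapshots and per-file statistics via Iceberg's metadata tables in Spark SQL, assuming the prod.db.sales table defined in the example below:

-- List snapshots (each commit creates one; used for time travel)
SELECT snapshot_id, committed_at, operation
FROM prod.db.sales.snapshots;

-- Per-file statistics tracked by the metadata layer
SELECT file_path, record_count, file_size_in_bytes
FROM prod.db.sales.files;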

Example

-- Spark + Iceberg
CREATE TABLE prod.db.sales (
  id bigint,
  date date,
  amount decimal(18,2)
) USING iceberg
PARTITIONED BY (month(date));

-- Time travel
SELECT * FROM prod.db.sales
TIMESTAMP AS OF '2026-03-01 00:00:00';

-- Schema evolution
ALTER TABLE prod.db.sales ADD COLUMN region string;
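Hidden partitioning also lets the partition spec evolve without rewriting existing data. A sketch, assuming the Iceberg Spark SQL extensions are enabled (existing files keep the old spec; new writes use the new one):

-- Partition evolution: switch from monthly to daily granularity
ALTER TABLE prod.db.sales DROP PARTITION FIELD month(date);
ALTER TABLE prod.db.sales ADD PARTITION FIELD day(date);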


Frequently Asked Questions

Iceberg vs Delta Lake?

Iceberg: open governance (ASF), multi-engine (Spark, Trino, Flink, Snowflake). Delta Lake: Databricks-led, Spark-first. Converging since 2025 (Delta UniForm exposes Delta tables in Iceberg format).

Query engines?

Apache Spark, Trino, Dremio, Snowflake, Starburst, Presto, DuckDB, AWS Athena, Google BigQuery. Nearly all major analytic engines support it as of 2025.
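For ad-hoc local reads, DuckDB's iceberg extension can scan a table directly from its files, no cluster required. A sketch, with an illustrative warehouse path:

-- DuckDB: read an Iceberg table from a local or object-store path
INSTALL iceberg;
LOAD iceberg;
SELECT count(*) FROM iceberg_scan('warehouse/prod/db/sales');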

Production reliable?

Yes. Netflix has run it at petabyte scale since 2019, and Apple, Expedia, Pinterest, and Adobe use it in production. ACID guarantees and schema evolution are battle-tested.