AWS · Open Table Format / Data Lakehouse

Apache Iceberg

Built for cloud teams

Recover Apache Iceberg tables to any point in time — without managing backup infrastructure. Iceberg-aware, air-gapped, and API-first from day one.

Get started

Storage tier: SecureVault Standard

Restore type: Full, In-Place (Backtrack), Point-in-Time, Selected Snapshots

Cross-region: Supported

Cross-account: Supported

01 · Why Clumio

Why pick Clumio for Iceberg

Native Iceberg snapshots solve versioning. They don’t solve cyber resilience, multi-year retention, or cross-account recovery. Clumio offers the protection model that the lakehouse leaves to you.

ICEBERG-AWARE

Restores are queryable on arrival

Backups are Iceberg-aware, capturing data files, metadata, and the snapshot chain together while preserving schema specifications end-to-end. Restored tables come back queryable as soon as the catalog updates, with no metadata repair or configuration changes.

INCREMENTAL

Incremental-forever on snapshot lineage

The seed transfers the full snapshot chain; subsequent runs only carry deltas. Tighter RPO allows for increased snapshot churn.

BACKTRACK

In-place restore, no table rewiring

Roll the source Iceberg table back to an earlier snapshot in place. Table identity (catalog name, ARN, IAM, downstream references) stays intact. Operates on the Iceberg manifest chain rather than recreating the table.

FLEXIBLE RECOVERY

Pick the right recovery shape

Restore a full table, one or multiple selected snapshots, or a specific point in time. Preview the schema on every snapshot so you can validate that the restore target matches what you expect. You can also set the table’s head snapshot to one of your choosing.

NO LAKE IMPACT

Years of retention, zero source overhead

Vault retention is independent of the source table’s snapshot history. Drop snapshot history in the lake without losing the ability to restore from a year ago.

MIGRATION PATH

Migrate Glue and S3 Tables in either direction

Restore Glue-managed Iceberg backups into Amazon S3 Tables, or restore S3 Tables backups into a Glue catalog. Clumio helps update metadata to the target convention during restore; the table is queryable rapidly, with minimal to no manual reconciliation needed.

New to Clumio?

Set up your AWS account first

This page assumes a connected AWS account with at least one Iceberg table. If you haven’t done that yet, the Getting Started guide will walk you through sign-up, account connection, and first backup quickly.

Get started

02 · Backup

How to back up Iceberg

Apply one Clumio policy to your Iceberg tables, whether they live in the AWS Glue Data Catalog or in Amazon S3 Tables. The policy defines schedule, retention, tier, and region. (For initial AWS account setup, see the Getting Started guide.)

01

Create a backup policy

A policy defines schedule, retention, target region, and tier. Iceberg policies operate at the table level. Each backup captures the table’s data files, metadata files, and manifest list together, so the recovery point represents a transactionally consistent Iceberg snapshot rather than a smeared crawl.

Protect → Backup policies →

02

Pick the right RPO

The policy schedule sets the cadence (e.g., daily, weekly), with an optional start time. The seed transfers the full snapshot chain; subsequent runs are incremental-forever relative to Iceberg snapshot lineage. Tighter RPO allows for increased snapshot churn.

03

Choose a tier (SecureVault Standard)

Iceberg backups land in SecureVault Standard, an air-gapped tier outside your source account, with fast restore and the full set of restore granularities. The same tier covers Iceberg tables stored in either of two AWS surfaces. Apply one policy to both surfaces; the workload model is identical, only the underlying metadata location differs.

AWS Glue Data Catalog

Iceberg tables registered to a Glue database, with data and manifest files in general-purpose S3. Common shape for teams already on Glue + Athena.

Amazon S3 Tables

Fully managed Iceberg in S3 Tables buckets. AWS handles compaction, snapshot management, and unreferenced-file cleanup; Clumio doesn’t disturb that maintenance.

Protect → Iceberg policies → Backup tier →

04

Choose a region (in-region or out-of-region)

By default, Clumio stores backups in the same region as the source table. You can target a different region for cross-region durability, which adds AWS data transfer cost. The destination is set on the policy.

05

Apply the policy with protection rules

Once the policy is saved, use protection rules to apply it to Iceberg tables. For Glue Data Catalog tables, target individual tables directly. For Amazon S3 Tables, target by AWS tag, account, name, or region. Tables can also be excluded by tag for fine-grained control. The seed backup runs first; subsequent backups are incremental.

Set up → Protection rules →
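As a rough illustration of how targeting by tag, account, name, or region with tag-based exclusions might be evaluated, here is a minimal Python sketch. The `Table` and `Rule` types, their fields, and the prefix-based name match are all hypothetical; they are assumptions made for the example and do not reflect Clumio's API or rule engine.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Table:
    # Hypothetical inventory record for an Iceberg table.
    name: str
    account: str
    region: str
    tags: dict = field(default_factory=dict)


@dataclass
class Rule:
    # Hypothetical protection rule: every condition that is set must match,
    # and any matching exclusion tag removes the table from scope.
    account: Optional[str] = None
    region: Optional[str] = None
    name_prefix: Optional[str] = None
    include_tags: dict = field(default_factory=dict)
    exclude_tags: dict = field(default_factory=dict)

    def matches(self, table: Table) -> bool:
        if any(table.tags.get(k) == v for k, v in self.exclude_tags.items()):
            return False
        if self.account is not None and table.account != self.account:
            return False
        if self.region is not None and table.region != self.region:
            return False
        if self.name_prefix is not None and not table.name.startswith(self.name_prefix):
            return False
        return all(table.tags.get(k) == v for k, v in self.include_tags.items())


# Made-up inventory for illustration.
tables = [
    Table("sales_orders", "111111111111", "us-east-1", {"env": "prod"}),
    Table("sales_scratch", "111111111111", "us-east-1", {"env": "prod", "backup": "skip"}),
    Table("clicks", "111111111111", "eu-west-1", {"env": "dev"}),
]
rule = Rule(region="us-east-1", include_tags={"env": "prod"}, exclude_tags={"backup": "skip"})
protected = [t.name for t in tables if rule.matches(t)]
print(protected)
```

Checking exclusions first means a table tagged for exclusion stays out of scope even when it satisfies every other condition.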

03 · Restore

How to restore Iceberg

An Iceberg restore comes down to three choices: when to recover from, what to recover, and where it lands.

WHEN

Pick the recovery point

An exact timestamp from the retention window, or a discrete snapshot from the calendar.

Point-in-time restore

Pick any timestamp within the retention window. Clumio walks back to the Iceberg snapshot that was current at that instant. Useful for “rewind to right before the bad data” recoveries; no need to know snapshot IDs.

Restore → Table → Point-in-time →
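The idea of resolving a timestamp to "the snapshot that was current at that instant" can be sketched in a few lines of Python. This is only a conceptual model, assuming a snapshot log sorted by commit time; the log contents and the `snapshot_as_of` helper are hypothetical and are not Clumio's implementation.

```python
from bisect import bisect_right
from typing import Optional

# (commit time in epoch milliseconds, snapshot ID), sorted by commit time.
# The values are made up for illustration.
SNAPSHOT_LOG = [
    (1_700_000_000_000, 101),
    (1_700_000_600_000, 102),
    (1_700_001_200_000, 103),
]


def snapshot_as_of(log, timestamp_ms: int) -> Optional[int]:
    """Return the ID of the last snapshot committed at or before timestamp_ms."""
    commit_times = [ts for ts, _ in log]
    idx = bisect_right(commit_times, timestamp_ms)
    if idx == 0:
        return None  # the requested instant predates the first snapshot
    return log[idx - 1][1]


# A timestamp between the second and third commits resolves to the second snapshot.
print(snapshot_as_of(SNAPSHOT_LOG, 1_700_000_900_000))
```

Because the lookup works on commit times, the caller only needs to know when the bad data arrived, not which snapshot ID preceded it.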

Restore from a selected backup

Pick a discrete backup from the table’s calendar and drill into it. Inspect the Iceberg snapshots captured in that backup, plus prior snapshots still retained in air-gapped storage; preview the schema, size, and the operation that produced each one (append, overwrite, delete).

Restore → Table → Calendar →

WHAT

Pick the granularity

From a full table down to a single snapshot, with the same workflow.

Full table

All retained, air-gapped snapshots for the table land together, with data files, metadata, and the full snapshot manifest chain preserved end-to-end. The default and most common shape.

Selected snapshots

Restore a single snapshot or a chosen subset of the lineage. Useful for rebuilding the table at a specific point, replaying forward without older history, or updating the head snapshot of an existing table to roll it forward or back.

WHERE

Pick the destination

Restore back into the source table or land somewhere else entirely.

In-place restore via Backtrack

Roll the source Iceberg table back to an earlier snapshot in the same catalog, same account. Table identity (catalog name, ARN, IAM, downstream references) stays intact. Operates on the Iceberg manifest chain rather than recreating the table.

Restore → Backtrack →
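To see why an in-place rollback leaves table identity untouched, consider this toy Python model, in which a rollback only repoints the table at an earlier snapshot in its existing lineage. The `TableModel` type and `roll_back` function are hypothetical stand-ins, not Iceberg or Clumio types, and the real operation involves far more than a single field update.

```python
from dataclasses import dataclass, field


@dataclass
class TableModel:
    # Hypothetical stand-in for a catalog entry.
    identifier: str                                   # stays fixed across a rollback
    snapshot_ids: list = field(default_factory=list)  # lineage, oldest first
    current_snapshot_id: int = -1


def roll_back(table: TableModel, target_snapshot_id: int) -> None:
    """Repoint the table at an earlier snapshot without recreating it."""
    if target_snapshot_id not in table.snapshot_ids:
        raise ValueError(f"snapshot {target_snapshot_id} is not in the lineage")
    table.current_snapshot_id = target_snapshot_id


table = TableModel("glue.analytics.orders", [101, 102, 103], current_snapshot_id=103)
roll_back(table, 102)
print(table.identifier, table.current_snapshot_id)
```

Since the identifier never changes, anything that refers to the table by name keeps working after the rollback in this model.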

Out-of-place (cross-region or cross-account)

Land the table in a different AWS account, region, Glue database, or S3 Tables bucket. Cross-account is the recovery path after a source-account compromise, since the restore target sits outside the blast radius. The same workflow can double as a migration path between Glue and S3 Tables in either direction.

Restore → Filters & targets →

05 · Common questions

Frequently asked questions

Questions from engineers setting up Iceberg protection or troubleshooting restores.

Are Clumio Iceberg backups incremental, or does every backup capture the full table?

Backups are incremental at the snapshot level. Each backup walks the table’s snapshot lineage, finds the snapshots that landed since the prior backup, and captures their new data and metadata files. Older snapshots already in air-gapped storage are referenced rather than re-uploaded, and the full manifest chain is preserved so every backup remains a complete, restorable point in time.
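The snapshot-level incremental logic described above can be approximated with set arithmetic. The following Python sketch is a simplified model under stated assumptions: the `plan_incremental_backup` helper, its parameters, and the file paths are hypothetical, and it is not a description of how Clumio actually plans a backup.

```python
def plan_incremental_backup(lineage, files_by_snapshot, last_backed_up, vault_files):
    """Return (new_snapshots, files_to_upload) for one backup run.

    lineage: snapshot IDs, oldest first.
    files_by_snapshot: snapshot ID -> set of file paths it references.
    last_backed_up: newest snapshot ID in the prior backup, or None for the seed.
    vault_files: set of file paths already held in backup storage.
    """
    start = 0 if last_backed_up is None else lineage.index(last_backed_up) + 1
    new_snapshots = lineage[start:]
    referenced = set()
    for snap in new_snapshots:
        referenced |= files_by_snapshot[snap]
    # Files already in the vault are referenced, not uploaded again.
    return new_snapshots, referenced - vault_files


lineage = [101, 102, 103]
files_by_snapshot = {
    101: {"data/a.parquet"},
    102: {"data/a.parquet", "data/b.parquet"},
    103: {"data/a.parquet", "data/b.parquet", "data/c.parquet"},
}

# Seed run: only snapshots 101 and 102 exist yet, and the vault is empty.
seed_snaps, seed_files = plan_incremental_backup(lineage[:2], files_by_snapshot, None, set())

# Next run: snapshot 103 has landed; only its new file needs to move.
next_snaps, next_files = plan_incremental_backup(lineage, files_by_snapshot, 102, seed_files)
print(next_snaps, next_files)
```

In this model the second run uploads a single file even though snapshot 103 references three, which is the behavior the answer describes as referencing older content rather than re-uploading it.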

Does Clumio support Iceberg tables in both the AWS Glue Data Catalog and Amazon S3 Tables, and which engines can query a restored table?

Yes, for Iceberg tables specifically (non-Iceberg Glue tables are not in scope). Both surfaces are first-class: the same backup policy applies to either, and tables can be restored across them as a migration path. After restore, the table comes back as a valid Iceberg table at the destination catalog, queryable by Athena, EMR Spark, Trino, Snowflake, and any other Iceberg-compatible engine with minimal or no manual metadata repair.

Which Apache Iceberg spec versions does Clumio support?

Clumio supports Iceberg format v1 and v2 tables today. Position deletes, equality deletes, and the row-level operations introduced in v2 are captured in backups and restored intact. Support for v3 is coming soon.
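If you want to check a table's format version yourself, the Iceberg table metadata file records it in a `format-version` field. The Python sketch below reads that field from a metadata document; the `is_supported` helper and the trimmed sample document are illustrative assumptions, and a real metadata file contains many more fields.

```python
import json

# Versions listed as supported in the answer above.
SUPPORTED_FORMAT_VERSIONS = {1, 2}


def is_supported(metadata_json: str) -> bool:
    """Check the format-version field of an Iceberg table metadata document."""
    metadata = json.loads(metadata_json)
    return metadata.get("format-version") in SUPPORTED_FORMAT_VERSIONS


# Trimmed, made-up metadata document for illustration.
sample = json.dumps({"format-version": 2, "table-uuid": "example-uuid"})
print(is_supported(sample))
```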

Can I back up only compacted snapshots and skip the intermediate write churn?

By default, each backup captures all available snapshots in the table’s lineage, so the protected history mirrors the source. Filtering modes now in development are designed to let policies capture only the latest snapshot at backup time, or only the compacted snapshot (skipping the intermediate write-heavy ones), for teams that don’t need every intermediate state retained.

Where do the restored data files land?

You choose at restore time. In-place restore writes data files back into the source table’s S3 location and updates the source Glue or S3 Tables catalog entry. Out-of-place restore writes to a destination bucket and catalog you specify (different table name, different database, different account, or different region). The destination must be reachable from a Clumio connector in the target account.

06 · Related resources

Go deeper

Blog posts and reference material for teams building on Clumio Iceberg protection.

Blog

Simplifying Your Migration to Amazon S3 Tables with Clumio

How Clumio’s restore flow doubles as an Iceberg-aware migration path from Glue-managed Iceberg into fully managed Amazon S3 Tables. Preserves metadata, schema, and snapshot lineage.

Solution brief

Clumio for Apache Iceberg on AWS

Two-page positioning brief. Why native Iceberg snapshots fall short for cyber resilience, and how Clumio’s Iceberg-aware, air-gapped architecture closes those gaps. Includes the IDC analyst take.

Demo

Protect Apache Iceberg Tables with Air-Gapped Clumio Backups

Backup-side demo. Walks through Iceberg-aware backup architecture, the case for air-gapped protection over native snapshots, policy creation, inventory and calendar views, and snapshot inspection.

Demo

Clumio – Apache Iceberg Restore

Companion restore demo. Walks through restoring AWS Glue-based and S3-based Iceberg tables so they come back fully operational and queryable immediately. Minimal to no manual metadata repair or extra reconciliation.
