Syncs
A CloudQuery sync fetches data from a source integration and delivers it to a destination. For example, you can sync AWS data to the built-in ClickHouse database, or sync GCP data to S3.
Source (AWS) ───────> Sync (schedule, tables) ───────> S3 → ClickHouse
You can also sync data to additional destinations like PostgreSQL, BigQuery, or Snowflake simultaneously.
For CLI-specific sync configuration, write modes, and managing incremental state manually, see the CLI sync docs.
The platform delivers data to the destination as soon as the source produces it. Integrations may batch writes for performance reasons, but data generally arrives at the destination as the sync progresses.
What the Platform Manages for You
When syncing to the default S3 → ClickHouse destination, the platform handles write mode, incremental state, and table management automatically. You configure the schedule, select your source integration, and choose one or more destinations (or use the default S3 → ClickHouse). The platform takes care of the rest.
Specifically:
- Write mode is configured at the destination level, not per-sync. The default destination uses append mode. The platform creates views on top of the raw tables so you always query the latest snapshot of your data.
- Incremental table state (cursors and backends) is managed automatically. You don’t need to configure or maintain state backends.
- Table prefixing and views - synced tables are stored with a raw_ prefix, and the platform creates queryable views (e.g. cloud_assets) for use in the Asset Inventory, SQL Console, and Policies. See Data Model for how tables and views are organized.
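As a rough illustration of why views matter on top of append-only raw tables, here is a minimal Python sketch. It is not the platform's actual view definition. It assumes that each raw row records when it was synced (the `sync_time` field and the `resource_id` and `state` fields are hypothetical names) and that "latest snapshot" means the rows written by the most recent sync.

```python
from datetime import datetime

# Hypothetical contents of an append-mode raw table after two syncs.
# Field names are illustrative assumptions, not the platform schema.
raw_rows = [
    {"resource_id": "i-1", "state": "running", "sync_time": datetime(2024, 1, 1)},
    {"resource_id": "i-2", "state": "running", "sync_time": datetime(2024, 1, 1)},
    {"resource_id": "i-1", "state": "stopped", "sync_time": datetime(2024, 1, 2)},
]

def latest_snapshot(rows):
    """Return only the rows written by the most recent sync."""
    if not rows:
        return []
    newest = max(row["sync_time"] for row in rows)
    return [row for row in rows if row["sync_time"] == newest]

snapshot = latest_snapshot(raw_rows)
```

In this sketch the raw table keeps every historical row, while `snapshot` contains only the single row from the second sync, which is the kind of result a "latest data" view is meant to surface.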
The sections below explain how write modes and incremental tables work under the hood. To add destinations like PostgreSQL or BigQuery, see the General Destination Setup Guide.
Table Sync Modes
Table syncs come in two flavors: full and incremental. A single sync can combine both types, and which type is used for a particular table depends on the table definition. This is indicated in the table’s documentation in the CloudQuery Hub.
Full Table Syncs
This is the normal mode of operation for most tables. Every sync fetches a full snapshot from the corresponding APIs. How that data is written to the destination depends on the destination’s write mode:
- append - new rows are added alongside existing data from previous syncs.
- overwrite - new data replaces existing rows with the same primary key, but stale rows from previous syncs are kept.
- overwrite-delete-stale - new data replaces existing data, and rows from previous syncs that are no longer present are deleted at the end of the sync.
The default S3 → ClickHouse destination only supports append mode. The platform creates views to surface the latest data.
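The difference between the three write modes can be sketched with a small in-memory model. This is an illustration only, not how any destination integration is implemented. It assumes a single source, rows identified by a hypothetical `pk` field, and a hypothetical `sync_id` stamp on each written row.

```python
def write_sync(table, new_rows, mode, sync_id):
    """Apply one full-table sync to `table` (a list of row dicts) and return the result."""
    stamped = [{**row, "sync_id": sync_id} for row in new_rows]
    if mode == "append":
        # Keep everything from earlier syncs and add the new rows alongside.
        return table + stamped
    new_keys = {row["pk"] for row in stamped}
    if mode == "overwrite":
        # Replace rows whose key reappears; keep the rest, even if stale.
        kept = [row for row in table if row["pk"] not in new_keys]
        return kept + stamped
    if mode == "overwrite-delete-stale":
        # Replace matching rows and drop rows this sync did not produce.
        return stamped
    raise ValueError(f"unknown write mode: {mode}")

previous = [
    {"pk": "a", "value": 1, "sync_id": 1},
    {"pk": "b", "value": 2, "sync_id": 1},
]
incoming = [{"pk": "a", "value": 10}]  # "b" no longer exists upstream

appended = write_sync(previous, incoming, "append", sync_id=2)
overwritten = write_sync(previous, incoming, "overwrite", sync_id=2)
pruned = write_sync(previous, incoming, "overwrite-delete-stale", sync_id=2)
```

Here `appended` holds three rows (both versions of "a" plus "b"), `overwritten` holds the new "a" and the stale "b", and `pruned` holds only the new "a".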
Incremental Table Syncs
Some APIs lend themselves to being synced incrementally. Rather than fetch all past data on every sync, an incremental table only fetches data that has changed since the last sync. This is done by storing metadata in a state backend. The metadata is known as a cursor, and it marks where the last sync ended so the next sync can resume from the same point.
Incremental syncs are more efficient than full syncs, especially for tables with large amounts of data, because only the changed subset needs to be retrieved.
Incremental tables are marked as “incremental” in integration table documentation, along with which columns are used for the cursor value. On CloudQuery Platform, the state backend is managed for you. For CLI users managing their own state, see Managing Incremental Tables.
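The cursor mechanism can be sketched in a few lines of Python. This is a simplified model, not the platform's state backend: the `state` dictionary stands in for the backend, and the `updated_at` cursor column, the `make_fake_api` helper, and the `events` table name are all hypothetical.

```python
def incremental_sync(fetch_since, state, table_name):
    """Fetch only records newer than the stored cursor, then advance the cursor."""
    cursor = state.get(table_name)  # None on the first sync
    records = fetch_since(cursor)
    if records:
        # Remember where this sync ended so the next one can resume from there.
        state[table_name] = max(r["updated_at"] for r in records)
    return records

def make_fake_api(events):
    """Build a stand-in for a source API that supports 'changed since' queries."""
    def fetch_since(cursor):
        return [e for e in events if cursor is None or e["updated_at"] > cursor]
    return fetch_since

state = {}  # stands in for the managed state backend
events = [{"id": 1, "updated_at": 100}, {"id": 2, "updated_at": 200}]
api = make_fake_api(events)

first = incremental_sync(api, state, "events")   # no cursor yet: fetches everything
events.append({"id": 3, "updated_at": 300})      # new data appears upstream
second = incremental_sync(api, state, "events")  # fetches only what changed
```

The first call retrieves both existing records and stores a cursor of 200. The second call retrieves only the record added afterwards and advances the cursor to 300.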
What Happens After a Sync
Once data lands in your destination, you can:
- Browse resources in the Asset Inventory: filter and search across all synced cloud assets.
- Query with SQL in the SQL Console: run ad-hoc queries or use the AI Query Writer to generate them from natural language.
- Enforce policies with Policies: define SQL-based checks that run continuously against your synced data.
- Build reports with Reports and Alerts: track metrics over time and get notified when conditions are met.
Using the CLI instead? See CLI Syncs for the self-hosted sync configuration, including parallel execution and state backend management.
Next Steps
- Setting up a sync - configure and schedule syncs on the platform
- Monitoring sync status - track sync progress and troubleshoot failures
- Integrations - how source, destination, and transformer integrations work
- Filters & Queries - search synced data in Asset Inventory and SQL Console
- Asset Inventory - browse and search your synced cloud resources
- Understanding Platform Views - how the platform creates views from raw synced tables
- Performance Tuning - optimize sync performance on the platform
- Browse available integrations on the CloudQuery Hub