Migrating from CloudQuery v0 to v1
October 3, 2022
We are thrilled to announce the release of the first major version of CloudQuery; see our v1 announcement blog post for details. The new release brings a range of exciting new features, and this page is here to help you migrate an existing CloudQuery installation from v0 to v1.
Changes in V1
The announcement blog post lists many of the important improvements, and we won't reiterate them all here. Most changes are internal and developer-facing, but some do affect existing CloudQuery users. Those are:
Changes to the Configuration Format
V1 introduces a new config format that is closely related to the old one, but an old config will need some adjusting to work with the CloudQuery v1 CLI. The main difference is that, because v1 supports multiple destinations, source and destination plugins now have separate configs.
The new config format for source plugins is as follows:
    kind: source
    spec:
      ## Required. name of the plugin to use
      name: "aws" # required
      # Required. Must be a specific version starting with v, e.g. v1.2.3
      version: "v18.1.0"
      ## Optional. Default: "github". Available: "local", "grpc"
      # registry: github
      ## Plugin path. For official plugins, this should be in the format "cloudquery/<name>", e.g. "cloudquery/aws"
      path: "cloudquery/aws"
      ## Required. You can use ["*"] to sync all tables or specify specific tables. Please note that syncing all tables can be slow
      ## See all tables: https://www.cloudquery.io/docs/plugins/sources/aws/tables
      tables: ["aws_s3_buckets"]
      ## Required. all destinations you want to sync data to.
      destinations: ["postgresql"]
      spec:
        # plugin specific configuration.
Check the source spec documentation for general layout, and individual plugin documentation for details on how to configure the plugin-specific spec. Generally these will be the same as in v0, and all the same authentication functionality is still supported.
The new config format for destination plugins (e.g. PostgreSQL, BigQuery, Snowflake, and more) is as follows:
    kind: destination
    spec:
      ## Required. name of the plugin
      name: "postgresql"
      path: "cloudquery/postgresql"
      # Required. Must be a specific version starting with v, e.g. v1.2.3
      version: "v4.2.1"
      ## Optional. Default: "overwrite". Available: "overwrite", "append", "overwrite-delete-stale". Not all modes are
      ## supported by all plugins, so make sure to check the plugin documentation for more details.
      write_mode: "overwrite" # overwrite, overwrite-delete-stale, append
      spec:
        ## plugin-specific configuration for PostgreSQL:
        ## Required. Connection string to your PostgreSQL instance
        connection_string: "postgresql://postgres:pass@localhost:5432/postgres?sslmode=disable"
Check the destination spec documentation for general layout, and individual destination plugin documentation for details on how to configure the plugin-specific spec part. Generally these will be the same as in v0, and all the same authentication functionality is still supported.
Changes to the CLI Commands
Users of CloudQuery v0 will be familiar with the main commands init and fetch. These have changed in v1: init is no longer available (you should write config files manually), and fetch has been replaced by sync.
init was a command that generated a starter configuration template, but it is no longer a command in v1 of the CLI. Instead, please refer to our Quickstart guide to see how source and destination plugins should be configured.
The init command also generated a full list of tables to fetch. In v1, you can fetch all tables by using a wildcard entry in the source configuration file:
    tables: ["*"]
This can also be combined with the skip_tables option to fetch all tables except some subset:
    tables: ["*"]
    skip_tables: ["aws_accessanalyzer_analyzers", "aws_acm_certificates"]
cloudquery sync replaces the v0 cloudquery fetch command. Functionally it is still the same: it loads data from a source to a destination. The difference is that sync supports multiple destinations, while fetch only supported PostgreSQL. This change also brings a change in the expected config format; see Changes to the Configuration Format above for details.
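To illustrate the multiple-destination support, here is a minimal sketch that writes a single config file in which one source feeds two destinations. It assumes that a destination's name can be an arbitrary unique label when path identifies the plugin; the labels pg_main and pg_copy, the file name multi.yml, and the second database name postgres_copy are all hypothetical. Check the destination spec documentation before relying on this layout.

```shell
#!/bin/sh
set -eu

# Write one source and two destinations into a single file.
# Labels, file name, and the second database name are illustrative only.
cat > multi.yml <<'EOF'
kind: source
spec:
  name: "aws"
  path: "cloudquery/aws"
  version: "v18.1.0"
  tables: ["aws_s3_buckets"]
  # Every destination listed here receives the synced data.
  destinations: ["pg_main", "pg_copy"]
---
kind: destination
spec:
  name: "pg_main"
  path: "cloudquery/postgresql"
  version: "v4.2.1"
  spec:
    connection_string: "postgresql://postgres:pass@localhost:5432/postgres?sslmode=disable"
---
kind: destination
spec:
  name: "pg_copy"
  path: "cloudquery/postgresql"
  version: "v4.2.1"
  spec:
    connection_string: "postgresql://postgres:pass@localhost:5432/postgres_copy?sslmode=disable"
EOF

echo "wrote multi.yml"
```

The resulting file would then be passed to the CLI as cloudquery sync multi.yml.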
cloudquery sync must be passed a path to a config file or to a directory containing config files. For example, to sync using all .yml files in a directory named config:
    cloudquery sync config/
Or to sync using a single YAML file named config.yml:
    cloudquery sync config.yml
In this case, config.yml should contain at least one source and one destination config, separated by a line containing three dashes (---). More about this in Files and Directories.
Run cloudquery sync --help for more details, or check our online reference.
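As a minimal sketch of the directory-based invocation, the shell snippet below writes one source file and one destination file into config/ and then runs the sync. The file names aws.yml and postgresql.yml are illustrative choices, not names the CLI requires; the plugin versions and connection string are copied from the examples on this page and would need adjusting for a real setup.

```shell
#!/bin/sh
set -eu

# Directory holding one YAML file per plugin (file names are arbitrary).
mkdir -p config

# Source config: sync a single AWS table to the "postgresql" destination.
cat > config/aws.yml <<'EOF'
kind: source
spec:
  name: "aws"
  path: "cloudquery/aws"
  version: "v18.1.0"
  tables: ["aws_s3_buckets"]
  destinations: ["postgresql"]
EOF

# Destination config: placeholder credentials from the example on this page.
cat > config/postgresql.yml <<'EOF'
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  version: "v4.2.1"
  write_mode: "overwrite"
  spec:
    connection_string: "postgresql://postgres:pass@localhost:5432/postgres?sslmode=disable"
EOF

# Run the sync only if the CLI is installed; otherwise just report.
if command -v cloudquery >/dev/null 2>&1; then
  cloudquery sync config/
else
  echo "cloudquery CLI not found; config files written to config/"
fi
```

Keeping each plugin in its own file makes it easier to add or remove a destination without touching the source config.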
Files and Directories
The sync command supports loading config from files or directories, and you may choose to combine multiple source and destination configs in a single file, using --- on its own line to separate the sections. For example:
    kind: source
    spec:
      name: "aws"
      version: "v18.1.0"
      # rest of source spec here
    ---
    kind: destination
    spec:
      name: "postgresql"
      version: "v4.2.1"
      # rest of destination spec here
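Before running a sync against a combined file, a quick sanity check can confirm that it holds at least one source and one destination section. The sketch below is a plain grep-based check, not a CloudQuery feature. It assumes each section starts with an unindented kind: line, and it writes a sample config.yml purely for demonstration.

```shell
#!/bin/sh
set -eu

# Sample combined file, mirroring the single-file layout described above.
cat > config.yml <<'EOF'
kind: source
spec:
  name: "aws"
  version: "v18.1.0"
  # rest of source spec here
---
kind: destination
spec:
  name: "postgresql"
  version: "v4.2.1"
  # rest of destination spec here
EOF

# Count sections by their top-level "kind:" line.
# grep -c exits non-zero when the count is 0, hence the "|| true".
sources=$(grep -c '^kind: source' config.yml || true)
destinations=$(grep -c '^kind: destination' config.yml || true)

if [ "$sources" -ge 1 ] && [ "$destinations" -ge 1 ]; then
  echo "ok: $sources source(s), $destinations destination(s)"
else
  echo "config.yml needs at least one source and one destination" >&2
  exit 1
fi
```

This only inspects the text of the file; it does not validate the YAML or the plugin specs, which the CLI checks when the sync runs.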
Changes to Tables and Schemas
Finally, during our work for v1, we endeavoured to make the table schemas more consistent, predictable and aligned with their upstream APIs. As such, some breaking changes to the schema were necessary. Each source plugin has its own schema migration guide to help you make the necessary changes to your custom queries, triggers and policies:
- AWS
- Azure
- CloudFlare
- DigitalOcean
- GCP
- GitHub
- Heroku
- K8s
- Okta
- Terraform
Note that these guides are (for the most part) automatically generated, so in some cases a table may be marked as removed when it was actually renamed. Please reach out to us if you find any errors.
Start from a clean Database
V1 introduces functionality to automatically perform backwards-compatible Postgres migrations when new columns or tables are added. However, this functionality relies on v1 starting from a clean state; if you run it against a database that still contains tables from v0, there is a good chance it will fail.
Therefore, it is important that you start from a clean database. This can either mean creating a new database and pointing the v1 configuration there, or dropping all the tables in your v0 database.
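One way to get a clean start is to create a separate database for v1 and point the destination config at it. The sketch below assumes a local PostgreSQL server and the standard createdb client tool. The database name cloudquery_v1 is a hypothetical choice, and the host, user, and password are the placeholders used in the destination example on this page.

```shell
#!/bin/sh
set -eu

# Hypothetical name for the fresh v1 database.
DB_NAME="cloudquery_v1"

# Create the database if the PostgreSQL client tools are available.
# -w avoids an interactive password prompt; failures are reported, not fatal.
if command -v createdb >/dev/null 2>&1; then
  createdb -w -h localhost -p 5432 -U postgres "$DB_NAME" \
    || echo "could not create $DB_NAME (it may already exist or the server is unreachable)" >&2
else
  echo "createdb not found; create the database $DB_NAME manually" >&2
fi

# Connection string for the destination spec, pointing at the new database.
CONNECTION_STRING="postgresql://postgres:pass@localhost:5432/${DB_NAME}?sslmode=disable"
echo "connection_string: \"$CONNECTION_STRING\""
```

The printed line can be pasted into the plugin-specific spec of the PostgreSQL destination config. Using a new database leaves the v0 tables untouched, which keeps them available for comparison during the migration.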
Get Help / Ask Questions
If you run into issues not covered here, or have any questions about migrating or CloudQuery v1, don't hesitate to reach out on Discord. We're a friendly community and would love to help however we can.