Back to plugin list
postgresql
Official
Premium

PostgreSQL

The CloudQuery PostgreSQL plugin syncs your PostgreSQL database to any of the supported CloudQuery destinations

Publisher

cloudquery

Latest version

v6.0.0

Type

Source

Platforms
Date Published

Price per 1M rows

Starting from $15

monthly free quota

10M rows

Set up process


brew install cloudquery/tap/cloudquery
1. Download CLI and login
See installation options
2. Create source and destination configsPlugin configuration

cloudquery sync postgresql.yml postgresql.yml
3. Run the syncCloudQuery sync

Overview

The CloudQuery PostgreSQL plugin syncs your PostgreSQL database to any of the supported CloudQuery destinations (e.g. PostgreSQL, BigQuery, Snowflake, and more). This plugin also supports CDC via PostgreSQL logical replication, which enables keeping your PostgreSQL up to date with any destination by subscribing to changes.

Authentication

In order to fetch information from PostgreSQL, cloudquery needs to be authenticated. Credentials can be inserted in the connection_string field.

Configuration

This example configures a PostgreSQL source, located at localhost:5432. The (top level) spec section is described in the Source Spec Reference.
kind: source
spec:
  name: "postgresql"
  registry: "cloudquery"
  path: "cloudquery/postgresql"
  version: "v6.0.0"
  tables: ["*"]
  destinations: ["postgresql"]
  spec:
    connection_string: "${POSTGRESQL_CONNECTION_STRING}" # set the environment variable in a format like postgres://postgres:pass@localhost:5432/postgres?sslmode=disable
    # Optional parameters:
    # cdc_id: "postgresql" # Set to a unique string per source to enable Change Data Capture mode (logical replication, or CDC)
    # pgx_log_level: error
    # rows_per_record: 500

PostgreSQL Spec

This is the (nested) spec used by the PostgreSQL source Plugin.
  • connection_string (string) (required)
    Connection string to connect to the database. This can be a URL or a DSN, as per pgxpool
    • "postgres://jack:secret@localhost:5432/mydb?sslmode=prefer" connect with tcp and prefer TLS
    • "postgres://jack:secret@localhost:5432/mydb?sslmode=disable&application_name=pgxtest&search_path=myschema&connect_timeout=5" be explicit with all options
    • "postgres://localhost:5432/mydb?sslmode=disable" connect with os username cloudquery is being run as
    • "postgres:///mydb?host=/tmp" connect over unix socket
    • "dbname=mydb" unix domain socket, just specifying the db name - useful if you want to use peer authentication
    • "user=jack password=jack\\'ssooper\\\\secret host=localhost port=5432 dbname=mydb sslmode=disable" DSN with escaped backslash and single quote
  • pgx_log_level (string) (optional) (default: error)
    Available: error, warn, info, debug, trace. Defines what pgx call events should be logged.
  • cdc_id (string) (optional)
    If set to a non-empty string the source plugin will start syncing CDC via PostgreSQL logical replication in real-time. The value should be unique across all sources.
  • rows_per_record (integer) (optional) (default: 500)
    Amount of rows to be packed into a single Apache Arrow record to be sent over the wire during sync (or initial sync in the CDC mode).


CDC

PostgreSQL CDC (Logical Replication) Configuration

Overview

Change Data Capture (CDC) is a set of software design patterns used to determine and track the data that has changed so that action can be taken using the changed data.
In PostgreSQL CDC can be implemented using Logical Replication.
Logical Replication is a feature of PostgreSQL that allows you to stream changes from a database to another database, file or custom handler. It is also used to keep a copy of a database up to date with the original database in an efficient manner (Internally it is implemented by the so called Write-Ahead Log).
In this document we won't go into details of how Logical Replication works internally, but we will show you how to enable it in number of environments and how to configure CloudQuery PostgreSQL source plugin that can stream the changes to any of CloudQuery supported destinations.
The PostgreSQL source plugin streams changes directly to any CloudQuery destination without the need for additional infrastructure (e.g. Kafka, RabbitMQ, etc). This means the setup is much easier.

PSQL Test

To test that the current PostgreSQL instance supports Logical Replication we can run the following command in PSQL:
SHOW wal_level;
or the following SQL query:
SELECT setting FROM pg_settings WHERE name='wal_level'
The default is replication but for CDC to work we need to set it to logical (This can only be done on the database startup).

Docker Setup

If you are running it locally for testing purposes you can use the following command to docker to enable logical replication:
docker run -p 5432:5432 -e POSTGRES_PASSWORD=pass -d postgres:11 -c "wal_level=logical"

Self-hosted

Change the wal_level to logical in the postgresql.conf file and restart the database.

RDS

You will need to create a custom RDS parameter group, associate it with your RDS instance, set rds.logical_replication to 1 and restart the database.
See full resolution on AWS documentation

AWS Aurora serverless (V2)

AWS Aurora serverless V1 doesn't support Logical Replication, but V2 does.
Similar to RDS, you will need to create a custom parameter group, associate it with your Aurora serverless instance, set rds.logical_replication to 1 and restart the database.
See full resolution on AWS documentation

GCP Cloud SQL (PostgreSQL)

You will need to set the cloudsql.logical_deconging configuration to on. See full documentation on GCP documentation

Azure DB (PostgreSQL)

Please follow the Official Azure documentation on how to enable logical replication via the CLI or Console (UI).

Current Limitations

The following event types are not supported:
  • DDL changes (e.g. CREATE TABLE, ALTER TABLE, etc)
  • TRUNCATE statements (e.g. TRUNCATE TABLE table_name)
  • DELETE statements (e.g. DELETE FROM table_name)


Subscribe to product updates

Be the first to know about new features.