product

What's new in CloudQuery Plugin Protocol v3

Yevgeny Pats

Yevgeny Pats

We are thrilled to announce the release of the new gRPC Plugin Version v3, which brings exciting enhancements to writing CloudQuery plugins!
The gRPC protocol is the underlying protocol which enables CloudQuery plugins to be decoupled from one another, which is crucial for a system with an unlimited number of integrations.
This blog covers the gRPC changes. For language-specific changes, see Go SDK V4.

Overview #

Let's begin with a high-level summary before diving deeper into each of the updates:
  • Apache Arrow - protobuf V3 now fully supports Apache Arrow, which was introduced in protobuf v2 (Sources) and protobuf v1 (Destinations).
  • Unified Protocol for Sources and Destinations - We have streamlined the gRPC protocol to utilize a single protocol. This allows CloudQuery plugins to function as both a source and a destination simultaneously, simplifying gRPC versioning and updates.
  • Transition Sync/Write to Streaming API - We have transitioned the Sync and Write operations to a streaming-based API, enabling new use cases like Change Data Capture (CDC).

Apache Arrow #

We extensively discussed this update in our previous blog post. Now, with all our destinations migrated to SDK V4, they support over 30 different Apache Arrow data types!
All fields that are sending CloudQuery tables are encoded as Apache Arrow Schemas, and all fields sending data are Apache Arrow records (Fields are commented in the code).

Unified Protocol #

Updates to plugin-pb are vital for introducing new features and enhancing the developer experience for plugin authors and end users.
Having a single protocol version ensures better manageability of upgrades and maintains backward compatibility, providing users and developers with ample time for transitioning. It also enables plugin authors to create plugins that function as both sources and destinations.
Moreover, any of the CloudQuery destinations can now be utilized as backends for incremental syncs. For example, you can now specify the following for sources with incremental syncs:
kind: source
spec:
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: 'VERSION_SOURCE_AWS'
  destinations: ['postgresql']
  backend_options:
    table_name: 'cq_state_aws'
    connection: '@@plugins.postgresql.connection'
  spec: ...
---
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  registry: cloudquery
  version: 'VERSION_DESTINATION_POSTGRESQL'
  spec:
    connection_string: ....
@@plugins.X can reference any of the destinations specified in your config.
Lastly, this also opens the door for more easily creating a CloudQuery SDK for new languages, so stay tuned!

Streaming API #

One noteworthy use case on top of CloudQuery is Change Data Capture (CDC), which involves operations like creating or deleting tables during a sync.
With the new streaming support, the Write method now supports three messages:
  • MessageMigrateTable - Migrate table to a specific target table safely or force (dropping and re-creating)
  • MessageInsert - Insert Arrow Record
  • MessageDeleteStale - This is a specific message that might be generalized later to a more general delete message. This indicates a plugin to delete data from a table with _cq_source_name = source_name and _cq_sync_time < sync_time.
With this change to streaming that contains messages, we can easily extend the protocol in a backward-compatible way to support more messages like DeleteTable, DeleteRecord to support use cases such as CDC or other use cases that require those kinds of data updates.
Multiple messages, such as CreateTable, can easily be accommodated to enhance and facilitate such use cases. All available messages are available here.

Migration Guide #

To learn how to migrate Go plugins, please take a look at Go SDK V4 Update.

Summary #

We are thrilled about this major release! Don't hesitate to try your hand at developing a new CloudQuery plugin. Feel free to reach out to us on GitHub or Discord with any questions or feedback.
Subscribe to product updates

Be the first to know about new features.


© 2024 CloudQuery, Inc. All rights reserved.