Introducing Transformer Plugins
Mariano Gappa
At CloudQuery, we've become pretty good at providing a simple interface for ELT, with a wealth of sources and destinations and comprehensive compliance transformations. However, there is one feature the CloudQuery Developer Community has repeatedly asked for: the ability to transform your data as it's being loaded.
The new Transformer Plugins address these common requests we’ve seen from our developer community:
- Removing unneeded fields that waste space in your data destinations.
- Obfuscating fields that contain sensitive personally identifiable information (PII) about the user's clients.
- Prefixing all tables created by the sync.
There are also more advanced requests involving customized transformations, sometimes using proprietary transformation stacks.
Today, we're unveiling a new type of plugin alongside Source Plugins and Destination Plugins: the Transformer Plugin! 🤖
Transformer Plugins
Transformer plugins sit in the middle of the pipeline between sources and destinations, allowing both content and schema transformations on the data as it passes through.
Supporting multiple destinations
Transformers are configured per-destination, so you can apply different transformers to different destinations.
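As a rough sketch of what per-destination configuration could look like, the two destination specs below write to two PostgreSQL databases, but only the first lists a transformer. The name `postgresql_raw` and the idea of keeping an untransformed copy are assumptions made for illustration. The `transformers` key and the `basic` transformer are covered in the next section.

```yaml
# Destination 1: receives data after the "basic" transformer has run.
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.0.7"
  transformers:
    - basic
  spec:
    connection_string: "..."
---
# Destination 2 (hypothetical name): lists no transformers,
# so it is expected to receive the data unchanged.
kind: destination
spec:
  name: "postgresql_raw"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.0.7"
  spec:
    connection_string: "..."
```

For both destinations to receive data, the source spec would need to list them, for example `destinations: ["postgresql", "postgresql_raw"]`.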
Configuring transformations
Today, we're releasing the first transformer plugin, enabling users to perform the most requested transformations with a simple YAML-based configuration interface.
Given an AWS source:
kind: source
spec:
  name: "aws"
  path: cloudquery/aws
  registry: cloudquery
  version: "v27.8.0"
  destinations: ["postgresql"]
  tables: ["*"]
  spec:
And a Postgres destination:
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.0.7"
  write_mode: "overwrite-delete-stale"
  transformers:
    - basic # we add the basic transformer here
  spec:
    connection_string: "..."
With the following transformer, you can obfuscate, remove, and add columns, and change table names in the destination database:
kind: transformer
spec:
  name: "basic"
  registry: cloudquery
  path: "cloudquery/basic"
  version: "v1.0.0"
  spec:
    transformations:
      - kind: obfuscate_columns
        tables: ["aws_secretsmanager_secrets"]
        columns: ["kms_key_id"]
      - kind: remove_columns
        tables: ["aws_secretsmanager_secrets"]
        columns: ["rotation_rules", "policy"]
      - kind: add_column
        tables: ["*"]
        name: "source"
        value: "cq_sync"
      - kind: change_table_names
        tables: ["*"]
        new_table_name_template: "cq_{{.OldName}}"
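The list under `transformations` presumably does not have to include every kind. As a minimal sketch, assuming each entry works independently of the others, a transformer that only prefixes table names might look like the following. The `raw_` prefix is an arbitrary choice for illustration, and `{{.OldName}}` is taken to stand for the original table name, as in the configuration above.

```yaml
kind: transformer
spec:
  name: "basic"
  registry: cloudquery
  path: "cloudquery/basic"
  version: "v1.0.0"
  spec:
    transformations:
      # Rename every synced table; aws_secretsmanager_secrets would be
      # expected to become raw_aws_secretsmanager_secrets (assumed behavior).
      - kind: change_table_names
        tables: ["*"]
        new_table_name_template: "raw_{{.OldName}}"
```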
Advanced transformation use cases
It's early days, and we're still working on more advanced transformers. As is the case for source and destination plugins, we’re also enabling the CloudQuery community to develop their own transformer plugins with a straightforward interface. Stay tuned for the upcoming guide on developing custom transformer plugins. In the meantime, feel free to reach out to us on our CloudQuery Community Discord with questions and feature requests for our new transformations framework.