
Announcing the Java SDK for CloudQuery Integration Development

We're excited to announce the first release of a Java SDK for CloudQuery integration development! This SDK provides a high-level toolkit for developing CloudQuery plugins in Java.

Background #

CloudQuery is designed with a plugin-based architecture and uses Apache Arrow over gRPC for communication between plugins. Source and destination integrations are independent of one another, and this architecture allows integrations to be written in different languages but still communicate with one another.

Originally, we provided an SDK for writing integrations in Go only, but that is changing. We recently released the CloudQuery SDK for Python and the CloudQuery SDK for JavaScript, and now we are excited to introduce the next language in line: Java!

Features #

Plugin Server #

The most basic functionality provided by the Java SDK is to start a gRPC plugin server that supports all the flags expected by the CloudQuery CLI. This allows you to write an integration in Java and run it using the same command line interface as any other integration.

The following example shows how to create an integration server that runs an integration called MyPlugin:

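import io.cloudquery.server.PluginServe;

public class MainClass {

  public static void main(String[] args) {
    // MyPlugin is your own class that extends io.cloudquery.plugin.Plugin.
    MyPlugin plugin = new MyPlugin();
    // Forward the command-line arguments so the flags sent by the CloudQuery CLI are handled.
    PluginServe pluginServe = PluginServe.builder().args(args).plugin(plugin).build();
    // Start the gRPC server; the returned value is used as the process exit code.
    int exitCode = pluginServe.Serve();
    System.exit(exitCode);
  }
}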

Plugin Class #

A CloudQuery Java source plugin, such as MyPlugin above, should extend io.cloudquery.plugin.Plugin and implement three methods: newClient, tables, and sync.

The newClient method is called when the integration is started, and is where you can do any initialization work.

The tables method should return a list of tables that the integration supports.

The sync method is called when a table needs to be synced. This is where the SDK scheduler can be used to manage the syncing of all the supported tables.
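To make the three-method contract concrete, here is a minimal, self-contained sketch. It does not use the SDK: SketchPlugin, MySketchPlugin, and their simplified signatures are hypothetical stand-ins invented for illustration, and the real io.cloudquery.plugin.Plugin methods take richer arguments. The sketch only shows the lifecycle: initialize once in newClient, advertise tables in tables, and refuse to sync before initialization.

```java
import java.util.ArrayList;
import java.util.List;

// Demo entry point: runs the lifecycle in the order the text describes.
class PluginLifecycleDemo {
  public static void main(String[] args) {
    MySketchPlugin plugin = new MySketchPlugin();
    plugin.newClient("demo-spec");      // 1. initialization
    List<String> out = new ArrayList<>();
    plugin.sync(plugin.tables(), out);  // 2. list tables, 3. sync them
    System.out.println(out);
  }
}

// Hypothetical stand-in for the SDK base class (NOT the real API).
abstract class SketchPlugin {
  public abstract void newClient(String spec);
  public abstract List<String> tables();
  public abstract void sync(List<String> tables, List<String> out);
}

class MySketchPlugin extends SketchPlugin {
  private String client; // stays null until newClient has run

  @Override
  public void newClient(String spec) {
    // A real integration would parse the spec and build an API client here.
    this.client = "client-for-" + spec;
  }

  @Override
  public List<String> tables() {
    return List.of("repositories", "workspaces");
  }

  @Override
  public void sync(List<String> tables, List<String> out) {
    if (client == null) {
      throw new IllegalStateException("newClient must be called before sync");
    }
    for (String table : tables) {
      out.add(table + ":synced"); // placeholder for fetching real rows
    }
  }
}
```

In a real integration, the body of sync would typically hand the tables to the SDK scheduler rather than loop over them directly.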

Check out our Bitbucket plugin for an example implementation.

Multi-threaded Scheduler #

The scheduler's main responsibilities are to manage concurrent execution of requests and the order in which tables are synced to avoid dependency issues. It also places limits on the number of concurrent requests and memory usage.
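The two core ideas here, bounded concurrency and syncing parent tables before the tables that depend on them, can be illustrated with plain java.util.concurrent. The following is a simplified sketch, not the SDK's scheduler: SchedulerSketch and TableNode are hypothetical names, the level-by-level barrier is an assumption chosen for brevity, and memory limits are not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SchedulerSketch {
  // Hypothetical table model: a name plus the child tables that depend on it.
  static class TableNode {
    final String name;
    final List<TableNode> children;

    TableNode(String name, List<TableNode> children) {
      this.name = name;
      this.children = children;
    }
  }

  // Syncs tables one dependency level at a time, using at most
  // `concurrency` worker threads. Returns the completion order.
  static List<String> sync(List<TableNode> roots, int concurrency) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(concurrency);
    Queue<String> order = new ConcurrentLinkedQueue<>();
    try {
      List<TableNode> level = roots;
      while (!level.isEmpty()) {
        List<Callable<Void>> tasks = new ArrayList<>();
        List<TableNode> next = new ArrayList<>();
        for (TableNode table : level) {
          tasks.add(() -> {
            order.add(table.name); // placeholder for the real fetch
            return null;
          });
          next.addAll(table.children);
        }
        // invokeAll blocks until every table in this level has finished,
        // so children never start before their parents are done.
        for (Future<Void> f : pool.invokeAll(tasks)) {
          f.get(); // rethrows any failure from the task
        }
        level = next;
      }
    } finally {
      pool.shutdown();
    }
    return new ArrayList<>(order);
  }

  public static void main(String[] args) throws Exception {
    TableNode prs = new TableNode("pull_requests", List.of());
    TableNode repos = new TableNode("repositories", List.of(prs));
    TableNode workspaces = new TableNode("workspaces", List.of());
    System.out.println(sync(List.of(repos, workspaces), 2));
  }
}
```

The fixed-size thread pool caps how many tables are processed at once, which plays the role of the concurrency limit described above. Tables within the same level may complete in any order; only the parent-before-child ordering is guaranteed.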

To invoke the scheduler, the sync method of an integration passes its list of tables and options to the scheduler, which takes care of the rest. Here is an example from the CloudQuery Bitbucket integration:

  @Override
  public void sync(
      List<String> includeList,
      List<String> skipList,
      boolean skipDependentTables,
      boolean deterministicCqId,
      BackendOptions backendOptions,
      StreamObserver<Sync.Response> syncStream)
      throws SchemaException, ClientNotInitializedException {
    if (this.client == null) {
      throw new ClientNotInitializedException();
    }

    List<Table> filtered = Table.filterDFS(allTables, includeList, skipList, skipDependentTables);
    Scheduler.builder()
        .client(client)
        .tables(filtered)
        .syncStream(syncStream)
        .deterministicCqId(deterministicCqId)
        .logger(getLogger())
        .concurrency(spec.getConcurrency())
        .build()
        .sync();
  }

Docker for Cross-Platform Distribution #

To support cross-platform packaging of Java integrations, we introduced a new Docker registry type to the CloudQuery CLI in v3.12.0. Whereas Go-based integrations are downloaded as binaries from GitHub releases, Java integrations are downloaded as Docker images from the specified Docker registry. This allows CloudQuery to support multiple platforms, and it also makes it easier to distribute integrations that depend on external libraries.

Start Creating Your Own Plugin #

Want to start writing your own integration? Here is our guide to get you started.

Feedback #

We'd love to hear your feedback on the Java SDK. If you have any questions, comments, or suggestions, please feel free to reach out to us on the CloudQuery Community or GitHub.
