Building a serverless open source CSPM powered by CloudQuery

Michal Brutvan Mar 12, 2024

In this blog post, we will look at a variation of our guide on how to build an open source CSPM with CloudQuery and Grafana, only this time, you won’t need to set up any new infrastructure. We will use Neon, a serverless PostgreSQL database, a managed Grafana dashboard, and the new CloudQuery Cloud.

What Is CloudQuery Cloud?

CloudQuery Cloud is a great way to get started with CloudQuery and sync data from source to destination without the need to deploy your own infrastructure. You only need to select a source and destination plugin and CloudQuery will take care of the rest.
With CloudQuery Cloud, you can:
  • Schedule syncs to run at regular intervals.
  • Monitor syncs and view logs in the CloudQuery Cloud dashboard.
  • Use the CloudQuery Cloud API to manage the syncs and connect to your other data pipelines.
Here’s what our data flow will look like:
[Figure: Serverless open source CSPM architecture]
Let’s get to it.

Prerequisites

Sign up for a free account with Neon and create a new database. Store the connection string; you will need it later. Then sign up for free accounts with Grafana and CloudQuery.
To pull data from AWS, we will use the CloudQuery AWS plugin, which requires an access key ID and its associated secret access key. See this AWS guide if you don’t have these yet.
You will also need dbt, a data build tool, to create a few database views from CloudQuery policies. You will only need to run it once, so you can install it locally.

AWS to PostgreSQL

First, we will set up the sync from AWS to the PostgreSQL database on Neon. You will need the connection string and the AWS keys.
Log in at cloud.cloudquery.io and start creating your first sync from the Syncs tab.
[Screenshot: Create a new sync]
First, we need the destination database to sync to. Since the default destination is PostgreSQL, just paste the connection string into the designated input for the POSTGRESQL_CONNECTION_STRING variable.
[Screenshot: Add a secret]
That’s all for the destination. Let’s move on to defining the source. From the Source Plugin dropdown, select AWS. The example YAML configuration will load. Edit its tables section so that it contains the following list:
tables:
    - aws_cloudwatch_alarms
    - aws_cloudwatchlogs_metric_filters
    - aws_ec2_network_acls
    - aws_ec2_security_groups
    - aws_sns_subscriptions
    - aws_iam_credential_reports
    - aws_iam_password_policies
    - aws_iam_user_access_keys
    - aws_iam_users
    - aws_autoscaling_groups
    - aws_cloudtrail_trail_event_selectors
    - aws_cloudtrail_trails
    - aws_codebuild_projects
    - aws_config_configuration_recorders
    - aws_apigateway_rest_api_stages
    - aws_apigateway_rest_apis
    - aws_apigatewayv2_api_routes
    - aws_apigatewayv2_api_stages
    - aws_apigatewayv2_apis
    - aws_cloudfront_distributions
    - aws_efs_access_points
    - aws_elasticbeanstalk_environments
    - aws_elbv1_load_balancers
    - aws_elbv2_load_balancer_attributes
    - aws_elbv2_load_balancers
    - aws_iam_accounts
    - aws_rds_clusters
    - aws_s3_accounts
In the Secrets section, add your access key ID and the secret access key.
Finally, let’s schedule the sync. On the next step of the wizard, select the Daily schedule. Then choose the allocated resources: 0.5 CPU and 1 GB RAM should be fine in most cases. Click Save and Run, and the sync will start.

Creating database views

The sync pulls data from AWS directly into the Neon database. Once it has finished, we need to create the database views that the Grafana dashboards will use.
CloudQuery has built a set of Transformations that create these views. Download the AWS Compliance (Free) transformation and extract it. Since we will use dbt to create the views, we need to make sure it can connect to the database. dbt looks for a profile file that defines how to connect to the relevant databases. By default, it searches for the profiles.yml file in the current working directory and falls back to ~/.dbt/. Read more about profiles in the dbt documentation.
Your profiles.yml file should look like this:
config:
  send_anonymous_usage_stats: False
  use_colors: True
aws_compliance: # this should match the profile name in your dbt_project.yml
  target: postgres
  outputs:
    postgres:
      type: postgres
      host: "your postgres host"
      user: "postgres user name"
      pass: "postgres password"
      port: 5432
      dbname: "database name"
      schema: public
      threads: 4
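Since Neon hands you a single connection string while profiles.yml wants separate fields, the following is a minimal Python sketch of how the string could be split into those fields. It assumes the string has the usual postgresql://user:password@host/dbname form; the helper name connection_string_to_profile and the example values in the comment are hypothetical. If the string carries an sslmode query parameter, the sketch passes it through as an sslmode entry, on the assumption that your dbt Postgres adapter accepts that field.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def connection_string_to_profile(conn: str) -> dict:
    """Split a PostgreSQL connection URI into dbt profile fields.

    Hypothetical helper for illustration; it only covers URI-style
    strings (postgresql://user:password@host[:port]/dbname[?params]).
    """
    parts = urlsplit(conn)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"not a PostgreSQL URI: scheme {parts.scheme!r}")

    profile = {
        "type": "postgres",
        "host": parts.hostname,
        # Credentials may be percent-encoded inside a URI, so decode them.
        "user": unquote(parts.username or ""),
        "pass": unquote(parts.password or ""),
        # Fall back to the default PostgreSQL port when none is given.
        "port": parts.port or 5432,
        "dbname": parts.path.lstrip("/"),
        "schema": "public",
        "threads": 4,
    }

    # Assumption: forward sslmode if the connection string specifies it.
    query = parse_qs(parts.query)
    if "sslmode" in query:
        profile["sslmode"] = query["sslmode"][0]
    return profile


# Example (made-up values):
# connection_string_to_profile(
#     "postgresql://alice:s3cret%21@db.example.com/neondb?sslmode=require"
# )
```

The returned dictionary mirrors the keys under outputs.postgres above, so you can copy the values into profiles.yml by hand.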
With the connection configured, you can now run the dbt compile and dbt run commands from the directory where you extracted the AWS Compliance transformation.
dbt run executes all the dbt models and creates the views in your destination database as defined in the models.
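If you would rather script this step, here is a minimal Python sketch that runs the two commands in order from the extracted directory. The function names dbt_commands and run_dbt and the example path are hypothetical. The optional --profiles-dir flag is only needed when profiles.yml is in neither the project directory nor ~/.dbt/.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def dbt_commands(profiles_dir: Optional[str] = None) -> List[List[str]]:
    """Return the argv lists for `dbt compile` followed by `dbt run`."""
    extra = ["--profiles-dir", profiles_dir] if profiles_dir else []
    return [["dbt", step, *extra] for step in ("compile", "run")]


def run_dbt(project_dir: str, profiles_dir: Optional[str] = None) -> None:
    """Run both commands inside project_dir, stopping at the first failure."""
    for argv in dbt_commands(profiles_dir):
        # check=True raises CalledProcessError if dbt exits with an error.
        subprocess.run(argv, cwd=Path(project_dir), check=True)


# Example (hypothetical path to the extracted transformation):
# run_dbt("./aws_compliance")
```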
Now you can query the views directly and export them in various formats such as CSV or HTML, all with standard psql. You can also visualize them in your favorite BI tool.
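As an illustration of the export, the sketch below builds a psql command line that dumps one view to a CSV or HTML file. The helper name psql_export_command and the view name in the example are hypothetical; substitute a view that dbt actually created in your database. It relies on psql's --csv option, which assumes psql 12 or newer.

```python
import re
from typing import List

# psql output-format flags for the two formats mentioned above.
_FORMAT_FLAGS = {"csv": "--csv", "html": "--html"}


def psql_export_command(
    conn: str, view: str, out_path: str, fmt: str = "csv"
) -> List[str]:
    """Build the argv for exporting every row of one view with psql."""
    # Accept only plain identifiers so the view name cannot inject SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", view):
        raise ValueError(f"unexpected view name: {view!r}")
    if fmt not in _FORMAT_FLAGS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return [
        "psql",
        conn,  # psql accepts a connection URI in place of a database name
        _FORMAT_FLAGS[fmt],
        "--command",
        f'SELECT * FROM "{view}"',
        "--output",
        out_path,
    ]


# Example (hypothetical view name), run with subprocess.run(cmd, check=True):
# cmd = psql_export_command(conn, "my_compliance_view", "report.csv")
```

Passing the connection string as an argument makes it visible in the process list, so on a shared machine you may prefer to supply credentials through the PGPASSWORD environment variable or a .pgpass file.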

Connecting Grafana

For AWS Transformations, we offer free Grafana dashboards that you can use as a starting point.
First, you need to add the database as a data source in Grafana. Read this Grafana guide on how to do this.
Download the AWS Compliance Dashboard and extract the zip file. Find the dashboard.json file in the extracted directory (in aws_compliance/grafana/postgres) and import it into your Grafana instance.
At the top of the dashboard, select the PostgreSQL database that CloudQuery syncs to as the data source.
You should now see the AWS Compliance dashboard populated with your synced data.

Wrapping up

That’s it! You have set up a regular daily sync to a database hosted by Neon with a dashboard on Grafana Cloud that will show you your security issues.
Note that if you decide to update to a newer version of the AWS plugin, you may need to run the dbt transformations again.
Be sure to check out our other Transformations and Visualizations on CloudQuery Hub!