How to Build a CSPM with Grafana and CloudQuery

Tim Armstrong

Cloud adoption and usage of cloud security tooling have exploded over the last few years. With that comes a need for managing the security posture of cloud computing usage. Grafana and CloudQuery are partnering to showcase how you can build an extensible Cloud Security Posture Management (CSPM) solution to assist with securing cloud infrastructure.
Screenshot of a Grafana dashboard with data from CloudQuery
In this tutorial, we will guide you through building a CSPM solution using Docker Compose for a local development environment. You’ll leverage CloudQuery, PostgreSQL, DBT, and Grafana to create an integrated system that simplifies security compliance and monitoring. By following along, you'll learn how to set up each component and build powerful, customizable dashboards. You will get practical hands-on experience with these technologies and see how they fit together to enhance cloud security and deploy a robust CSPM solution in both local and production environments.

The Architecture of a CSPM

You can think of a CSPM as having the following three components:

ELT Layer

The ELT (Extract Load Transform) layer is a crucial component of a CSPM architecture. It extracts data from various cloud resources, loads it into a centralized storage system, and transforms it into a structured format for analysis. This layer ensures that raw data is converted into actionable insights, enabling effective monitoring and compliance. By efficiently handling large volumes of data, the ELT layer supports real-time security assessments and policy enforcement within the CSPM framework. CloudQuery extracts the data from platform APIs and loads it into the Data Warehouse. To do this, CloudQuery uses its plugins to interact with various services’ APIs and store data in destination databases.
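As a rough illustration of the extract, load, transform ordering described above, here is a minimal Python sketch. It is not how CloudQuery is implemented: the page structure, the raw_resources table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import json
import sqlite3


def extract(api_pages):
    """Extract: flatten paginated API responses into individual resource records."""
    for page in api_pages:
        yield from page["items"]


def load(conn, records):
    """Load: store each raw record untouched so later transformations can reinterpret it."""
    # raw_resources is a hypothetical table name used only for this sketch.
    conn.execute("CREATE TABLE IF NOT EXISTS raw_resources (id TEXT PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT OR REPLACE INTO raw_resources VALUES (?, ?)",
        [(record["id"], json.dumps(record)) for record in records],
    )


def transform(conn):
    """Transform: derive a structured summary (resource count per region) from the raw rows."""
    counts = {}
    for (body,) in conn.execute("SELECT body FROM raw_resources"):
        region = json.loads(body)["region"]
        counts[region] = counts.get(region, 0) + 1
    return counts


def run_pipeline(api_pages):
    """Run the three stages in ELT order: the data is loaded before it is transformed."""
    conn = sqlite3.connect(":memory:")  # stand-in for the Postgres warehouse
    load(conn, extract(api_pages))
    result = transform(conn)
    conn.close()
    return result
```

The point of the ordering is that the raw records land in storage first, so new transformations can be added later without re-fetching anything from the cloud APIs.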

Queries and Insights

Storing the data in a data warehouse based on a solution like Postgres enables users to build complex queries and derive insights from their data. For example, industry standards such as CIS and PCI-DSS define rules and best practices that can be used to indicate the security posture of cloud infrastructure. Using these, you can generate findings and actionable insights on cloud infrastructure that identify risks and vulnerabilities. To make life easier, CloudQuery offers prebuilt transformations using DBT.
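To make the idea of a rule-based finding concrete, the sketch below evaluates a toy "no SSH open to the world" rule with plain SQL. Everything here is an assumption for illustration: the sg_rules table and its columns are hypothetical and do not match the CloudQuery AWS schema or the prebuilt DBT transformations, and SQLite stands in for Postgres.

```python
import sqlite3

# Toy rule: a resource fails if any of its rules allows 0.0.0.0/0 on port 22.
RULE_SQL = """
SELECT resource_id,
       CASE WHEN MAX(CASE WHEN cidr = '0.0.0.0/0'
                           AND from_port <= 22 AND to_port >= 22
                          THEN 1 ELSE 0 END) = 1
            THEN 'fail' ELSE 'pass' END AS status
FROM sg_rules
GROUP BY resource_id
ORDER BY resource_id
"""


def evaluate_ssh_rule(rows):
    """Return (resource_id, status) findings for rows of (resource_id, cidr, from_port, to_port)."""
    conn = sqlite3.connect(":memory:")
    # sg_rules is a hypothetical table, not a real CloudQuery table name.
    conn.execute(
        "CREATE TABLE sg_rules (resource_id TEXT, cidr TEXT, from_port INTEGER, to_port INTEGER)"
    )
    conn.executemany("INSERT INTO sg_rules VALUES (?, ?, ?, ?)", rows)
    findings = conn.execute(RULE_SQL).fetchall()
    conn.close()
    return findings
```

Each framework check boils down to a query of this shape: one row per resource, with a status column that a dashboard can count or filter on.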

Analytics, Alerting, and Visualization

The third component of a CSPM is the visualization and presentation of the infrastructure data and the queries and actionable insights. This can be useful to prioritize remediation work, gain an overall understanding of security posture across cloud infrastructure, and even help with reporting and compliance. With Grafana’s easy-to-build dashboarding, alerting, and visualization features, it is a logical choice here. The fastest way to adopt Grafana is through Grafana Cloud, which includes a scalable managed backend for metrics, logs, and traces. CloudQuery features prebuilt dashboards for Grafana.
Diagram illustrating a data pipeline using CloudQuery for ELT (Extract Load Transform). Source data from AWS, Google Cloud, and Azure is loaded into CloudQuery, which then transfers the data to a PostgreSQL data warehouse. The data warehouse is connected to dbt (data build tool) for transformations and to Grafana for analytics, alerting, and visualization.

Building the CSPM

Now let’s get into building an integrated CSPM solution using Docker Compose, CloudQuery, PostgreSQL, DBT, and Grafana.
Diagram illustrating the data flow and integration process using CloudQuery. AWS Infrastructure Data is synchronized by CloudQuery, which then distributes the data to three components: PostgreSQL for data storage, DBT (Data Build Tool) for data transformation, and Grafana for data visualization.
To keep things simple, we’re going to build a local development environment using Docker Compose. However, in a production environment, we recommend that you use the available cloud offerings for each component (CloudQuery Cloud, any managed Postgres service, DBT Cloud, and Grafana Cloud) to reduce the operational workload involved with hosting it yourselves.
Note: If you have any questions or encounter an issue when following along with this post, the best place to get help is to join the CloudQuery Discord.

Getting Started With CloudQuery

To get started with CloudQuery, you will need to sign up for a CloudQuery Cloud account. Once you have a CloudQuery account, you’ll need to go to Team Settings and then API Keys to generate a key. This key will enable your CloudQuery instance to download plugins and fetch the licenses for any premium plugins (as needed).

Setting up CloudQuery, Postgres, and Grafana with Docker Compose

Now that you have that in place, let’s create a docker-compose.yml file (one of the file names that docker compose looks for by default).
Note: You can find final versions of all of these configuration files at the end of this post.
The first thing you need to define is the services:
services:
  cloudquery:
    image: ghcr.io/cloudquery/cloudquery:latest
    environment:
      CLOUDQUERY_API_KEY: YOUR_API_KEY_GOES_HERE
      AWS_ACCESS_KEY_ID: YOUR_ACCESS_KEY_ID
      AWS_SECRET_ACCESS_KEY: YOUR_SECRET_ACCESS_KEY
      AWS_SESSION_TOKEN: YOUR_SESSION_TOKEN
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_HOST: db
      POSTGRES_DB: cspm
    command:
      - "sync"
      - "/cloudquery_config.yml"
    configs:
      - cloudquery_config.yml
  db:
    image: postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: cspm
    ports:
      - "5432:5432"
    volumes:
      - db:/var/lib/postgresql/data
  dbt:
    image: ghcr.io/dbt-labs/dbt-postgres:1.7.2
    depends_on:
      - db
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_HOST: db
      POSTGRES_DB: cspm
    command:
      - "run"
  grafana:
    image: grafana/grafana
    restart: unless-stopped
    depends_on:
      - db
    ports:
      - 3000:3000
    volumes:
      - grafana:/var/lib/grafana
In this Docker Compose file, the CloudQuery job is defined, as well as the Postgres database instance, an instance of DBT, and an instance of Grafana. Make sure you replace the CLOUDQUERY_API_KEY as appropriate.
For this example, you’ll be using the AWS plugin, for which you’ll need:
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_SESSION_TOKEN
These can all be retrieved by following the AWS documentation.
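Before starting the containers, it can help to confirm that none of these values is missing or still set to a placeholder. The following is a minimal sketch of such a check; the missing_credentials helper is hypothetical, and treating any value that starts with YOUR_ as a placeholder is an assumption based on the sample Compose file above.

```python
import os

# Variable names taken from the environment section of the cloudquery service.
REQUIRED_VARS = (
    "CLOUDQUERY_API_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
)


def missing_credentials(env=None):
    """Return the required variable names that are unset, empty, or still placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        # Assumption: sample placeholders such as YOUR_API_KEY_GOES_HERE start with YOUR_.
        if not value or value.startswith("YOUR_"):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_credentials()
    if problems:
        print("Missing or placeholder values:", ", ".join(problems))
```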
Note: If you want us to cover GCP or Azure CSPMs, let us know on our Discord.
Next, in your Docker Compose config file, you need to add the config declarations for CloudQuery:
configs:
  cloudquery_config.yml:
    file: ./config.yml
This tells Docker Compose to supply the config file (that you’ll create in the next step) to the CloudQuery container.
The final piece of the Docker Compose file you need to define is the volumes, which will enable us to maintain our Grafana state and Postgres database even if the containers get stopped.
volumes:
  db:
    driver: local
  grafana:
    driver: local

Building the CloudQuery Config

Now that you have the Docker Compose file ready, it’s time to write the CloudQuery config file. You can pick up the basic configuration for our chosen cloud platforms from the CloudQuery Hub. For this tutorial, you’re using the CloudQuery AWS plugin. In the contents menu on the left-hand side, you’ll see Configuration. If you click that, it’ll bring you down to the basic example. Copy that into a new file called config.yml.
The basic example only syncs aws_ec2_instances. A full-fledged CSPM needs more tables than that to feed its compliance dashboards, so we recommend replacing the tables line with the following:
tables:
  - "aws_apigateway_rest_api_stages"
  - "aws_apigatewayv2_api_stages"
  - "aws_apigatewayv2_api_routes"
  - "aws_autoscaling_groups"
  - "aws_codebuild_projects"
  - "aws_config_configuration_recorders"
  - "aws_cloudwatch_alarms"
  - "aws_cloudtrail_trail_event_selectors"
  - "aws_cloudwatchlogs_metric_filters"
  - "aws_cloudfront_distributions"
  - "aws_iam_accounts"
  - "aws_iam_credential_reports"
  - "aws_iam_password_policies"
  - "aws_iam_users"
  - "aws_ec2_network_acls"
  - "aws_ec2_security_groups"
  - "aws_efs_access_points"
  - "aws_elasticbeanstalk_environments"
  - "aws_elbv1_load_balancers"
  - "aws_elbv2_load_balancers"
  - "aws_rds_clusters"
  - "aws_sns_subscriptions"
  - "aws_s3_accounts"
Next, you’ll need a destination plugin, so head back to the CloudQuery Hub, click Explore, and then Destinations. For this example, you’ll be using PostgreSQL, so find that using the search or by scrolling down the list. You can sync your AWS data to any other destination, though, and if your database isn’t there, you can build your own custom plugin. At the bottom of the config file, add a new line that contains --- and paste in the example config for the Postgres plugin. Check that the destinations entry in the source spec names this destination (postgresql). The end of the file should look something like this:
# enable_api_level_tracing: false
---
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.0.8"
  spec:
    connection_string: "postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:5432/${POSTGRES_DB}?sslmode=disable"
And with that, the CloudQuery Config is ready. Now from your terminal, you can run docker compose up. This will start the Postgres and Grafana instances and will run the CloudQuery job until it is complete, but the DBT instance will fail.
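To see what the connection_string resolves to with the values from the Compose file, the sketch below mimics the ${NAME} substitution in plain Python. It only imitates the behavior for illustration and is not CloudQuery's own implementation; the expand helper is hypothetical.

```python
import os
import re

# Same template as the connection_string in the destination spec.
TEMPLATE = (
    "postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}"
    "@${POSTGRES_HOST}:5432/${POSTGRES_DB}?sslmode=disable"
)


def expand(template, env=None):
    """Replace each ${NAME} with its value, raising KeyError if a variable is unset."""
    env = os.environ if env is None else env
    return re.sub(r"\$\{(\w+)\}", lambda match: env[match.group(1)], template)
```

With the values used in this tutorial, the result is postgresql://postgres:postgres@db:5432/cspm?sslmode=disable. Note that a password containing characters such as @ or / would need percent-encoding before it can be placed in a URI like this.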

Prebuilt Transformations (Queries and Frameworks)

The DBT container failed because you haven’t given it any configuration yet. So let’s fix that next.
To make this easier, CloudQuery offers many data transformations, including security and compliance frameworks such as PCI-DSS, CIS, and Foundational Security Best Practices, packaged as DBT projects. To start, go to the Transformations section of CloudQuery Hub and select AWS Compliance. Go ahead and download the pack and extract it to your project folder. Next, you’ll need to add a volumes declaration to the dbt service in your Docker Compose file. In this, you’ll tell Docker where to mount this DBT project folder. This should look something like this:
volumes:
  - type: bind
    source: ./cloudquery_transformation_aws-compliance-free_vX.X.X
    target: /usr/app
Note: If you’re copying this sample directly into your Docker Compose file, make sure you set the version number to match the one you’ve downloaded.
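Now, you need to provide DBT with a profile. DBT uses profiles to define how to connect to the relevant databases for the project. To do this, add a configs declaration below the volumes one in the dbt service, and register dbt-profiles.yml under the top-level configs key (next to cloudquery_config.yml) with file: ./dbt-profiles.yml so that Docker Compose knows where to find it. The service-level declaration should look something like this: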
configs:
  - source: dbt-profiles.yml
    target: /root/.dbt/profiles.yml
Finally, you need to define the dbt-profiles.yml file itself:
config:
  send_anonymous_usage_stats: False
  use_colors: True

aws_compliance:
  target: postgres
  outputs:
    postgres:
      type: postgres
      host: "{{ env_var('POSTGRES_HOST') }}"
      user: "{{ env_var('POSTGRES_USER') }}"
      pass: "{{ env_var('POSTGRES_PASSWORD') }}"
      port: 5432
      dbname: "{{ env_var('POSTGRES_DB') }}"
      schema: public
      threads: 4
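Run docker compose up again to launch Postgres and Grafana and to (re)run CloudQuery and the DBT transformations. The dbt service only waits for the database container, not for the sync, so if the DBT run starts before CloudQuery has finished, rerun just the transformations afterwards with docker compose up dbt.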

Load Grafana Dashboard

Now for the dashboard that pulls information dynamically from our PostgreSQL database. Fortunately, CloudQuery provides a range of pre-built dashboards in CloudQuery Hub. As you’re using AWS in this proof of concept, select the AWS Compliance visualization and then click Download Now and extract the zip file.
The Grafana instance that launched as part of our docker-compose should be available at localhost:3000. If you haven’t already, you might need to set a password for the admin account before proceeding. If Grafana does not yet have a PostgreSQL data source, add one first (host db:5432, database cspm, user and password postgres, TLS/SSL mode disabled) so the dashboard has something to query. To import the prepared dashboard, select the hamburger menu from the top left of the window and click Dashboards. Then click the blue New button on the top right and then Import. In the extracted zip file, you’ll need to navigate to build>aws_compliance>grafana>postgres where you’ll find a file called compliance.json. Drag this into the Upload dashboard JSON file region and click Import. This will load your dashboard and default to the Foundational Security Best Practices framework.

Building custom dashboards

Now, obviously using a dashboard that’s been prepared for you is better than nothing. But ultimately, one of the key benefits of using Grafana is that you have unlimited flexibility in how you display your data. Perhaps you want to see an overview of all the Policy Pass/Fail Distributions at the same time or a dashboard that just shows the failing policy results so you know what to prioritize during the next standup. Or with a little extra work in DBT, you could even build a time series chart to show off your team's improvements. After all, raising the visibility of security issues without sounding like a squeaky wheel is how you get the time and resource allocations you need to fix them.
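As a starting point for panels like these, the sketch below shows the two kinds of query involved: a pass/fail distribution per framework and a list of failing checks. The findings table and its columns are hypothetical stand-ins (the views produced by the prebuilt transformations have their own names and columns), and SQLite is used in place of Postgres so the example runs on its own.

```python
import sqlite3

# Query for a pass/fail distribution panel (one row per framework and status).
DISTRIBUTION_SQL = (
    "SELECT framework, status, COUNT(*) FROM findings "
    "GROUP BY framework, status ORDER BY framework, status"
)
# Query for a "what to fix next" panel listing only failing checks.
FAILING_SQL = (
    "SELECT DISTINCT check_id FROM findings "
    "WHERE status = 'fail' ORDER BY check_id"
)


def summarize(findings):
    """Return (distribution, failing_check_ids) for rows of (framework, check_id, status)."""
    conn = sqlite3.connect(":memory:")
    # findings is a hypothetical table used only for this illustration.
    conn.execute("CREATE TABLE findings (framework TEXT, check_id TEXT, status TEXT)")
    conn.executemany("INSERT INTO findings VALUES (?, ?, ?)", findings)
    distribution = conn.execute(DISTRIBUTION_SQL).fetchall()
    failing = [row[0] for row in conn.execute(FAILING_SQL)]
    conn.close()
    return distribution, failing
```

In Grafana, queries of this shape would go into the query editor of a panel backed by the PostgreSQL data source, adapted to the actual view and column names in your database.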

Conclusion

By following this tutorial, you have successfully built an extensible Cloud Security Posture Management (CSPM) solution powered by CloudQuery, PostgreSQL, DBT, and Grafana. You’ve learned how to:
  • Sync AWS infrastructure data using CloudQuery.
  • Store and manage data in PostgreSQL.
  • Transform data with DBT.
  • Visualize data through customizable dashboards in Grafana.
This setup not only enhances your cloud security but also provides valuable insights and flexibility for monitoring compliance. Ready to dive deeper? Join the CloudQuery Discord community to connect with other users and experts. Alternatively, try out CloudQuery locally with our quickstart guide or explore CloudQuery Cloud (currently in beta) for a more scalable solution.

FAQs

Q: What is CSPM? A: Cloud Security Posture Management (CSPM) is a solution that helps manage the security posture of cloud infrastructure by continuously monitoring, identifying risks, and ensuring compliance with industry standards.
Q: Why use Docker Compose for building the CSPM solution? A: Docker Compose allows for easy setup and management of the different components in isolated containers, simplifying the local development environment.
Q: What role does CloudQuery play in this setup? A: CloudQuery extracts data from AWS infrastructure and loads it into PostgreSQL, making it available for transformation and visualization.
Q: How does DBT fit into the CSPM architecture? A: DBT transforms the raw data stored in PostgreSQL into structured formats that can be easily analyzed and visualized in Grafana.
Q: What kind of visualizations can be created in Grafana with this setup? A: Grafana can create dashboards that show policy compliance status, security posture overviews, and time series charts of infrastructure improvements.
Q: What is the purpose of the CloudQuery config file? A: The CloudQuery config file specifies which data tables to sync from AWS and the connection details for the PostgreSQL database.
Q: Is there a scalable option for using CloudQuery? A: Yes, you can explore CloudQuery Cloud, currently in beta, for a more scalable solution.

Code Samples

docker-compose.yml

services:
  cloudquery:
    image: ghcr.io/cloudquery/cloudquery:latest
    environment:
      CLOUDQUERY_API_KEY: YOUR_API_KEY_GOES_HERE
      AWS_ACCESS_KEY_ID: YOUR_ACCESS_KEY_ID
      AWS_SECRET_ACCESS_KEY: YOUR_SECRET_ACCESS_KEY
      AWS_SESSION_TOKEN: YOUR_SESSION_TOKEN
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_HOST: db
      POSTGRES_DB: cspm
    command:
      - "sync"
      - "/cloudquery_config.yml"
    configs:
      - cloudquery_config.yml
  db:
    image: postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: cspm
    ports:
      - "5432:5432"
    volumes:
      - db:/var/lib/postgresql/data
  dbt:
    image: ghcr.io/dbt-labs/dbt-postgres:1.7.2
    depends_on:
      - db
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_HOST: db
      POSTGRES_DB: cspm
    command:
      - "run"
    volumes:
      - type: bind
        source: ./cloudquery_transformation_aws-compliance-free_vX.X.X
        target: /usr/app
    configs:
      - source: dbt-profiles.yml
        target: /root/.dbt/profiles.yml
  grafana:
    image: grafana/grafana
    restart: unless-stopped
    depends_on:
      - db
    ports:
      - 3000:3000
    volumes:
      - grafana:/var/lib/grafana
# Top-level declarations referenced by the services above
configs:
  cloudquery_config.yml:
    file: ./config.yml
  dbt-profiles.yml:
    file: ./dbt-profiles.yml
volumes:
  db:
    driver: local
  grafana:
    driver: local

config.yml

CloudQuery configuration file.
kind: source
spec:
  # Source spec section
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: "v26.4.0"
  tables:
    - "aws_apigateway_rest_api_stages"
    - "aws_apigatewayv2_api_stages"
    - "aws_apigatewayv2_api_routes"
    - "aws_autoscaling_groups"
    - "aws_codebuild_projects"
    - "aws_config_configuration_recorders"
    - "aws_cloudwatch_alarms"
    - "aws_cloudtrail_trail_event_selectors"
    - "aws_cloudwatchlogs_metric_filters"
    - "aws_cloudfront_distributions"
    - "aws_iam_accounts"
    - "aws_iam_credential_reports"
    - "aws_iam_password_policies"
    - "aws_iam_users"
    - "aws_ec2_network_acls"
    - "aws_ec2_security_groups"
    - "aws_efs_access_points"
    - "aws_elasticbeanstalk_environments"
    - "aws_elbv1_load_balancers"
    - "aws_elbv2_load_balancers"
    - "aws_rds_clusters"
    - "aws_sns_subscriptions"
    - "aws_s3_accounts"
  destinations: ["postgresql"]
---
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.0.8"

  spec:
    connection_string: "postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:5432/${POSTGRES_DB}?sslmode=disable"

dbt-profiles.yml

config:
  send_anonymous_usage_stats: False
  use_colors: True

aws_compliance:
  target: postgres
  outputs:
    postgres:
      type: postgres
      host: "{{ env_var('POSTGRES_HOST') }}"
      user: "{{ env_var('POSTGRES_USER') }}"
      pass: "{{ env_var('POSTGRES_PASSWORD') }}"
      port: 5432
      dbname: "{{ env_var('POSTGRES_DB') }}"
      schema: public
      threads: 4

© 2024 CloudQuery, Inc. All rights reserved.