How to Visualize CloudQuery Data with Apache Superset (Preset)
In this guide, we will walk you through how to set up CloudQuery to build your cloud asset inventory in PostgreSQL and connect it to Apache Superset (opens in a new tab) (or a hosted version such as preset.io (opens in a new tab)) for visualization, monitoring and reporting.
- ETL (Extract-Transform-Load) ingestion layer: CloudQuery (opens in a new tab)
- Datastore: PostgreSQL
- Data Visualization and Exploration Platform: Apache Superset (opens in a new tab)
- Raw SQL access to all your cloud asset inventory to create views or explore any questions or connection between resources.
- Multi-Cloud Asset Inventory: Ingest configuration from all your clouds to a single datastore with a unified structure.
- Avoid yet-another-dashboard fatigue: Use you existing Apache Superset (opens in a new tab) setup as a best in class visualization platform to build cloud asset inventory.
If you are already familiar with CloudQuery, take a look at how to deploy it to AWS on Amazon Aurora and EKS at here (opens in a new tab).
Apache Superset (opens in a new tab) is an open source Data Visualization and Data Exploration Platform so there are a number of ways to deploy it:
- Self-hosted (local, docker, k8s): Official guide.
- SaaS/managed: Preset.io (opens in a new tab)
- Cloud Marketplaces:
By default RDS Aurora instances are not accessible from the public internet. In order to enable access by Preset you are going to have to update your security groups to include the IP ranges that Preset publishes (or alternatively look at the
publicly_accessible variable in our terraform modules - aws (opens in a new tab) , GCP (opens in a new tab)). If you deploy it in your own VPC you might be able to connect it in your private network.
Now you can connect preset to your PostgreSQL database by clicking “New Database”, choosing PostgreSQL and filling-in the following form:
If all your credentials are correct, click connect and then finish
To be able to create charts that you will later add to a dashboard you first need to create a dataset (for full details see Creating Charts and Dashboards in Superset (opens in a new tab)).
In this guide we will create a dataset from a view (opens in a new tab) (also, see blog (opens in a new tab)) we already created on our database but you can also create a dataset from any complex query you can think off in Superset SQL Lab.
Now you are ready to create your first chart on top of our AWS Resource View (opens in a new tab) (Also, see blog (opens in a new tab))! Choose the dataset you created in the last step (in this case
aws_resources) and choose the chart type you want to create (we will take the Bar Chart but you can also change this in the chart screen).
If you are familiar with the Superset UI this step is pretty easy where you just choose or drag and drop the columns you want to visualize (and you can always revert to plain SQL if needed).
Click Save and choose the Dashboard you want to add the chart (or create a new one if you don’t have one) and then click Save again or Save & Go to Dashboard
You can now repeat steps 4-5 or 3-5 to create all charts you need and add them to your dashboard/s.
Now we are at the final stage where you can edit, resize and tweak the charts/widgets so it looks something like this:
You can add Filters to make the dashboard more interactive and you can now share this with other team members, send links, image of dashboards as well as periodic alerts and reports via Superset.
In this post we showed you how to build an open-source cloud asset inventory with CloudQuery as the ETL (Extract-Transform-Load) / data-ingestion layer and Apache Superset as the visualization and monitoring platforms. This approach eliminates the yet-another-dashboard fatigue and gives you the ability to pick the best-in-class visualization tools or reuse your current stack.