tutorial
How to Manage Resilience Strategy and Visibility with AWS Backup and CloudQuery
Jason Kao •
In this guide, we'll walkthrough how to use CloudQuery to manage a company resilience strategy utilizing AWS Backup to protect against data loss, support backup and recovery initiatives, and ensure compliance.
AWS Backup is a fully managed service that can be used for data protection at scale. By using CloudQuery, we can:
- Manage AWS Backup at scale across multiple AWS accounts.
- Monitor status of AWS Backup and compliance with company strategies or requirements.
- Monitor overall health of AWS Backup including failures, issues, and successes.
- Manage AWS Backup settings and reduce data protection issues.
An example AWS Backup Health Dashboard Powered by CloudQuery and Grafana is shown below. More information can be found here.
Walkthrough #
We'll use the following components for our architecture:
- AWS (Amazon Web Services) for infrastructure source
- CloudQuery for ETL
- PostgreSQL as the destination
- Grafana for a visualization layer
CloudQuery supports many more destinations and visualization layers that can be used in place of PostgreSQL and Grafana.
Prerequisites #
- AWS Account(s) with AWS Backup configured and running.
Step 1: Install or Deploy CloudQuery #
To get started with CloudQuery, additional information can be found in our quickstart guide.
Step 2: Sync Data from AWS to PostgreSQL #
AWS Backup supports multiple resources in AWS including RDS, EBS, S3, EFS, Redshift, Neptune, DocumentDB, and more. In this reference example, we'll sync data about AWS Backup, AWS S3, AWS DynamoDB, and AWS EC2. CloudQuery can be customized to sync data from other infrastructure in AWS as needed.
Additional information on configuring CloudQuery to sync from AWS can be found in our documentation as well as additional information on how to configure CloudQuery to sync from multiple AWS Accounts.
kind: source
spec:
# Source spec section
name: aws
path: cloudquery/aws
registry: cloudquery
version: "VERSION_SOURCE_AWS"
tables:
- aws_backup_*
- aws_s3_buckets
- aws_dynamodb_*
- aws_ec2_instances
destinations: ["DESTINATION_NAME"]
spec:
# AWS Spec section described below
regions:
- us-east-1
accounts:
- id: "account1"
local_profile: "account1-profile"
Step 3: Set up Grafana for Visualization #
Follow these Grafana guides to setup Grafana, an open-source observability and visualization tool.:
- Self-hosted Grafana Official Guide
- SaaS/Managed Grafana: Grafana
- AWS Fully-Managed Grafana: Amazon Managed Grafana
Make sure to configure Grafana to source data from the CloudQuery synced PostgreSQL data.
Step 4: Run Queries #
Now that we have CloudQuery syncing infrastructure data from AWS to PostgreSQL and a Grafana visualization layer, we can execute queries.
We'll start with the following query to search for all failed AWS Backup Jobs with their status messages for more information:
SELECT *
FROM aws_backup_jobs
WHERE state='FAILED';
Next, we can query our data for all resources backed up and protected by AWS Backup:
SELECT *
FROM aws_backup_protected_resources;
Another query we can run is to find a summary count of the types of resources (such as S3, DynamoDB, EC2) backed up by AWS Backup per AWS Account.
SELECT resource_type, COUNT(arn), account_id
FROM aws_backup_protected_resources
GROUP BY resource_type, account_id;
Additional examples that can be run are:
- Searching by tags if a tag-based backup strategy is being used.
- Querying the AWS Backup settings in each AWS account.
- Validating success of selected resources for AWS Backup.
Step 5: Create Custom Visualizations and Dashboards #
Now with Grafana, we can add custom visualizations and create dashboards to explore our Backup Data.
In this dashboard, we added the following visualizations:
- Backup Compliance Numbers by resources for DynamoDB, EC2, and S3.
- Protected Resources by AWS Backup
- Protected Resources Summarized by Type
- Backup Job Health by Status showing Failed Jobs, Successful Jobs, and Jobs with Issues.
Summary #
We have now covered how to use CloudQuery, Grafana, and PostgreSQL to visualize and manage resilience strategy with AWS. For more information, check out our additional offerings and solutions or reach out to us on Discord and GitHub.