6 Cloud CMDB Best Practices for Platform Engineers (2026 Guide)

You manage 4,782 resources across 6 cloud accounts. Can you answer these questions in under 5 minutes?
  • Which S3 buckets are publicly accessible right now?
  • Which IAM roles haven't been used in 90 days?
  • What changed in production yesterday between 2 p.m. and 3 p.m.?
That RDS database from the abandoned prototype? Still running. The EC2 instance someone spun up "just to test something real quick"? Racked up $4,000 last month. Your CMDB knows about none of this because Bob updated it in February 2023 and nobody's touched it since.
Traditional CMDBs were built for servers that lived 3-5 years, not containers that live 3 minutes. Manual updates fail within weeks as engineers forget to document changes (then forget the CMDB exists entirely).
This guide covers 6 cloud CMDB best practices based on ITIL standards and real-world implementations managing thousands of cloud accounts. The architectural patterns apply regardless of tooling. Whether you're evaluating cloud CMDB solutions or improving an existing implementation, these practices will help you build infrastructure visibility that platform engineering teams actually use.

Why Traditional CMDB Approaches Fail in Cloud Environments #

Most cloud CMDB implementations face these problems:
Data Staleness:
Traditional CMDBs sync weekly or daily. The majority of containers in production have short lifespans, often measured in minutes rather than hours or days [Datadog Container Report]. Lambda functions execute in milliseconds and their instances are recycled after periods of inactivity [AWS Lambda documentation]. AWS spot instances can be terminated with just a 2-minute warning [AWS EC2 documentation]. Weekly discovery misses ephemeral resources entirely.
Manual Update Death Spiral:
Week 1: Engineers manually enter new infrastructure. Week 3: Engineers forget to update entries. Month 3: Cloud CMDB data is 40% stale; nobody trusts it. Month 6: Project abandoned.
Multi-Cloud Blind Spots:
Separate tools for AWS, GCP, Azure. No unified query interface across clouds. Platform teams maintain 3+ different systems.
High Costs, Low Adoption:
Enterprise CMDB solutions can cost tens to hundreds of thousands of dollars annually in licensing fees. Proprietary UIs that platform engineers don't want to learn. No integration with existing tools.
Compliance Gaps:
No historical data retention. Can't answer auditor questions: "Prove S3 buckets had encryption on October 15." Incident investigations limited: "What changed yesterday at 3 p.m.?"

Best Practice 1: Automate Discovery from Cloud APIs #

ITIL recommends automated discovery tools to maintain CMDB accuracy [ITIL CMDB guidance]. Manual discovery doesn't scale to cloud velocity.
Why manual updates fail: Engineers document infrastructure when it's created. Three weeks later, autoscaling adds 50 instances. Six weeks later, someone deletes half the dev environment. Nobody updates the CMDB. Month 3: your cloud CMDB data is 40% stale. Month 6: engineers start calling it "the spreadsheet that lies."
Why automation works: AWS, GCP, and Azure already track every resource via APIs. Extract this data automatically rather than asking engineers to document what the cloud already knows.
Implementation: Tools like CloudQuery, Steampipe, and AWS Config extract cloud data automatically via API calls. Choose which resources to track (EC2 instances, S3 buckets, IAM roles) and sync frequency. The tool handles API authentication, pagination, and rate limiting.

Best Practice 2: Start with High-Value Resources, Expand Deliberately #

ITIL says: Include only Configuration Items (CIs) necessary for service delivery [ITIL CMDB standards]. Translation: don't sync everything just because you can.
AWS has 240+ services [AWS services]. GCP and Azure each have 100+. The "let's sync everything!" approach leads to:
  • API rate limits (AWS will politely throttle you, then less politely)
  • Database bloat (10 million low-value records nobody queries)
  • Query performance death spiral (finding signal in noise becomes impossible)
  • Sync jobs taking hours instead of minutes
Start here: IAM policies/roles, security groups, EC2 instances, RDS databases, S3 buckets. Expand when teams ask: "Can we track ACM certificates?" (security team) or "Can we get billing resources?" (FinOps team).
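A query like the following is the kind of high-value question this starter scope answers (PostgreSQL syntax, like the other examples in this post). This is a hedged sketch: the aws_iam_roles table and role_last_used column follow CloudQuery's AWS plugin, but names and key casing may differ in your tool's schema.
-- Sketch: IAM roles with no recorded use in the last 90 days
-- Assumes a role_last_used JSON column; key names vary by tool, so check your schema
SELECT
  account_id,
  arn,
  role_last_used->>'LastUsedDate' AS last_used_date
FROM aws_iam_roles
WHERE role_last_used->>'LastUsedDate' IS NULL
   OR (role_last_used->>'LastUsedDate')::timestamptz < NOW() - INTERVAL '90 days';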

Best Practice 3: Use Tiered Sync Frequencies #

Not all resources need the same sync frequency. Security teams need IAM changes visible within 30 minutes. FinOps teams analyzing cost trends can wait for daily snapshots. One-size-fits-all sync schedules waste API quota and money.
  • Critical (every 15-30 min): IAM policies/roles, security groups, NACLs, WAF rules, public S3 buckets, TLS certificates. Why: fast threat detection (an unauthorized IAM role created means you want to know in 30 minutes, not 24 hours). Trade-off: higher API costs, justified by security value.
  • Operational (hourly): EC2, RDS, S3 configs, Lambda, Kubernetes clusters, ECS services. Why: troubleshooting ("Which EC2 instances are running?") and cost attribution. Trade-off: moderate API costs; operational queries tolerate 1-hour staleness.
  • Baseline (daily): dev/test environments, archival storage, VPC configs, CloudTrail logs. Why: trend analysis ("How many EC2 instances were we running 6 months ago?") and compliance snapshots. Trade-off: minimal API costs, broadest resource coverage.
Implementation: Use your scheduler (Kubernetes CronJobs, AWS EventBridge, systemd timers) to run different configs at different intervals:
*/30 * * * * sync-tool sync critical-resources-config  # Every 30 min
0 * * * * sync-tool sync operational-resources-config  # Hourly
0 2 * * * sync-tool sync baseline-resources-config     # Daily at 2am
Real-world scale: an organization with 1,000 AWS accounts syncs millions of IAM/security group records every 30 minutes, tens of millions of EC2/RDS/S3 records hourly, and hundreds of millions of historical records daily. Sync tools handle API rate limiting automatically.
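To confirm each tier is actually running on schedule, check the newest sync timestamp per table. A hedged sketch, assuming each row carries a sync_time column as in the historical-data examples later in this post (CloudQuery, for instance, stamps rows with _cq_sync_time):
-- Sketch: most recent sync per table, to spot a tier that has stopped syncing
SELECT 'aws_iam_roles' AS table_name, MAX(sync_time) AS last_sync FROM aws_iam_roles
UNION ALL
SELECT 'aws_ec2_instances', MAX(sync_time) FROM aws_ec2_instances
UNION ALL
SELECT 'aws_s3_buckets', MAX(sync_time) FROM aws_s3_buckets
ORDER BY last_sync;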

Choosing Your Cloud CMDB Approach #

Before implementing these practices, choose your technical approach. The architectural patterns (tiered sync, centralized database, SQL queries) apply regardless of tooling.
  • CloudQuery: open-source ELT tool that syncs cloud APIs to your database (PostgreSQL, BigQuery, Snowflake). Strengths: pre-built connectors for 200+ services (AWS, GCP, Azure, K8s), incremental sync, you control the database. Limitations: requires infrastructure (K8s/Docker or the managed platform) and a YAML config learning curve. Best for: 100+ cloud accounts, multi-cloud, teams comfortable with ELT tools.
  • Steampipe: queries live cloud APIs directly via PostgreSQL foreign tables. Strengths: zero ETL, simple setup, strong compliance focus, active community. Limitations: queries hit live APIs (slower, subject to rate limits); no historical data without external storage. Best for: security teams, compliance audits, proof-of-concepts.
  • AWS Config Advanced Query: AWS-native service recording resource configurations. Strengths: fully managed, native AWS compliance integration, built-in conformance packs. Limitations: AWS-only, proprietary query language (not SQL), 7-year retention limit. Best for: AWS-only shops already invested in AWS Config.
  • Build your own: custom scripts using cloud SDKs (boto3, google-cloud-python). Strengths: complete control, no third-party dependencies, optimized for your use case. Limitations: 3-6 months of development time, ongoing maintenance burden, custom error handling and rate limiting. Best for: large teams with specific requirements or strict data residency rules.
Recommendation: for most teams managing 50+ cloud accounts, use CloudQuery (historical data + multi-cloud) or Steampipe (current state + compliance). Use AWS Config if you're AWS-only. Build your own only if you have the bandwidth to maintain it.

Best Practice 4: Design for Multi-Account and Multi-Cloud from Day One #

Most organizations use multiple AWS accounts (10+ accounts for dev, staging, prod, per-team, per-region) [AWS multi-account best practices]. Many run multi-cloud (AWS + GCP for redundancy, Azure for that one service your VP insisted on). Single-account cloud CMDB implementations look great in the proof-of-concept, then collapse spectacularly when account #11 appears.
The centralized data model: All cloud accounts sync to a single destination database (PostgreSQL, BigQuery, Snowflake). account_id becomes a dimension in every table, enabling cross-account queries like "Find all unencrypted RDS instances across all AWS accounts."
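For example, the unencrypted-RDS question becomes a single query once account_id is a column. A hedged sketch (the storage_encrypted column follows CloudQuery's aws_rds_instances schema; adjust for your tool):
-- Sketch: unencrypted RDS instances across every synced account
SELECT
  account_id,
  region,
  arn,
  db_instance_class
FROM aws_rds_instances
WHERE storage_encrypted = false
ORDER BY account_id, region;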
Tag everything consistently:
  • account_id (AWS account number, GCP project ID, Azure subscription ID)
  • environment (prod, staging, dev, test)
  • owner_team (for when you need to know who to blame—sorry, "collaborate with")
  • cost_center (financial attribution)
CloudQuery extracts native cloud tags automatically. Enrichment happens at query time (join cloud CMDB data with team directory, cost allocation tables).
Multi-cloud: CloudQuery supports AWS, GCP, Azure, and Kubernetes sources simultaneously. Store resources from all providers in a unified schema. Normalize where it makes sense (all compute instances → instances table). Accept cloud-specific differences where forcing unification doesn't add value (looking at you, Azure naming conventions).
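A minimal sketch of that normalization: a view that unions provider-specific tables into one instances table (column names follow CloudQuery's AWS and GCP plugins and may differ in your schema):
-- Sketch: one compute-instances view across AWS and GCP
CREATE VIEW all_compute_instances AS
SELECT
  'aws' AS cloud,
  account_id AS account,
  region AS location,
  instance_id AS id,
  instance_type AS size
FROM aws_ec2_instances
UNION ALL
SELECT
  'gcp' AS cloud,
  project_id AS account,
  zone AS location,
  name AS id,
  machine_type AS size
FROM gcp_compute_instances;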
Cross-account queries:
Note: All SQL examples use PostgreSQL syntax. Adapt for BigQuery or Snowflake as needed.
-- Cloud CMDB query: find public S3 buckets across all AWS accounts
SELECT
  account_id,
  arn,
  region,
  tags->>'Owner' as owner,
  tags->>'Environment' as environment
FROM aws_s3_buckets
WHERE
  block_public_acls = false
  OR block_public_policy = false
ORDER BY account_id, region;
Multi-account configuration:
AWS Organizations role assumption: Configure one role in the management account, CloudQuery assumes roles in member accounts automatically. No per-account credential management required (one less thing to forget to rotate).
GCP projects auto-discovered via Resource Manager API. Azure subscriptions discovered via Azure Resource Manager.
# AWS multi-account config
kind: source
spec:
  name: aws
  path: cloudquery/aws
  destinations: ['postgresql']
  spec:
    # Assume role in all organization accounts
    org:
      member_role_name: 'CloudQueryRole'

Best Practice 5: Integrate with Security, FinOps, and Incident Management Tools #

Your cloud CMDB isn't an end-user application. It's data infrastructure that feeds the tools platform teams already use. Engineers don't want another UI to learn when they already have Grafana dashboards, PagerDuty runbooks, and Backstage service catalogs.

Security: Datadog, Elastic, Splunk #

Sync cloud CMDB data to your SIEM for correlation. CloudQuery syncs to Datadog, Elasticsearch, and SQL databases that feed Splunk.
Alert on risky changes detected in your CMDB data:
  • Public S3 bucket created (someone's getting creative with permissions)
  • IAM policy grants *:* permissions (bold move)
  • TLS certificate expiring in <30 days
  • Security group allows 0.0.0.0/0 on port 22 (the "I'll fix this later" that never gets fixed)
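As one example, the last item in that list can be expressed as a query your SIEM or a scheduled job runs against the CMDB. A hedged sketch, assuming an aws_ec2_security_groups table with an ip_permissions JSONB column (as in CloudQuery's AWS plugin; names vary by tool):
-- Sketch: security groups allowing 0.0.0.0/0 on port 22
SELECT
  sg.account_id,
  sg.group_id,
  sg.group_name,
  sg.region
FROM aws_ec2_security_groups sg,
     jsonb_array_elements(sg.ip_permissions) AS perm,
     jsonb_array_elements(perm->'IpRanges') AS ip_range
WHERE ip_range->>'CidrIp' = '0.0.0.0/0'
  AND (perm->>'FromPort')::int <= 22
  AND (perm->>'ToPort')::int >= 22;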

FinOps: AWS Cost Explorer, GCP Billing, Azure Cost Management #

Join cloud CMDB resource data with billing data to answer: "Which team launched that $4,000/month EC2 instance nobody's SSH'd into since March?" CloudQuery syncs billing data from AWS, GCP, and Azure.
  • Attribute costs to teams via resource tags
  • Find idle RDS databases (zero connections for 7 days = expensive paperweight)
  • Identify rightsizing opportunities (EC2 with <5% CPU = money pit)
SQL Query Example:
-- FinOps: unused RDS instances (join cloud CMDB + CloudWatch metrics)
WITH connection_metrics AS (
  SELECT
    db_instance_identifier,
    MAX(database_connections) as max_connections
  FROM cloudwatch_rds_metrics
  WHERE timestamp > NOW() - INTERVAL '7 days'
  GROUP BY db_instance_identifier
)
SELECT
  r.account_id,
  r.arn,
  r.db_instance_class,
  r.estimated_monthly_cost,
  m.max_connections
FROM aws_rds_instances r
LEFT JOIN connection_metrics m
  ON r.db_instance_identifier = m.db_instance_identifier
WHERE m.max_connections = 0 OR m.max_connections IS NULL
ORDER BY r.estimated_monthly_cost DESC;

Developer Portals: Backstage, Port #

Show infrastructure dependencies in service catalogs. Link services to underlying cloud resources: "This service uses RDS instance db-prod-users in account 123456789012." CloudQuery syncs to PostgreSQL which Backstage reads directly.
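A hedged sketch of the lookup behind that link, assuming your teams apply a Service tag (the tag key and the users-api value are illustrative, not a Backstage or CloudQuery convention):
-- Sketch: cloud resources backing a given service, matched by tag
SELECT
  account_id,
  arn,
  tags->>'Service' AS service
FROM aws_rds_instances
WHERE tags->>'Service' = 'users-api';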

Incident Management: PagerDuty, Opsgenie #

Query during incidents: "What changed in production between 2 p.m. and 3 p.m. yesterday?" Feed infrastructure changes into incident timelines. Post-mortem queries: "What was the security group configuration when things exploded?" CloudQuery syncs cloud resource data from AWS, GCP, Azure, and Kubernetes.
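A hedged sketch of the "what changed between 2 p.m. and 3 p.m." question, assuming append-only snapshots with a sync_time column (see Best Practice 6). This version only catches resources that first appeared in the window; diffing configuration changes requires comparing the two surrounding snapshots:
-- Sketch: EC2 instances first seen between 2 p.m. and 3 p.m. yesterday
SELECT
  account_id,
  region,
  instance_id,
  MIN(sync_time) AS first_seen
FROM aws_ec2_instances
GROUP BY account_id, region, instance_id
HAVING MIN(sync_time) BETWEEN CURRENT_DATE - 1 + TIME '14:00'
                          AND CURRENT_DATE - 1 + TIME '15:00';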

BI & Dashboards: Grafana, Looker, Metabase #

CloudQuery syncs to PostgreSQL, BigQuery, Snowflake, ClickHouse, and 50+ other destinations. Standard SQL means any BI tool connects.

Best Practice 6: Retain Historical Data for Compliance and Debugging #

Auditors ask: "Prove all S3 buckets had encryption enabled on October 15." Engineers ask: "What changed between 2 p.m. and 3 p.m. yesterday?" You need point-in-time data.
Compliance retention requirements:
  • SOC 2: 1-2 years
  • ISO 27001: 2-3 years
  • HIPAA: 6 years
  • Financial services: 7 years
Append-only tables: Every sync writes new records with sync_time timestamps. Old records stick around. Simple to implement, easy to query specific points in time. Downside: storage grows forever (partition by month, archive old data to S3 Glacier).
Point-in-time queries:
-- Cloud CMDB: S3 bucket configuration as of October 15, 2024 at 3pm
-- Get most recent sync before target time for each bucket
WITH ranked_syncs AS (
  SELECT
    account_id,
    arn,
    block_public_acls,
    encryption_rules,
    sync_time,
    ROW_NUMBER() OVER (
      PARTITION BY arn
      ORDER BY sync_time DESC
    ) as rn
  FROM aws_s3_buckets
  WHERE sync_time <= '2024-10-15 15:00:00'
)
SELECT
  account_id,
  arn,
  block_public_acls,
  encryption_rules
FROM ranked_syncs
WHERE rn = 1;
Thousands of AWS accounts with 2-year retention = billions of records. PostgreSQL, BigQuery, and Snowflake handle this scale. Partition by month, drop old partitions after retention period ends.
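A hedged sketch of that partitioning pattern in PostgreSQL (table and column names are illustrative; apply it to whichever history tables you manage):
-- Sketch: range-partition a history table by month, then drop expired partitions
CREATE TABLE aws_s3_buckets_history (
  account_id        TEXT,
  arn               TEXT,
  block_public_acls BOOLEAN,
  encryption_rules  JSONB,
  sync_time         TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (sync_time);

CREATE TABLE aws_s3_buckets_history_2024_10
  PARTITION OF aws_s3_buckets_history
  FOR VALUES FROM ('2024-10-01') TO ('2024-11-01');

-- Once a month falls outside the retention window, archive it (e.g., to S3 Glacier) and drop it
DROP TABLE aws_s3_buckets_history_2022_09;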

Implementation #

Here's how to set this up:
Step 1: Choose Your Database
  • PostgreSQL: Best for simplicity; run on AWS RDS, GCP CloudSQL, or self-hosted
  • BigQuery: Best for massive scale (billions of records); Google Cloud native
  • Snowflake: Best if already in your stack; excellent query performance
Step 2: Deploy CloudQuery
Option A: CloudQuery CLI (Open-Source)
Runs on your infrastructure (Kubernetes, Docker, VMs). Full control over sync schedules and configuration. Download CloudQuery CLI.
Option B: CloudQuery Platform (Managed Service)
Managed scheduling via UI (no CronJob management). Asset inventory explorer for visual browsing. SQL console and RESTful API built-in. Try CloudQuery Platform.
Step 3: Configure Cloud Sources
Follow the CloudQuery quickstart guide to configure:
  • AWS (via AWS Organizations role assumption for multi-account)
  • GCP (via Resource Manager API for all projects)
  • Azure (via Azure Resource Manager for all subscriptions)
  • Kubernetes (via kubeconfig)
Step 4: Implement Tiered Sync Schedules
  • Critical resources (IAM, security groups): every 30 min
  • Operational resources (EC2, RDS, S3): hourly
  • Baseline resources (dev environments): daily
Step 5: Create Initial Queries
Start with high-value queries:
-- Public S3 buckets across all accounts
SELECT account_id, arn FROM aws_s3_buckets
WHERE block_public_acls = false;

-- Untagged EC2 instances
SELECT instance_id, tags FROM aws_ec2_instances
WHERE tags->>'Owner' IS NULL;
Step 6: Integrate with Existing Tools
Export cloud CMDB data to security scanner. Create Grafana dashboards from cloud CMDB SQL queries. Feed data into incident management system.

Summary #

Six practices that actually work:
  1. Automate discovery from cloud APIs (manual updates fail within weeks)
  2. Define clear scope (start security/compute; expand deliberately)
  3. Tiered sync strategies (critical 15-30 min, operational hourly, baseline daily)
  4. Multi-account design from day one (centralized database; query across accounts)
  5. Tool integration (SQL enables integration with security, FinOps, incident management)
  6. Historical data retention (2-7 years for compliance and incident investigation)
Organizations following ITIL CMDB best practices see 30% faster incident resolution [source]. ITSM teams using well-maintained CMDBs identify root causes 15% faster and resolve issues 30% faster [source].
CloudQuery provides both CLI (open-source, runs on your infrastructure) and Platform (managed service with UI, scheduling, reporting). Sync AWS, GCP, Azure, Kubernetes into PostgreSQL, BigQuery, or Snowflake. Query with standard SQL.

Frequently Asked Questions #

What is a cloud CMDB and why do platform engineers need it? #

A cloud CMDB (Configuration Management Database) tracks cloud infrastructure configuration items (CIs) such as EC2 instances, S3 buckets, and IAM roles across multi-account and multi-cloud environments. Unlike traditional CMDBs built for static servers, cloud CMDBs handle ephemeral resources (containers that live for minutes, Lambda functions that execute in milliseconds). Platform engineers need cloud CMDBs to answer questions like "Which S3 buckets are publicly accessible?" or "What changed in production yesterday at 3 p.m.?" within minutes, not hours.

How often should a cloud CMDB sync data? #

Use tiered sync strategies based on resource criticality. Critical security resources (IAM, security groups) sync every 15-30 minutes for fast threat detection. Operational resources (EC2, RDS) sync hourly for troubleshooting and cost attribution. Baseline resources (dev environments) sync daily for trend analysis. This balances data freshness against API rate limits and compute costs.

What's the difference between a cloud CMDB and a cloud asset inventory? #

Cloud asset inventory lists resources (EC2 instances, S3 buckets, IAM roles). Cloud CMDB adds configuration details, relationships between resources, historical tracking, and governance. CloudQuery provides both: asset inventory (comprehensive resource lists) and CMDB capabilities (time-series data, SQL queryability, compliance retention).

Can CloudQuery be used as a cloud CMDB? #

Yes. CloudQuery syncs cloud infrastructure data from AWS, GCP, Azure, Kubernetes into your database (PostgreSQL, BigQuery, Snowflake), providing the foundation for a modern cloud CMDB. Query with standard SQL, track historical changes, integrate with existing security/FinOps/incident management tools. CloudQuery Platform adds managed scheduling, asset inventory UI, and reporting; CLI provides open-source flexibility running on your infrastructure.

How do I handle multi-account AWS environments in my cloud CMDB? #

Use AWS Organizations role assumption. Configure one IAM role in your management account; CloudQuery assumes roles in all member accounts automatically. All accounts sync to centralized cloud CMDB database. Query across accounts with SQL: SELECT * FROM aws_s3_buckets WHERE account_id IN (...). No per-account credential management required.

What are the main cloud CMDB data quality challenges? #

The main challenges are stale data (sync failures mean outdated information), missing tags (untagged resources can't be attributed to teams), and incomplete coverage (critical resource types not being synced). Solutions: monitor sync success rates (target >99%), enforce tagging via AWS Config or Azure Policy, and start with high-value resources (security, compute), expanding based on user requests.

How much does it cost to run a cloud CMDB for large enterprise environments? #

Costs vary by database choice and sync frequency. Self-managed approaches using CloudQuery with PostgreSQL, BigQuery, or Snowflake are significantly more cost-effective than traditional enterprise CMDB licensing. Factors include database storage (for billions of records with multi-year retention) and compute resources for running sync jobs. CloudQuery's open-source CLI and managed Platform offer flexible pricing compared to traditional CMDB solutions.
