Cloud CMDB vs Traditional CMDB: A 2026 Comparison
Traditional CMDBs were built for a world where servers had names. In 2026, your infrastructure doesn't even exist until someone invokes it.
That execution environment handling your Lambda cold start? It lives for 45 seconds. Traditional CMDB discovery would schedule a scan for tomorrow morning. By then, it is gone, replaced by 300 others just like it.
The assumptions that made CMDBs valuable in 2006 (permanent infrastructure, scheduled changes, named servers) no longer hold. Cloud infrastructure is code. Resources are ephemeral. APIs return current state in under a second.
Traditional CMDBs scan infrastructure with agents on schedules measured in hours or days. Cloud CMDBs query APIs continuously in seconds. Traditional assumes permanence and forces cloud resources into 20-year-old ITIL data models. Cloud-native approaches treat infrastructure as ephemeral data, queried on-demand through SQL. Implementation time: 2-3 months minimum vs hours. Gartner reports 70-80% of traditional CMDB projects fail to deliver value.
Key Architectural Differences Between Traditional and Cloud CMDBs #
Traditional CMDB implementations take 2-3 months minimum according to ServiceNow timelines. Cloud CMDB implementations using tools like CloudQuery take hours.
The technical differences explain the time gap. Traditional CMDBs use agent-based discovery with scheduled scans and proprietary data models. Cloud CMDBs use API-based continuous sync where AWS DescribeInstances returns current state in under a second. Traditional ITIL Configuration Items capture around 10 attributes per resource. Native cloud schemas capture 50+ attributes per EC2 instance alone.
Terminology #
Before we go deeper, let's define what we're actually talking about. A CMDB (Configuration Management Database) stores information about IT infrastructure components and their relationships. Traditional CMDBs follow the ITIL framework using Configuration Items (CIs)—basically templates for servers, applications, and databases.
A Configuration Item (CI) is ITIL-speak for infrastructure components. You've got Server CIs, Application CIs, Database CIs. Each follows a rigid template.
The MID Server is middleware that traditional CMDBs use to run discovery probes. It sits between your CMDB and your infrastructure, constantly scanning for changes.
API-Based Discovery is the cloud-native alternative. Instead of installing agents, you just call cloud provider APIs like AWS DescribeInstances to get current infrastructure state.
Ephemeral Infrastructure means resources that exist temporarily—sometimes for just seconds. Auto-scaling instances, Lambda execution environments, spot instances. Traditional CMDB discovery can't capture these because they're gone before the next scan runs.
Part 1: Philosophical Differences #
Traditional CMDBs and cloud-native approaches start from fundamentally different assumptions about how infrastructure works.
Traditional CMDB Assumptions #
Traditional CMDBs assume infrastructure is permanent. Servers have names like "web-prod-01" and run for months or years. You own them, rack them, maintain them. Discovery agents find these servers and record their configurations. Changes flow through an approval process: Change Advisory Boards (CABs) review modifications before implementation, then discovery scans run to update the CMDB.
Since you don't control the infrastructure APIs, discovery agents probe networks to find what exists. This takes time and frequently breaks. Everything gets modeled as Configuration Items using ITIL's rigid CI types: servers, applications, databases, relationships. All infrastructure must fit these templates.
Here's a real-world example: Server "web-prod-01" deployed in 2019, running CentOS 7, hosting the customer portal. The CMDB tracks its hardware specs, installed software, network connections, and relationships to other CIs. This worked when servers lived for years.
Cloud-Native Assumptions #
Cloud-native approaches assume infrastructure is code. Terraform creates resources. Kubernetes orchestrates containers. Auto-scaling groups spin up instances based on demand. Nothing is permanent. Teams deploy 100 times per day. Auto-scaling creates and destroys instances every few minutes. Traditional change approval doesn't scale.
APIs already know everything. AWS's DescribeInstances returns current EC2 state in under a second. No agents. No discovery probes. Resources are ephemeral: Lambda functions execute 10,000 times today on infrastructure that didn't exist yesterday. Auto-scaling groups terminate instances after minutes of use.
Here's the cloud equivalent: Lambda function executes 10,000 times today across 300 temporary compute instances. Each instance lives 30-60 seconds. Traditional CMDB discovery running every 24 hours would never capture this.
The questions each system can answer reveal the gap: a traditional CMDB answers "When did web-prod-01 last reboot?" A cloud CMDB answers "Which of the 847 EC2 instances running right now are unencrypted?"
One assumes permanence. The other assumes continuous change.
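To put a number on the gap, here is a minimal sketch of how likely a periodic, point-in-time scan is to see a short-lived resource at all. It assumes each scan is instantaneous and that the resource starts at a uniformly random moment relative to the schedule; `capture_probability` is a name made up for this illustration.

```python
def capture_probability(lifetime_s: float, scan_interval_s: float) -> float:
    """Chance that a periodic point-in-time scan sees a resource at least once.

    Assumption: the resource starts at a uniformly random offset within the
    scan interval, and each scan observes a single instant.
    """
    if lifetime_s <= 0 or scan_interval_s <= 0:
        raise ValueError("lifetime and scan interval must be positive")
    return min(1.0, lifetime_s / scan_interval_s)


if __name__ == "__main__":
    schedules = [("daily scan", 24 * 60 * 60), ("hourly sync", 60 * 60), ("1-minute sync", 60)]
    for label, interval_s in schedules:
        p = capture_probability(45, interval_s)
        print(f"{label:>14}: {p:.4%} chance of seeing a 45-second resource")
```

Under these assumptions the odds scale linearly with how often you look, which is why the collection interval matters more than any other property of the discovery system.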
Part 2: How Does Traditional CMDB Work? #
Traditional CMDB architecture relies on layers of discovery infrastructure. You install agents on every server. Those agents report to a MID Server. The MID Server runs discovery probes. The discovery engine identifies resources. A reconciliation engine deduplicates. Data gets stored in a proprietary CMDB. Finally, the UI renders data—hours after the initial collection.
The bottlenecks are obvious once you try this with cloud infrastructure. Agent installation requires access to every system, but cloud instances scale dynamically. How do you install agents on ephemeral resources? The MID Server becomes a single point of failure - if it goes down, discovery stops. Discovery schedules create data lag. A scan that runs every 24 hours means your data is up to 24 hours stale. Reconciliation logic frequently breaks when resources change rapidly or use dynamic naming.
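The reconciliation problem with dynamic naming can be sketched in a few lines. The records below are hypothetical discovery results in which a replacement instance reuses its predecessor's hostname; grouping by hostname merges two distinct machines into one CI, while grouping by the provider's instance ID does not.

```python
from collections import defaultdict

# Hypothetical discovery results: the second record is a replacement
# instance that reuses the hostname of the one it replaced.
DISCOVERED = [
    {"hostname": "web-asg-1", "instance_id": "i-0aaa", "seen": "2026-01-10T06:00"},
    {"hostname": "web-asg-1", "instance_id": "i-0bbb", "seen": "2026-01-11T06:00"},
    {"hostname": "web-asg-2", "instance_id": "i-0ccc", "seen": "2026-01-11T06:00"},
]


def group_records(records, key):
    """Group discovered records by the chosen identity key."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


if __name__ == "__main__":
    by_name = group_records(DISCOVERED, "hostname")
    by_id = group_records(DISCOVERED, "instance_id")
    print(len(by_name), "CIs when reconciling on hostname")
    print(len(by_id), "resources when keyed on instance_id")
```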
Time to value: minimum 2-3 months for small projects according to typical ServiceNow timelines. Full enterprise rollouts take 6-18 months.
Part 3: How Does Cloud CMDB Work? #
Cloud CMDB architecture is simpler because it doesn't fight against cloud infrastructure—it works with it. Call cloud provider APIs (AWS, GCP, Azure). Extract resource data with all attributes. Load into a SQL database (PostgreSQL, BigQuery). Query on-demand with standard SQL.
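The extract, load, query flow can be sketched end to end with nothing but the standard library. This is an illustration, not how CloudQuery is implemented: the sample pages are hypothetical data shaped like a paginated DescribeInstances response, SQLite stands in for PostgreSQL, and the table and function names are invented for the example.

```python
import sqlite3

# Hypothetical sample data shaped like paginated DescribeInstances output.
SAMPLE_PAGES = [
    {"Reservations": [{"Instances": [
        {"InstanceId": "i-0aaa", "InstanceType": "t3.micro",
         "State": {"Name": "running"}, "PublicIpAddress": "203.0.113.10"},
        {"InstanceId": "i-0bbb", "InstanceType": "m5.large",
         "State": {"Name": "running"}},
    ]}]},
    {"Reservations": [{"Instances": [
        {"InstanceId": "i-0ccc", "InstanceType": "t3.micro",
         "State": {"Name": "stopped"}},
    ]}]},
]


def extract_rows(pages):
    """Extract: yield one flat row per instance across all pages."""
    for page in pages:
        for reservation in page.get("Reservations", []):
            for inst in reservation.get("Instances", []):
                yield (inst["InstanceId"], inst["InstanceType"],
                       inst["State"]["Name"], inst.get("PublicIpAddress"))


def load_rows(conn, rows):
    """Load: upsert rows keyed on the provider's own instance ID."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ec2_instances ("
        "instance_id TEXT PRIMARY KEY, instance_type TEXT, "
        "state_name TEXT, public_ip_address TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO ec2_instances VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def public_running(conn):
    """Query: running instances that have a public IP address."""
    cur = conn.execute(
        "SELECT instance_id FROM ec2_instances "
        "WHERE state_name = 'running' AND public_ip_address IS NOT NULL "
        "ORDER BY instance_id"
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_rows(conn, extract_rows(SAMPLE_PAGES))
    print(public_running(conn))
```

Because the primary key is the provider's instance ID, rerunning the load overwrites rows in place rather than creating duplicates that need merging.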
This works because APIs like DescribeInstances return current state in under a second. No discovery schedules. No agent installation. No reconciliation needed: the API is the source of truth. No duplicates to merge. Standard SQL means any analyst can query without special training. It scales naturally whether you have 10 instances or 10,000.
Here's what a CloudQuery configuration looks like:
# configuration.yml
kind: source
spec:
  name: aws
  path: cloudquery/aws
  destinations: [postgresql]
  tables:
    - aws_ec2_instances
    - aws_rds_instances
    - aws_s3_buckets
Run cloudquery sync configuration.yml and the tables listed above are synced to PostgreSQL (a destination spec named postgresql must also be configured). Query with standard SQL. Get started in 5 minutes.
Time to value: hours. Not months.
Part 4: Data Model Reality #
Traditional CMDBs force cloud resources into templates designed for physical servers. It doesn't work.
Traditional CMDB Forces Everything into CIs #
Take a Server CI. It captures hostname, IP address, OS—basic stuff. But it misses cloud-specific attributes: instance type, VPC, security groups, IAM role, tags, EBS volumes, network interfaces. An AWS EC2 instance has 50+ attributes. Traditional CI templates might capture 10.
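The attribute loss is easy to see when a cloud record is projected onto a fixed template. In this sketch both the three-field Server CI template and the EC2 record are hypothetical and far smaller than real ones; the point is only that every key outside the template is silently dropped.

```python
# Hypothetical Server CI template: the three fields named in the text above.
SERVER_CI_FIELDS = ("hostname", "ip_address", "os")

# Hypothetical EC2 record carrying a few cloud-specific attributes.
EC2_RECORD = {
    "hostname": "ip-10-0-1-23",
    "ip_address": "10.0.1.23",
    "os": "Amazon Linux 2023",
    "instance_type": "m5.large",
    "vpc_id": "vpc-0123",
    "security_groups": ["sg-0abc"],
    "iam_instance_profile": "web-role",
    "tags": {"Environment": "production"},
}


def to_server_ci(resource):
    """Project a resource onto the CI template and report what was dropped."""
    ci = {field: resource.get(field) for field in SERVER_CI_FIELDS}
    dropped = sorted(key for key in resource if key not in SERVER_CI_FIELDS)
    return ci, dropped


if __name__ == "__main__":
    ci, dropped = to_server_ci(EC2_RECORD)
    print("kept:   ", sorted(ci))
    print("dropped:", dropped)
```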
Application CIs were designed for monolithic apps on physical servers. They can't represent containerized microservices, serverless functions, or service meshes. Database CIs have basic fields like database name, version, size. They miss replication topology, automatic backups, parameter groups, subnet groups—everything that makes RDS different from a MySQL server you rack yourself.
Relationship CIs require manual configuration to link resources. This breaks constantly when resources change dynamically. Want custom CI types for cloud resources? That requires expensive professional services and months of customization work.
Cloud-Native Uses Provider Data Models #
Cloud-native approaches skip the translation layer. An EC2 instance table includes all 50+ native attributes: instance_id, instance_type, vpc_id, subnet_id, security_groups, iam_instance_profile, public_ip_address, private_ip_address, tags, state_name, launch_time, availability_zone, ebs_optimized, monitoring_state, network_interfaces, and everything else AWS provides.
RDS clusters include replication topology. Lambda functions include environment variables and permissions. S3 buckets include versioning, encryption, lifecycle policies. No generic templates. No lost attributes. Just the actual infrastructure state. Explore CloudQuery's AWS integration to see the full schema.
Querying Infrastructure: Proprietary vs SQL #
Traditional CMDB queries look like this:
-- Proprietary query language
SELECT name, ci_type, status
FROM cmdb_ci
WHERE ci_type = 'Server'
You get basic info. Cloud-specific attributes are missing.
Cloud CMDB queries use standard SQL:
-- Standard PostgreSQL
SELECT
instance_id,
instance_type,
vpc_id,
public_ip_address,
tags
FROM aws_ec2_instances
WHERE state_name = 'running'
AND tags->>'Environment' = 'production'
AND public_ip_address IS NOT NULL;
You get actual cloud resource attributes. Filter by tags, network configuration, state—everything AWS tracks.
Part 5: Operational Differences #
Daily operations reveal the architectural gap.
What Traditional CMDB Operations Look Like #
Your day starts at 6 a.m. when the scheduled discovery scan begins. By 8 a.m., reconciliation identifies 200 duplicates. You spend until 10 a.m. manually merging them. The Change Advisory Board reviews updates at 2 p.m. By 4 p.m., discovery still shows last week's decommissioned server. By 6 p.m., your data is 12 hours stale.
Monthly, you spend 40 hours maintaining discovery schedules. You reconcile duplicates every week. You update custom CI types when cloud services change. You still miss ephemeral resources. Your data is still stale.
What Cloud CMDB Operations Look Like #
APIs return current state on every sync. Syncs run every hour (or on demand). No reconciliation needed. No manual intervention required. Data is current within the sync window, which is configurable from minutes to hours.
Monthly, you spend under 5 hours on maintenance. No duplicate reconciliation. APIs automatically include new attributes. You capture every resource that exists at sync time, including short-lived ones. Data stays current to the last sync.
Example Scenario: Security Incident #
Your security team needs "all public-facing servers with SSH open to 0.0.0.0/0" right now.
With a traditional CMDB, you run a discovery scan (2 hours). Wait for reconciliation (30 minutes). Export to Excel (manual). Filter for public IPs (manual). Cross-reference security group data if the CMDB even captured it. By the time you export, the data is already outdated. Total time: 3+ hours. Data stale. Incomplete.
With a cloud CMDB, you run this query:
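SELECT
  i.instance_id,
  i.public_ip_address,
  sg.group_id,
  sg.ip_permissions
FROM aws_ec2_instances i
JOIN aws_ec2_security_groups sg
  -- Assumes security_groups holds an array of group IDs
  ON sg.group_id = ANY(i.security_groups)
WHERE i.public_ip_address IS NOT NULL
  -- SSH is TCP port 22; IpProtocol "-1" means all protocols, which also exposes SSH
  AND (
    sg.ip_permissions @> '[{"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]'::jsonb
    OR sg.ip_permissions @> '[{"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]'::jsonb
  );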
Total time: under a second. Data current. Complete.
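One caveat on matching rules this way: a JSONB containment match only finds rules whose fields equal the pattern exactly, so a rule that opens a port range such as 0-1024 to the world would slip past it. A range-aware check could look like the sketch below. It assumes `ip_permissions` has the shape the EC2 API returns (`IpProtocol`, `FromPort`, `ToPort`, `IpRanges`), and `allows_world_ssh` is a helper invented for this example.

```python
def allows_world_ssh(ip_permissions, port=22, cidr="0.0.0.0/0"):
    """Return True if any rule opens `port` (TCP or all protocols) to `cidr`."""
    for rule in ip_permissions:
        open_to_cidr = any(r.get("CidrIp") == cidr for r in rule.get("IpRanges", []))
        if not open_to_cidr:
            continue
        protocol = rule.get("IpProtocol")
        if protocol == "-1":  # all protocols, all ports
            return True
        # Treat a missing bound as unbounded so malformed rules fail open to review.
        if protocol == "tcp" and rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535):
            return True
    return False


if __name__ == "__main__":
    wide_range = [{"IpProtocol": "tcp", "FromPort": 0, "ToPort": 1024,
                   "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]
    https_only = [{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
                   "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]
    print("0-1024 open to world:", allows_world_ssh(wide_range))
    print("443 open to world:   ", allows_world_ssh(https_only))
```

The same range logic can be expressed in SQL by expanding the rule array into rows first; the Python version is shown only because it is easier to test in isolation.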
Part 6: Use Case Evolution #
What you can do with infrastructure data has changed completely.
In 2006, CMDBs tracked assets for financial audits, documented Change Advisory Board decisions, generated ITIL compliance reports, managed software licenses, and supported disaster recovery planning. Everything focused on tracking permanent assets for governance and audit.
In 2026, cloud CMDBs answer real-time security questions. Find all unencrypted S3 buckets across 100 AWS accounts:
SELECT account_id, bucket_name, region, encryption_configuration
FROM aws_s3_buckets
WHERE encryption_configuration IS NULL;
They enable cost optimization. Identify unused EBS volumes:
SELECT volume_id, size, volume_type, attachments, create_time
FROM aws_ebs_volumes
WHERE attachments = '[]'::jsonb
AND create_time < NOW() - INTERVAL '30 days';
They support automated compliance. Flag instances that are missing required tags:
SELECT instance_id, tags, security_groups
FROM aws_ec2_instances
WHERE NOT tags ? 'Owner' OR NOT tags ? 'Environment' OR NOT tags ? 'CostCenter';
They provide training data for AI and ML. Feed infrastructure state to predictive models. Train on resource usage patterns. Predict scaling needs before they occur.
They validate GitOps. Verify Terraform state matches actual resources:
WITH terraform_instances AS (
SELECT unnest(ARRAY['i-abc123', 'i-def456', 'i-ghi789']) AS instance_id
)
SELECT t.instance_id
FROM terraform_instances t
LEFT JOIN aws_ec2_instances a ON t.instance_id = a.instance_id
WHERE a.instance_id IS NULL;
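The query above checks one direction only: IDs that Terraform declares but the cloud no longer has. Drift also runs the other way, with resources that exist but are not under Terraform management. A minimal sketch of the two-way comparison, using made-up instance IDs and a hypothetical `diff_ids` helper:

```python
def diff_ids(declared, actual):
    """Compare declared (Terraform) IDs with actual (synced) IDs in both directions."""
    declared_set, actual_set = set(declared), set(actual)
    return {
        "missing": sorted(declared_set - actual_set),    # declared, not found in the cloud
        "unmanaged": sorted(actual_set - declared_set),  # in the cloud, not declared
    }


if __name__ == "__main__":
    terraform_ids = ["i-abc123", "i-def456", "i-ghi789"]
    synced_ids = ["i-abc123", "i-def456", "i-zzz000"]
    print(diff_ids(terraform_ids, synced_ids))
```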
They accelerate incident response. Show everything that changed in the last hour, assuming your pipeline records changes in a log table such as cloudquery_sync_log:
SELECT table_name, action, changed_at, resource_id
FROM cloudquery_sync_log
WHERE changed_at > NOW() - INTERVAL '1 hour'
ORDER BY changed_at DESC;
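A change log like the one queried above has to come from somewhere. One way to derive it, sketched here under the assumption that you keep the previous sync's snapshot around, is to compare two snapshots keyed by resource ID. The snapshot shape and the `diff_snapshots` name are hypothetical.

```python
def diff_snapshots(previous, current):
    """Compare two {resource_id: attributes} snapshots and list the changes."""
    changes = []
    for resource_id in sorted(current.keys() - previous.keys()):
        changes.append(("created", resource_id))
    for resource_id in sorted(previous.keys() - current.keys()):
        changes.append(("deleted", resource_id))
    for resource_id in sorted(current.keys() & previous.keys()):
        if current[resource_id] != previous[resource_id]:
            changes.append(("modified", resource_id))
    return changes


if __name__ == "__main__":
    last_sync = {
        "i-0aaa": {"instance_type": "t3.micro", "state": "running"},
        "i-0bbb": {"instance_type": "m5.large", "state": "running"},
    }
    this_sync = {
        "i-0aaa": {"instance_type": "t3.small", "state": "running"},
        "i-0ccc": {"instance_type": "t3.micro", "state": "running"},
    }
    for action, resource_id in diff_snapshots(last_sync, this_sync):
        print(action, resource_id)
```

A diff of this kind only sees changes that survive until the next sync; anything created and destroyed between two syncs leaves no trace in either snapshot.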
Traditional CMDBs can't handle these use cases. The use cases need real-time data, not daily scans. They need cloud-native attributes, not generic CIs. They need SQL querying, not proprietary UIs. They need continuous updates, not scheduled discovery.
Part 7: What Are Traditional Vendors Doing? #
Traditional CMDB vendors see the shift. Their responses reveal how hard it is to retrofit old architecture for cloud infrastructure.
They're releasing "Cloud Discovery Modules," but these are still agent-based at the core, still require MID Servers for cloud API calls, still capture limited cloud attributes, are still expensive add-ons to existing licenses, and still take months to implement.
They're adding "API Integrations," but these are bolt-on connectors that support only basic attributes (security groups, tags, and policies are often missing), still force resources into CI types through proprietary data models, and break when cloud providers add new services.
They're rebranding products as "Agile CMDB"—but it's the same 20-year-old product with the same architecture (agents, discovery, reconciliation), same data model (Configuration Items), and same implementation timeline (months).
They're promoting "AI-Powered Discovery"—but AI can't fix architectural problems. Still requires agents and discovery schedules. Still uses proprietary data models. AI adds cost without addressing the fundamental issues.
Why do these fail? Bolt-on solutions can't fix fundamental architecture. Agent-based discovery doesn't work for ephemeral infrastructure. MID Servers create bottlenecks and single points of failure. Scheduled scans can't capture real-time cloud changes. Proprietary data models can't represent cloud-native resources—ITIL Configuration Items were designed for physical servers, and cloud resources have 50+ attributes that don't fit CI templates.
Traditional vendors optimize for perpetual licenses and professional services. Long implementation timelines generate consulting revenue. Complex products require ongoing support contracts. Switching costs keep customers locked in.
What actually works: API-first design that calls cloud provider APIs directly with no agents or MID Servers. Cloud-native data models that use provider schemas and capture all resource attributes with no CI translation layer. SQL-based querying using standard SQL with no proprietary query language and no vendor lock-in.
Stop Forcing Cloud Into 20-Year-Old Models #
Traditional CMDBs solved a real problem in 2006: tracking permanent infrastructure in on-premises data centers. That world no longer exists.
In 2026, infrastructure is code. Resources are ephemeral. Changes are continuous. APIs provide real-time state.
Gartner reports 70-80% of CMDB projects fail to deliver value. The architecture can't handle modern cloud infrastructure. Agent-based discovery can't capture ephemeral resources. Proprietary data models can't represent cloud-native attributes. Scheduled scans can't provide real-time security posture.
Stop scheduling discovery scans for infrastructure that changes every second. Stop forcing EC2 instances into Server CI templates that miss 40+ critical attributes. Stop paying six figures for systems that can't answer basic questions about your cloud environment.
Start treating infrastructure as data. Call APIs directly using tools like CloudQuery. Store results in SQL databases. Query with standard SQL. Get answers in seconds, not hours.
The technology exists. The architecture works. The implementation can be done in a few hours.
Next Steps:
- Try CloudQuery locally - sync your AWS account to PostgreSQL in 5 minutes
- Browse CloudQuery Hub - explore 500+ integrations
- Explore solutions - security, cost optimization, compliance use cases
- Join CloudQuery Community - connect with engineers solving similar problems
Frequently Asked Questions #
What is a Cloud CMDB?
A Cloud CMDB uses cloud provider APIs to continuously sync infrastructure data into a queryable SQL database. Unlike traditional CMDBs with agents and scheduled discovery, cloud CMDBs call APIs like AWS DescribeInstances to get real-time resource state in seconds. CloudQuery supports 500+ integrations.
What is the difference between traditional CMDB and cloud CMDB?
Traditional CMDBs use agent-based discovery on schedules, force resources into ITIL Configuration Items, and take months to implement. Cloud CMDBs use API-based continuous syncs, store native cloud resource attributes, and take hours to implement. Gartner's 2024 research shows traditional CMDBs have 70-80% failure rates.
Why do traditional CMDBs fail in the cloud?
They rely on agent-based discovery that can't capture ephemeral resources, use ITIL data models that can't represent cloud-native attributes (EC2 instances have 50+ attributes vs 10 in traditional CIs), and scheduled scans can't provide real-time state for infrastructure that changes every second.
Can you query a cloud CMDB with SQL?
Yes. Cloud CMDBs like CloudQuery store infrastructure data in standard SQL databases. Query with standard SQL to find security issues, cost optimization opportunities, or compliance violations. Traditional CMDBs use proprietary query languages. See CloudQuery documentation for examples.
Can I migrate from traditional CMDB to cloud CMDB?
Yes. Run them in parallel initially. Set up CloudQuery to sync your AWS resources, validate data accuracy, then gradually shift queries and workflows. Migration typically takes days to weeks. Join the CloudQuery community to connect with teams who have migrated.
What cloud providers does CloudQuery support?
CloudQuery supports 500+ integrations including AWS, GCP, Azure, Kubernetes, GitHub, Terraform, and hundreds of SaaS platforms. Browse the CloudQuery Hub to see all supported providers.
How often does a cloud CMDB sync data?
Cloud CMDBs sync continuously on configurable schedules. CloudQuery can sync every few minutes, hourly, or on-demand. Each sync queries cloud provider APIs and updates the database with current state. Unlike traditional CMDBs that scan daily and reconcile duplicates, cloud CMDB syncs complete in minutes with no reconciliation needed.