Centralize Multi-Cloud Cost and Asset Data in BigQuery: A Step-by-Step Guide

Picture this: you have AWS resources spread across 15 accounts, alongside a handful of GCP projects and Azure subscriptions for specific workloads. Each cloud provider has its own cost reporting tool — AWS Cost Explorer, GCP Billing, Azure Cost Management — and none of them talk to each other. When your CFO asks "What's our total cloud spend by team?", you need three spreadsheets, two pivot tables, and an hour to compile an answer.
The problem is that cost and asset data lives in silos. Multi-cloud organizations need a single source of truth for financial and operational visibility that they can tap into whenever it's required. The solution is to centralize cost and asset data from AWS, GCP, and Azure into BigQuery using CloudQuery, then query everything with SQL.

What You'll Build #

By the end of this guide, you'll have a unified multi-cloud asset inventory in BigQuery that includes compute, storage, networking, and database resources. You'll integrate cost data from all three major cloud providers and write SQL queries for cross-cloud cost analysis and resource correlation. This setup becomes the foundation for FinOps dashboards and automated reporting.

Prerequisites #

You'll need AWS, GCP, and/or Azure accounts with billing data access (if you don't have them all, simply skip over the relevant section). Make sure you have a GCP project with BigQuery enabled. You'll also need service accounts and credentials for each cloud provider, along with basic SQL knowledge and IAM permissions to read billing data and resources. We'll install the CloudQuery CLI in the first step.

Step 1: Install CloudQuery CLI #

Start by installing the CloudQuery CLI on your system. Check the Quickstart Guide if you need help.
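For example, on macOS you can install the CLI through CloudQuery's Homebrew tap and confirm the install (see the Quickstart Guide for Linux, Windows, and Docker options):
brew install cloudquery/tap/cloudquery
cloudquery --version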

Step 2: Set Up a BigQuery Destination #

Before syncing data, you need to configure BigQuery as your destination. Start by creating a GCP service account specifically for CloudQuery with the appropriate BigQuery permissions.
Create the service account:
gcloud iam service-accounts create cloudquery-sync \
    --display-name="CloudQuery Sync Service Account"
Grant BigQuery permissions to the service account:
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
    --member="serviceAccount:cloudquery-sync@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/bigquery.dataEditor"

gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
    --member="serviceAccount:cloudquery-sync@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/bigquery.jobUser"
Generate a JSON key file for authentication:
gcloud iam service-accounts keys create cloudquery-bigquery-key.json \
    --iam-account=cloudquery-sync@YOUR_PROJECT_ID.iam.gserviceaccount.com
Next, create a BigQuery dataset to store your multi-cloud data. You can do this through the BigQuery console or by running SQL:
CREATE SCHEMA multi_cloud_data
OPTIONS (
  location = 'US',
  description = 'Multi-cloud asset and cost data from CloudQuery'
);
Create a CloudQuery destination configuration file called bigquery-destination.yml:
kind: destination
spec:
  name: bigquery
  path: cloudquery/bigquery
  registry: cloudquery
  version: 'v4.6.3'
  write_mode: 'append'
  spec:
    project_id: 'YOUR_PROJECT_ID'
    dataset_id: 'multi_cloud_data'
    time_partitioning: 'hour'
The write_mode: "append" setting preserves historical data, which is critical for tracking costs over time and analyzing deleted resources. Time partitioning uses the _cq_sync_time field automatically, which helps optimize queries by limiting how much data BigQuery scans.
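For example, filtering on _cq_sync_time lets BigQuery prune partitions instead of scanning a table's full history. Here's a minimal sketch against the aws_ec2_instances table you'll sync in the next step:
SELECT instance_id, instance_type, region
FROM `YOUR_PROJECT_ID.multi_cloud_data.aws_ec2_instances`
WHERE _cq_sync_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY);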

Step 3: Configure AWS Source Plugin #

Setting up AWS requires creating IAM credentials with read-only access to your resources and billing data. Start by creating an IAM policy document that grants the necessary permissions.
Create a file called cloudquery-policy.json:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:Describe*",
        "s3:List*",
        "s3:GetBucketLocation",
        "s3:GetBucketPolicy",
        "rds:Describe*",
        "lambda:List*",
        "lambda:Get*",
        "iam:List*",
        "iam:Get*",
        "ce:GetCostAndUsage",
        "ce:GetCostForecast",
        "cur:DescribeReportDefinitions"
      ],
      "Resource": "*"
    }
  ]
}
Create a dedicated IAM user and attach the policy:
aws iam create-user --user-name cloudquery-sync
aws iam put-user-policy --user-name cloudquery-sync \
    --policy-name CloudQueryReadOnly \
    --policy-document file://cloudquery-policy.json
aws iam create-access-key --user-name cloudquery-sync
The last command outputs an access key ID and secret access key. Save these credentials securely. Configure your AWS CLI with these credentials:
aws configure --profile cloudquery
Enter the access key ID and secret access key when prompted. This creates a named profile that CloudQuery will reference.
Create an AWS source configuration file called aws-source.yml:
kind: source
spec:
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: 'v32.58.1'
  destinations: ['bigquery']
  tables:
    - 'aws_ec2_instances'
    - 'aws_ec2_ebs_volumes'
    - 'aws_ec2_ebs_snapshots'
    - 'aws_s3_buckets'
    - 'aws_rds_instances'
    - 'aws_rds_clusters'
    - 'aws_lambda_functions'
    - 'aws_iam_users'
    - 'aws_iam_roles'
    - 'aws_costexplorer_cost_*'
  spec:
    aws_debug: false
    accounts:
      - id: '123456789012'
        local_profile: 'cloudquery'
      - id: '987654321098'
        local_profile: 'cloudquery-account-2'
    regions:
      - 'us-east-1'
      - 'us-west-2'
      - 'eu-west-1'
You can list multiple AWS accounts by adding entries to the accounts array. Each account references an AWS CLI profile. If you have many accounts, you can also use AWS Organizations to automatically discover member accounts.
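As a rough sketch, organization-wide discovery replaces the accounts list in the inner spec with an org block (field names follow the AWS source plugin's organization support; confirm them against the plugin docs for the version you pin):
spec:
  org:
    admin_account:
      local_profile: 'cloudquery'
    member_role_name: 'OrganizationAccountAccessRole'
  regions:
    - 'us-east-1'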
For detailed billing data beyond what Cost Explorer provides, enable AWS Cost and Usage Report. Navigate to the AWS Billing Console, go to Cost & Usage Reports, and click "Create report". Name it something like cloudquery-cur, enable "Include resource IDs", set time granularity to Hourly, and enable report data integration for Amazon Athena. Choose an S3 bucket for delivery.
Once the Cost and Usage Report is configured, update your AWS configuration to include CUR tables:
tables:
  - 'aws_costexplorer_cost_*'
  - 'aws_cur_*'
Run your first AWS sync:
cloudquery sync aws-source.yml bigquery-destination.yml
CloudQuery connects to the AWS APIs using the credentials you provided, calls methods like DescribeInstances, ListBuckets, and GetCostAndUsage, then streams the results to BigQuery. The sync duration depends on how many resources you have. A typical account with a few hundred resources syncs in under five minutes.
What gets synced? With the tables listed above, CloudQuery extracts compute resources like EC2 instances and EBS volumes, storage resources like S3 buckets and EBS snapshots, databases like RDS instances and clusters, serverless resources like Lambda functions, IAM entities like users and roles, and cost data from both Cost Explorer and Cost and Usage Reports. Each resource type becomes a table in BigQuery with the prefix aws_.

Step 4: Configure the GCP Source Plugin #

GCP setup involves creating a service account for CloudQuery to read your GCP resources and enabling native billing export to BigQuery.
Create a service account for resource access:
gcloud iam service-accounts create cloudquery-reader \
    --display-name="CloudQuery Reader Service Account"
Grant the viewer role to read resource metadata:
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
    --member="serviceAccount:cloudquery-reader@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/viewer"
Grant billing viewer role for cost data:
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
    --member="serviceAccount:cloudquery-reader@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/billing.viewer"
Generate a JSON key file:
gcloud iam service-accounts keys create cloudquery-gcp-key.json \
    --iam-account=cloudquery-reader@YOUR_PROJECT_ID.iam.gserviceaccount.com
Now enable GCP Billing Export to BigQuery. This is different from CloudQuery — GCP natively exports billing data to BigQuery, so you don't need CloudQuery to sync it. Navigate to Billing in the GCP Console, click Billing Export, and enable "Detailed usage cost" export. Select the multi_cloud_data dataset (the same one CloudQuery uses). GCP will create a table with a name like gcp_billing_export_resource_v1_XXXXXX that contains resource-level billing data.
Create a GCP source configuration file called gcp-source.yml:
kind: source
spec:
  name: gcp
  path: cloudquery/gcp
  registry: cloudquery
  version: 'v19.13.4'
  destinations: ['bigquery']
  tables:
    - 'gcp_compute_instances'
    - 'gcp_compute_disks'
    - 'gcp_compute_snapshots'
    - 'gcp_storage_buckets'
    - 'gcp_cloudsql_instances'
    - 'gcp_run_services'
    - 'gcp_functions_functions'
    - 'gcp_iam_service_accounts'
  spec:
    project_ids:
      - 'project-id-1'
      - 'project-id-2'
      - 'project-id-3'
    service_account_key_json: './cloudquery-gcp-key.json'
List all the GCP project IDs you want to sync. CloudQuery will query each project in parallel to extract resource metadata.
Run the GCP sync:
cloudquery sync gcp-source.yml bigquery-destination.yml
CloudQuery extracts metadata from all specified GCP projects, syncing Compute Engine instances and disks, Cloud Storage buckets, Cloud SQL databases, Cloud Run services, Cloud Functions, and IAM service accounts. GCP's native billing data appears in the billing export table in the same dataset, ready to join with CloudQuery's resource tables.
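To sanity-check the two data paths, you can already join them on project ID. Here's a minimal sketch that compares 30-day spend per project (from the native billing export) with the number of running instances CloudQuery found; the wildcard table name assumes the default export naming shown above, and project_id is the column CloudQuery adds to its GCP tables:
WITH project_costs AS (
  SELECT
    project.id as project_id,
    ROUND(SUM(cost), 2) as cost_30d
  FROM `YOUR_PROJECT_ID.multi_cloud_data.gcp_billing_export_resource_v1_*`
  WHERE DATE(_PARTITIONTIME) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
  GROUP BY project_id
),
instance_counts AS (
  SELECT project_id, COUNT(*) as running_instances
  FROM `YOUR_PROJECT_ID.multi_cloud_data.gcp_compute_instances`
  WHERE status = 'RUNNING'
  GROUP BY project_id
)
SELECT
  c.project_id,
  c.cost_30d,
  COALESCE(i.running_instances, 0) as running_instances
FROM project_costs c
LEFT JOIN instance_counts i USING (project_id)
ORDER BY cost_30d DESC;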

Step 5: Configure the Azure Source Plugin #

For Azure, you need to create a service principal with read access to your subscriptions and cost data.
Create a service principal:
az login

az ad sp create-for-rbac --name "cloudquery-reader" \
    --role "Reader" \
    --scopes /subscriptions/YOUR_SUBSCRIPTION_ID
This command outputs JSON with appId (your client ID), password (your client secret), and tenant (your tenant ID). Save these values.
Grant Cost Management Reader role for billing data:
az role assignment create \
    --assignee YOUR_APP_ID_FROM_ABOVE \
    --role "Cost Management Reader" \
    --scope /subscriptions/YOUR_SUBSCRIPTION_ID
Create an Azure source configuration file called azure-source.yml:
kind: source
spec:
  name: azure
  path: cloudquery/azure
  registry: cloudquery
  version: 'v17.15.3'
  destinations: ['bigquery']
  tables:
    - 'azure_compute_virtual_machines'
    - 'azure_compute_disks'
    - 'azure_storage_accounts'
    - 'azure_storage_blob_containers'
    - 'azure_network_virtual_networks'
    - 'azure_network_security_groups'
    - 'azure_sql_servers'
    - 'azure_sql_databases'
    - 'azure_costmanagement_costs'
  spec:
    subscriptions:
      - 'subscription-id-1'
      - 'subscription-id-2'
    client_id: 'YOUR_CLIENT_ID'
    client_secret: 'YOUR_CLIENT_SECRET'
    tenant_id: 'YOUR_TENANT_ID'
Run the Azure sync:
cloudquery sync azure-source.yml bigquery-destination.yml
CloudQuery syncs compute resources like virtual machines and managed disks, storage accounts and blob containers, networking components like virtual networks and network security groups, SQL servers and databases, and cost data from Azure Cost Management. Azure resources get the azure_ table prefix in BigQuery.

Step 6: Create a Combined Configuration File #

Instead of running three separate sync commands, you can combine all sources into a single configuration file. This lets you run the entire multi-cloud sync with one command and keeps all your configuration in one place.
Create a file called multi-cloud-sync.yml:
kind: source
spec:
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: 'v32.58.1'
  destinations: ['bigquery']
  tables: ['aws_*']
  spec:
    accounts:
      - id: '123456789012'
        local_profile: 'cloudquery'

---
kind: source
spec:
  name: gcp
  path: cloudquery/gcp
  registry: cloudquery
  version: 'v19.13.4'
  destinations: ['bigquery']
  tables: ['gcp_*']
  spec:
    project_ids: ['project-1', 'project-2']
    service_account_key_json: './cloudquery-gcp-key.json'

---
kind: source
spec:
  name: azure
  path: cloudquery/azure
  registry: cloudquery
  version: 'v17.15.3'
  destinations: ['bigquery']
  tables: ['azure_*']
  spec:
    subscriptions: ['sub-1', 'sub-2']
    client_id: '${AZURE_CLIENT_ID}'
    client_secret: '${AZURE_CLIENT_SECRET}'
    tenant_id: '${AZURE_TENANT_ID}'

---
kind: destination
spec:
  name: bigquery
  path: cloudquery/bigquery
  registry: cloudquery
  version: 'v4.6.3'
  write_mode: 'append'
  spec:
    project_id: 'YOUR_PROJECT_ID'
    dataset_id: 'multi_cloud_data'
    time_partitioning: 'hour'
Notice the ${AZURE_CLIENT_SECRET} syntax. This tells CloudQuery to read the value from an environment variable, which keeps credentials out of your configuration files. Set the environment variables before running the sync:
export AZURE_CLIENT_ID="your-client-id"
export AZURE_CLIENT_SECRET="your-client-secret"
export AZURE_TENANT_ID="your-tenant-id"
Run the unified sync:
cloudquery sync multi-cloud-sync.yml
CloudQuery runs all sources in parallel, populating your BigQuery dataset with data from AWS, GCP, and Azure simultaneously. You'll see progress bars for each source as it syncs resources.

Step 7: Query Your Multi-Cloud Data #

Once the sync completes, you can start querying your multi-cloud data in BigQuery. Open the BigQuery console and navigate to your multi_cloud_data dataset. You should see dozens of tables with prefixes like aws_, gcp_, and azure_.
Verify data synced correctly:
SELECT table_id, row_count, size_bytes
FROM `YOUR_PROJECT_ID.multi_cloud_data.__TABLES__`
ORDER BY table_id;
This shows all tables in your dataset along with how many rows and bytes each contains. You should see tables like aws_ec2_instances, gcp_compute_instances, and azure_compute_virtual_machines.
Count resources per cloud provider:
SELECT
  CASE
    WHEN table_id LIKE 'aws_%' THEN 'AWS'
    WHEN table_id LIKE 'gcp_%' THEN 'GCP'
    WHEN table_id LIKE 'azure_%' THEN 'Azure'
  END as cloud_provider,
  COUNT(*) as table_count
FROM `YOUR_PROJECT_ID.multi_cloud_data.__TABLES__`
GROUP BY cloud_provider;
Now let's write queries that span all three clouds. Start with a unified view of compute resources:
WITH aws_compute AS (
  SELECT
    'AWS' as cloud_provider,
    instance_id as resource_id,
    instance_type as resource_type,
    region,
    JSON_VALUE(tags, '$.Name') as name,
    JSON_VALUE(tags, '$.Environment') as environment,
    state_name as state
  FROM `YOUR_PROJECT_ID.multi_cloud_data.aws_ec2_instances`
  WHERE state_name = 'running'
),
gcp_compute AS (
  SELECT
    'GCP' as cloud_provider,
    CAST(id AS STRING) as resource_id,
    machine_type as resource_type,
    zone as region,
    name,
    (SELECT value FROM UNNEST(labels) WHERE key = 'environment' LIMIT 1) as environment,
    status as state
  FROM `YOUR_PROJECT_ID.multi_cloud_data.gcp_compute_instances`
  WHERE status = 'RUNNING'
),
azure_compute AS (
  SELECT
    'Azure' as cloud_provider,
    id as resource_id,
    vm_size as resource_type,
    location as region,
    name,
    JSON_VALUE(tags, '$.Environment') as environment,
    power_state as state
  FROM `YOUR_PROJECT_ID.multi_cloud_data.azure_compute_virtual_machines`
  WHERE power_state = 'running'
)

SELECT * FROM aws_compute
UNION ALL
SELECT * FROM gcp_compute
UNION ALL
SELECT * FROM azure_compute
ORDER BY cloud_provider, name;
This query creates a common table expression (CTE) for each cloud that normalizes the fields into a consistent structure. AWS uses instance_id while GCP and Azure use id, so we alias them all to resource_id. We extract environment tags from each cloud's native tagging format, then combine everything with UNION ALL.
Find unattached storage volumes across clouds:
SELECT
  'AWS' as cloud,
  volume_id as resource_id,
  size as size_gb,
  volume_type,
  availability_zone as location,
  create_time,
  JSON_VALUE(tags, '$.Name') as name
FROM `YOUR_PROJECT_ID.multi_cloud_data.aws_ec2_ebs_volumes`
WHERE state = 'available'

UNION ALL

SELECT
  'GCP' as cloud,
  name as resource_id,
  size_gb,
  type as volume_type,
  zone as location,
  creation_timestamp as create_time,
  name
FROM `YOUR_PROJECT_ID.multi_cloud_data.gcp_compute_disks`
WHERE ARRAY_LENGTH(users) = 0

UNION ALL

SELECT
  'Azure' as cloud,
  name as resource_id,
  disk_size_gb as size_gb,
  sku_name as volume_type,
  location,
  time_created as create_time,
  name
FROM `YOUR_PROJECT_ID.multi_cloud_data.azure_compute_disks`
WHERE managed_by IS NULL

ORDER BY cloud, size_gb DESC;
Unattached volumes represent wasted spending. For AWS, we check where state = 'available', which means the volume exists but isn't attached to an instance. For GCP, we check where the users array is empty, meaning no VMs are using the disk. For Azure, we check where managed_by is null, meaning no VM owns the disk. Ordering by size shows you the biggest opportunities first.
Next, try writing a cross-cloud cost analysis query:
WITH aws_costs AS (
  SELECT
    'AWS' as cloud,
    JSON_VALUE(resource_tags, '$.Team') as team,
    JSON_VALUE(resource_tags, '$.Environment') as environment,
    SUM(line_item_blended_cost) as total_cost
  FROM `YOUR_PROJECT_ID.multi_cloud_data.aws_cur_line_items`
  WHERE DATE(line_item_usage_start_date) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
  GROUP BY team, environment
),
gcp_costs AS (
  SELECT
    'GCP' as cloud,
    (SELECT value FROM UNNEST(labels) WHERE key = 'team' LIMIT 1) as team,
    (SELECT value FROM UNNEST(labels) WHERE key = 'environment' LIMIT 1) as environment,
    SUM(cost) as total_cost
  FROM `YOUR_PROJECT_ID.multi_cloud_data.gcp_billing_export_resource_v1_*`
  WHERE DATE(_PARTITIONTIME) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
  GROUP BY team, environment
)

SELECT
  cloud,
  COALESCE(team, 'untagged') as team,
  COALESCE(environment, 'untagged') as environment,
  ROUND(total_cost, 2) as total_cost_usd
FROM aws_costs

UNION ALL

SELECT
  cloud,
  COALESCE(team, 'untagged') as team,
  COALESCE(environment, 'untagged') as environment,
  ROUND(total_cost, 2) as total_cost_usd
FROM gcp_costs

ORDER BY total_cost_usd DESC;
This query aggregates costs by team and environment tags across AWS and GCP. The COALESCE function handles resources without tags by labeling them "untagged", which helps identify resources that need better tagging for chargeback purposes. Add an Azure CTE when you have Azure Cost Management data available.
Find optimization opportunities like gp2 volumes that should migrate to gp3:
SELECT
  v.volume_id,
  v.size as size_gb,
  v.volume_type,
  v.availability_zone,
  v.create_time,
  JSON_VALUE(v.tags, '$.Name') as name,
  ROUND(v.size * 0.10, 2) as estimated_monthly_cost_gp2,
  ROUND(v.size * 0.08, 2) as estimated_monthly_cost_gp3,
  ROUND((v.size * 0.10) - (v.size * 0.08), 2) as monthly_savings
FROM `YOUR_PROJECT_ID.multi_cloud_data.aws_ec2_ebs_volumes` v
WHERE v.volume_type = 'gp2'
  AND v.state = 'in-use'
ORDER BY monthly_savings DESC;
This uses the pricing difference between gp2 ($0.10 per GB-month) and gp3 ($0.08 per GB-month) at us-east-1 list prices to calculate potential savings; adjust the rates for your regions. You can write similar queries to identify other optimization opportunities like idle EC2 instances, oversized RDS databases, or old EBS snapshots.
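For example, a first pass at stale snapshots older than 180 days could look like this (column names follow the EC2 snapshot API fields; verify them against the aws_ec2_ebs_snapshots schema in your dataset):
SELECT
  snapshot_id,
  volume_size as size_gb,
  start_time,
  JSON_VALUE(tags, '$.Name') as name
FROM `YOUR_PROJECT_ID.multi_cloud_data.aws_ec2_ebs_snapshots`
WHERE start_time < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 180 DAY)
ORDER BY start_time;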

Troubleshooting Common Issues #

AWS sync fails with "AccessDenied" errors #

Check that the IAM policy is attached to the cloudquery-sync user. Run aws configure list to verify that your AWS credentials are set correctly. Make sure the Cost Explorer API is enabled in your AWS billing preferences — some AWS accounts have it disabled by default. If you're syncing Cost and Usage Reports, verify that the IAM user has s3:GetObject permissions on the CUR S3 bucket.

GCP billing data not appearing in BigQuery #

Billing export takes up to 24 hours to start populating data after you enable it. Check your billing export settings in the GCP Console to confirm it's enabled and pointing to the correct dataset. Verify that the service account has the roles/billing.viewer permission. If you have multiple billing accounts, make sure you enabled export on the correct one.
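A quick way to confirm the export table has landed is to list matching tables in the dataset:
SELECT table_id
FROM `YOUR_PROJECT_ID.multi_cloud_data.__TABLES__`
WHERE table_id LIKE 'gcp_billing_export%';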

Azure sync times out #

Azure API rate limits are lower than AWS and GCP, which can cause timeouts when syncing large subscriptions. Reduce concurrency in your configuration by adding max_goroutines: 10 to the Azure spec section. You can also split subscriptions across multiple sync jobs to distribute the load. Run syncs during off-peak hours when API rate limits are less likely to be hit.

BigQuery tables not created #

Check that your service account has both bigquery.dataEditor and bigquery.jobUser roles. Verify that the dataset exists and is in the correct region — CloudQuery can't create tables in a dataset that doesn't exist. Run CloudQuery with debug logging enabled using cloudquery sync --log-level debug config.yml to see detailed error messages. Look for authentication errors or permission denied messages.

High BigQuery costs from queries #

Make sure you're using partitioning and clustering on frequently queried fields, and always add a partition filter to limit how much data your queries scan: filter on _cq_sync_time for CloudQuery tables, or use a DATE(_PARTITIONTIME) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) clause for the GCP billing export. Select specific columns instead of using SELECT * — BigQuery charges by data scanned, and selecting fewer columns reduces costs. Use the BigQuery query validator before running expensive queries to check estimated costs. For queries you run frequently, materialize the results into a table instead of recomputing from raw data.
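For example, you could materialize a 30-day AWS cost-by-team summary and refresh it with a BigQuery scheduled query instead of recomputing it from raw CUR data each time (the aws_cost_by_team_30d table name is just an illustration):
CREATE OR REPLACE TABLE `YOUR_PROJECT_ID.multi_cloud_data.aws_cost_by_team_30d` AS
SELECT
  COALESCE(JSON_VALUE(resource_tags, '$.Team'), 'untagged') as team,
  ROUND(SUM(line_item_blended_cost), 2) as total_cost_usd
FROM `YOUR_PROJECT_ID.multi_cloud_data.aws_cur_line_items`
WHERE DATE(line_item_usage_start_date) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY team;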

Sync takes longer than expected #

CloudQuery syncs are constrained by cloud provider API rate limits. Sync performance varies based on the number of resources, API throttling policies, and network conditions. If syncs are taking multiple hours, consider filtering to sync only the tables you need instead of using wildcards like aws_*. You can also run multiple CloudQuery instances in parallel, each syncing different accounts or regions.

Moving Forward: From Data to Insights #

You now have a unified multi-cloud asset and cost database in BigQuery. This foundation enables data-driven cloud financial management with a single source of truth for cloud spending across AWS, GCP, and Azure. You have SQL-based access for custom cost analysis and optimization queries, infrastructure metadata correlated with costs for detailed analysis, and a scalable data warehouse for long-term trend analysis and forecasting.
The next step is operationalizing this data. Build FinOps dashboards in Looker or Data Studio to visualize spending trends across clouds. Set up automated alerts for cost anomalies using BigQuery scheduled queries that detect unusual spending patterns. Create chargeback reports for internal teams that allocate costs based on tags and labels. CloudQuery's open architecture means you control the data pipeline end-to-end, from extraction through storage to visualization.
The queries in this guide are starting points. You can extend them to analyze reserved instance coverage, track savings plan utilization, identify resources without proper tags, or correlate costs with application performance metrics. Join cost data with CloudQuery's security posture tables to understand the financial impact of security findings. Combine billing data with change management systems to see how deployments affect spending.
Ready to try it? Install CloudQuery and sync your first cloud to BigQuery in under 10 minutes. Join the CloudQuery community for implementation guidance from other users, or contact our team for enterprise support and custom integrations.

Frequently Asked Questions #

What is multi-cloud cost management? #

Multi-cloud cost management is the practice of tracking, analyzing, and optimizing spending across multiple cloud providers like AWS, GCP, and Azure from a single interface. Traditional approaches use separate tools per provider, which leads to data silos. Centralizing cost data in BigQuery with CloudQuery provides a unified SQL interface for cross-cloud analysis, custom dashboards, and automated cost optimization workflows. You can answer questions like "What's our total cloud spend by team across all providers?" without compiling data from three different tools.

How long does it take to set up multi-cloud cost tracking with CloudQuery? #

Initial setup takes about one to two hours total. You'll spend around 30 minutes installing the CloudQuery CLI and configuring credentials for each cloud provider. Writing configuration files for all three clouds takes another 30 to 60 minutes. The first sync runs in 15 to 30 minutes depending on your environment size. For a typical setup with 500 AWS resources, 200 GCP resources, and 100 Azure resources, expect the first sync to complete in under an hour. Subsequent syncs run faster because CloudQuery only processes changes when using incremental sync mode.

What cloud providers does CloudQuery support? #

CloudQuery supports over 500 integrations across cloud providers, SaaS platforms, and databases. Major cloud providers include AWS with support for over 1,000 services, Google Cloud Platform, Microsoft Azure, Oracle Cloud, Alibaba Cloud, and IBM Cloud. You can also sync data from Kubernetes, GitHub, Datadog, PagerDuty, Stripe, Salesforce, and hundreds of other platforms. The full list is available on CloudQuery Hub. Each integration is maintained by CloudQuery with regular updates for new APIs and services.

How often does CloudQuery sync data? #

Sync frequency depends entirely on your configuration. You can set up near real-time syncs that run hourly, standard syncs that run daily (most common for cost data), or batch syncs that run weekly for historical analysis. CloudQuery supports full sync mode that replaces all data each run and incremental sync mode that appends new data while preserving historical records. We recommend daily syncs for cost data since billing information updates daily, and hourly syncs for asset inventory when you need up-to-date security and compliance visibility.
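For example, a daily sync can be a single cron entry that runs the combined configuration from Step 6 every night at 02:00 (the /opt/cloudquery working directory is a placeholder, and the cloudquery binary plus cloud credentials must be available to the cron environment):
0 2 * * * cd /opt/cloudquery && cloudquery sync multi-cloud-sync.yml >> sync.log 2>&1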

Can you query costs and assets together with CloudQuery? #

Yes, absolutely. CloudQuery syncs both cost data like AWS Cost and Usage Report, GCP Billing Export, and Azure Cost Management data along with asset metadata like EC2 instances, S3 buckets, and virtual machines into the same database. You can write SQL joins between cost tables and resource tables to answer questions like "Which unattached EBS volumes are costing us money?" or "What's the total cost of production instances across AWS and GCP?" This correlation capability doesn't exist when using native cloud tools separately. You can join billing line items with resource attributes like tags, regions, and configurations to understand exactly what's driving costs.
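A minimal sketch for the first question joins CUR line items to unattached EBS volumes. It assumes resource IDs are enabled in your CUR export and that the resource ID column follows the same line_item_* naming convention as the columns used earlier; check your aws_cur_line_items schema before relying on it:
SELECT
  v.volume_id,
  v.size as size_gb,
  ROUND(SUM(c.line_item_blended_cost), 2) as cost_last_30_days
FROM `YOUR_PROJECT_ID.multi_cloud_data.aws_ec2_ebs_volumes` v
JOIN `YOUR_PROJECT_ID.multi_cloud_data.aws_cur_line_items` c
  -- CUR resource IDs can be bare IDs or full ARNs, so match on the suffix
  ON ENDS_WITH(c.line_item_resource_id, v.volume_id)
WHERE v.state = 'available'
  AND DATE(c.line_item_usage_start_date) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY v.volume_id, v.size
ORDER BY cost_last_30_days DESC;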

What BI tools work with BigQuery and CloudQuery? #

Any BigQuery-compatible tool works with your CloudQuery data. Popular options include Looker, Tableau, Google Data Studio, Metabase, Apache Superset, Power BI using ODBC connectors, Grafana, and Redash. CloudQuery also offers pre-built Grafana dashboards for common use cases like security posture monitoring and cost optimization. Since your data stays in BigQuery, you control access permissions and can use any visualization layer your team prefers. You can also export data to Google Sheets for quick ad hoc analysis or build custom applications that query BigQuery directly.

How do I handle cost data for deleted resources? #

Use CloudQuery's append write mode instead of overwrite in your configuration. This preserves historical records even after resources are terminated in your cloud accounts. For example, if you delete an EC2 instance, its cost history remains in BigQuery for trend analysis and chargebacks to teams. Combine this with BigQuery's time partitioning to query historical data efficiently without scanning entire tables. This is particularly useful for monthly cost reports and year-over-year comparisons. The _cq_sync_time column that CloudQuery adds to every row records when each record was synced, which helps identify when a resource stopped appearing and was likely deleted.
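For example, with daily appends, an instance that stops appearing in new syncs was most likely deleted. A rough sketch:
-- Instances whose most recent appearance predates the latest sync
SELECT
  instance_id,
  MAX(_cq_sync_time) as last_seen
FROM `YOUR_PROJECT_ID.multi_cloud_data.aws_ec2_instances`
GROUP BY instance_id
HAVING MAX(_cq_sync_time) < (
  SELECT MAX(_cq_sync_time)
  FROM `YOUR_PROJECT_ID.multi_cloud_data.aws_ec2_instances`
)
ORDER BY last_seen;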

What is the free tier limit for CloudQuery? #

CloudQuery offers a free tier for up to 1 million rows synced per month using any of the pre-built plugins from CloudQuery Hub. For most small to medium environments, this is sufficient to get started with the AWS, GCP, and Azure integrations covered in this guide. Premium plans start at $300 per month for 1 billion rows synced and include priority support. Enterprise plans offer custom pricing for unlimited scale along with dedicated support and custom integrations. While CloudQuery is open source and you can build custom plugins yourself, using the official Hub integrations gives you production-ready connectors with ongoing updates and support.
