Scheduling CloudQuery Syncs with Google Cloud Run and Cloud Scheduler
Google Cloud Run is a managed compute platform that runs stateless containers invoked via web requests or Pub/Sub events. It provides a serverless way to run CloudQuery syncs on Google Cloud without managing infrastructure.
Cloud Run imposes a maximum request timeout of 60 minutes for HTTP-triggered services. If your sync takes longer than this, consider deploying to a virtual machine or using Cloud Composer instead.
Prerequisites
- A Google Cloud project with billing enabled
- Google Cloud CLI (`gcloud`) installed and configured
- A CloudQuery API key (generate one here)
- A destination database accessible from Cloud Run (e.g., Cloud SQL, BigQuery)
How it works
Cloud Run containers must accept incoming HTTP connections on a configurable port (8080 by default). To run CloudQuery — which is a CLI tool, not a web server — you need a wrapper that:
- Listens for incoming HTTP requests
- Runs `cloudquery sync` when a request is received
- Returns the result
Cloud Scheduler triggers this endpoint on a cron schedule to run syncs automatically.
Step 1: Create the configuration file
Create a config.yml with your source and destination configuration:
```yaml
kind: source
spec:
  name: gcp
  path: cloudquery/gcp
  registry: cloudquery
  version: "v21.4.0"
  tables: ["gcp_compute_*"]
  destinations: ["postgresql"]
---
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  registry: cloudquery
  version: "v8.14.6"
  spec:
    connection_string: "${PG_CONNECTION_STRING}"
```

Step 2: Create the Dockerfile
Create a Dockerfile that wraps CloudQuery with a web server. The cloudquery/cloudrun-example repository provides a complete example. The key pattern is:
```dockerfile
FROM ghcr.io/cloudquery/cloudquery:latest
COPY config.yml /config.yml
COPY server.sh /server.sh
RUN chmod +x /server.sh
ENTRYPOINT ["/server.sh"]
```

The `server.sh` script should start a lightweight HTTP server that triggers `cloudquery sync /config.yml` on incoming requests and returns the exit code as the HTTP response.
See the cloudquery/cloudrun-example repository for a complete, working implementation of the wrapper server.
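As a rough illustration of the wrapper pattern (this is a sketch only, not the example repository's implementation: it assumes `socat` is available in the image, and a production wrapper should also guard against overlapping syncs):

```shell
#!/bin/sh
# server.sh (sketch): accept HTTP connections and run one sync per request.
# Assumes socat is installed in the container image (an assumption, not part
# of the base image's documented contents).
PORT="${PORT:-8080}"   # Cloud Run injects PORT; fall back to 8080 locally

# socat connects each client socket to the stdin/stdout of the shell command,
# so the printf output below becomes the HTTP response. Sync output is sent
# to stderr (>&2) so it lands in the container logs, not the response body.
exec socat TCP-LISTEN:"$PORT",reuseaddr,fork SYSTEM:'
  if cloudquery sync /config.yml >&2; then
    printf "HTTP/1.1 200 OK\r\n\r\nsync succeeded\n"
  else
    printf "HTTP/1.1 500 Internal Server Error\r\n\r\nsync failed\n"
  fi
'
```

The `fork` option lets `socat` keep listening after each connection, which is what keeps the container alive between scheduled invocations.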
Step 3: Build and push the container
Create an Artifact Registry repository if you don’t have one:
```shell
gcloud artifacts repositories create <YOUR_REPOSITORY> \
  --repository-format docker \
  --location <YOUR_REGION>
```

Build and push the container image:
```shell
# Configure Docker authentication for Artifact Registry
gcloud auth configure-docker <YOUR_REGION>-docker.pkg.dev

# Build the container
docker build -t <YOUR_REGION>-docker.pkg.dev/<YOUR_PROJECT_ID>/<YOUR_REPOSITORY>/cloudquery-sync .

# Push to Artifact Registry
docker push <YOUR_REGION>-docker.pkg.dev/<YOUR_PROJECT_ID>/<YOUR_REPOSITORY>/cloudquery-sync
```

Step 4: Deploy to Cloud Run
```shell
gcloud run deploy cloudquery-sync \
  --image <YOUR_REGION>-docker.pkg.dev/<YOUR_PROJECT_ID>/<YOUR_REPOSITORY>/cloudquery-sync \
  --region <YOUR_REGION> \
  --no-allow-unauthenticated \
  --timeout 3600 \
  --memory 2Gi \
  --set-env-vars "CLOUDQUERY_API_KEY=<YOUR_API_KEY>,PG_CONNECTION_STRING=<YOUR_CONNECTION_STRING>"
```

Key flags:
- `--no-allow-unauthenticated`: Restricts access to authenticated callers only (Cloud Scheduler will use a service account)
- `--timeout 3600`: Sets the maximum request timeout to 60 minutes
- `--memory 2Gi`: Allocates enough memory for the sync process. Adjust based on the number of tables being synced.
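Before wiring up the scheduler, you can trigger the deployed service manually with your own identity token (this assumes your account has the Cloud Run Invoker role on the service):

```shell
# Look up the service URL, then call it with an identity token
SERVICE_URL="$(gcloud run services describe cloudquery-sync \
  --region <YOUR_REGION> --format 'value(status.url)')"

curl -s -H "Authorization: Bearer $(gcloud auth print-identity-token)" "$SERVICE_URL"
```

A `200` response here confirms that authentication and the wrapper are working before you automate anything.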
Step 5: Schedule with Cloud Scheduler
Create a Cloud Scheduler job that triggers the Cloud Run service on a cron schedule:
```shell
gcloud scheduler jobs create http cloudquery-daily-sync \
  --location <YOUR_REGION> \
  --schedule "0 3 * * *" \
  --uri "$(gcloud run services describe cloudquery-sync --region <YOUR_REGION> --format 'value(status.url)')" \
  --http-method GET \
  --oidc-service-account-email <YOUR_SERVICE_ACCOUNT>@<YOUR_PROJECT_ID>.iam.gserviceaccount.com \
  --oidc-token-audience "$(gcloud run services describe cloudquery-sync --region <YOUR_REGION> --format 'value(status.url)')"
```

This schedules a sync every day at 3 a.m. The `--oidc-service-account-email` flag ensures the request is authenticated using the specified service account.
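To verify the setup without waiting for the next scheduled run, you can force-run the job and then inspect its state:

```shell
# Trigger the job immediately (does not affect the regular schedule)
gcloud scheduler jobs run cloudquery-daily-sync --location <YOUR_REGION>

# Inspect the job's configuration and last attempt
gcloud scheduler jobs describe cloudquery-daily-sync --location <YOUR_REGION>
```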
Authentication
For GCP source integrations, Cloud Run services automatically receive a service account identity. Grant the service account the required read-only permissions for the resources you want to sync:
```shell
# Example: grant Viewer role for GCP source integration
gcloud projects add-iam-policy-binding <YOUR_PROJECT_ID> \
  --member "serviceAccount:<YOUR_SERVICE_ACCOUNT>@<YOUR_PROJECT_ID>.iam.gserviceaccount.com" \
  --role "roles/viewer"
```

For non-GCP sources (AWS, Azure, etc.), pass the required credentials as environment variables via `--set-env-vars` or use Secret Manager.
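A sketch of the Secret Manager route, which keeps the connection string out of the service's visible environment configuration (the secret name `pg-connection-string` is illustrative):

```shell
# Store the connection string in Secret Manager
printf '%s' '<YOUR_CONNECTION_STRING>' | \
  gcloud secrets create pg-connection-string --data-file=-

# Allow the Cloud Run service account to read it
gcloud secrets add-iam-policy-binding pg-connection-string \
  --member "serviceAccount:<YOUR_SERVICE_ACCOUNT>@<YOUR_PROJECT_ID>.iam.gserviceaccount.com" \
  --role "roles/secretmanager.secretAccessor"

# Expose the secret to the service as an env var, replacing --set-env-vars
gcloud run services update cloudquery-sync \
  --region <YOUR_REGION> \
  --update-secrets "PG_CONNECTION_STRING=pg-connection-string:latest"
```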
Caveats
- Timeout limit: Cloud Run HTTP-triggered services have a maximum timeout of 60 minutes. Syncs that take longer will be terminated. Use table filtering to reduce sync time, or switch to a VM-based deployment for large estates.
- Stateless execution: Each invocation starts fresh. Integration binaries are downloaded on every run unless you bake them into the Docker image. See the Docker caching guide for strategies to reduce startup time.
- Cold starts: Cloud Run may scale to zero between invocations, adding startup latency. For scheduled syncs this is usually acceptable.
Next Steps
- Google Cloud VM Deployment - Alternative GCP deployment with persistent VMs
- Performance Tuning - Optimize sync speed within Cloud Run timeout limits
- Monitoring - Set up observability for Cloud Run syncs
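As a starting point for the monitoring step, recent sync output can be pulled from the service's logs; the filter below assumes the `cloudquery-sync` service name used throughout this guide:

```shell
# Read the most recent Cloud Run logs for the sync service
gcloud logging read \
  'resource.type="cloud_run_revision" AND resource.labels.service_name="cloudquery-sync"' \
  --limit 50 --format 'value(textPayload)'
```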