Getting Started with AWS

Download and Install

You can download a precompiled binary from the releases page, or install it from the command line using the commands for your platform:

Linux (x86_64):

curl -L https://versions.cloudquery.io/latest/v1/cloudquery_linux_x86_64 -o cloudquery
chmod a+x cloudquery

Linux (arm64):

curl -L https://versions.cloudquery.io/latest/v1/cloudquery_linux_arm64 -o cloudquery
chmod a+x cloudquery

Homebrew:

brew install cloudquery/tap/cloudquery
 
# After initial install you can upgrade the version via:
# brew upgrade cloudquery

macOS (x86_64):

curl -L https://versions.cloudquery.io/latest/v1/cloudquery_darwin_x86_64 -o cloudquery
chmod a+x cloudquery

macOS (arm64):

curl -L https://versions.cloudquery.io/latest/v1/cloudquery_darwin_arm64 -o cloudquery
chmod a+x cloudquery

Windows (curl):

curl -L https://versions.cloudquery.io/latest/v1/cloudquery_windows_x86_64.exe -o cloudquery.exe

Windows (PowerShell):

Invoke-WebRequest https://versions.cloudquery.io/latest/v1/cloudquery_windows_x86_64.exe -OutFile cloudquery.exe
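On Linux and macOS, the right URL can also be chosen from the output of uname. The following is a minimal sketch, not part of CloudQuery: the function name cq_download_url is hypothetical, and it assumes the URL pattern shown in the commands above holds for each OS and architecture pair.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with CloudQuery): map an OS name and a CPU
# architecture, as reported by `uname -s` and `uname -m`, to one of the
# download URLs listed in this guide.
cq_download_url() {
  os="$1"
  arch="$2"
  case "$os" in
    Linux)  os=linux ;;
    Darwin) os=darwin ;;
    *) echo "unsupported OS: $os" >&2; return 1 ;;
  esac
  case "$arch" in
    x86_64|amd64)  arch=x86_64 ;;
    arm64|aarch64) arch=arm64 ;;   # Linux reports aarch64, macOS reports arm64
    *) echo "unsupported architecture: $arch" >&2; return 1 ;;
  esac
  echo "https://versions.cloudquery.io/latest/v1/cloudquery_${os}_${arch}"
}

# Usage (commented out so that sourcing this file downloads nothing):
# curl -L "$(cq_download_url "$(uname -s)" "$(uname -m)")" -o cloudquery
# chmod a+x cloudquery
```

Windows is left out on purpose, since uname is usually not available there; use the Windows commands above instead.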

Running

Init command

After installing CloudQuery, generate a cloudquery.yml file that describes which cloud provider you want to use and which resources you want CloudQuery to extract, transform, and load (ETL):

cloudquery init aws
 
# cloudquery init aws gcp # This will generate a config containing aws and gcp providers
# cloudquery init --help # Show all possible auto generated configs and flags

All official and approved community plugins are listed here with their respective documentation.

Spawn or connect to a Database

CloudQuery needs a PostgreSQL database (>=10). You can either spawn a local one (usually good for development and local testing) or connect to an existing one.

By default, cloudquery tries to connect to the database named postgres on localhost:5432 with username postgres and password pass. After installing Docker, you can create such a local PostgreSQL instance with:

docker run --name cloudquery_postgres -p 5432:5432 -e POSTGRES_PASSWORD=pass -d postgres
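The container may need a few seconds before it accepts connections. The sketch below shows one way to wait for it; the retry function is a hypothetical helper written for this guide, and the usage line assumes the pg_isready utility is present in the postgres image.

```shell
#!/bin/sh
# retry: run a command up to N times, one second apart, until it succeeds.
# Hypothetical helper, not part of CloudQuery or Docker.
# Usage: retry <attempts> <command> [args...]
retry() {
  attempts="$1"
  shift
  i=1
  while :; do
    "$@" >/dev/null 2>&1 && return 0      # command succeeded, stop waiting
    [ "$i" -ge "$attempts" ] && return 1  # out of attempts, give up
    i=$((i + 1))
    sleep 1
  done
}

# Usage (commented out; needs the container started above):
# retry 30 docker exec cloudquery_postgres pg_isready -U postgres
```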

If you are running PostgreSQL at a different location or with different credentials, you need to edit cloudquery.yml as described under Connect to an Existing Database.

Connect to an Existing Database

CloudQuery connects to the PostgreSQL database defined in the connection section of cloudquery.yml. Edit this section to configure the location and credentials of your database.

cloudquery:
  ...
  ...
 
  connection:
    type: postgres
    username: postgres
    password: pass
    host: localhost
    port: 5432
    database: postgres
    sslmode: disable
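Before running CloudQuery, it can help to confirm that these values actually reach the database. The sketch below assembles the connection fields into a PostgreSQL connection URI of the same form that psql accepts. The function name pg_uri is hypothetical, and the sketch assumes none of the values contain characters that need percent-encoding.

```shell
#!/bin/sh
# pg_uri: assemble a PostgreSQL connection URI from the connection fields.
# Hypothetical helper. Arguments, in order:
#   username password host port database sslmode
pg_uri() {
  printf 'postgres://%s:%s@%s:%s/%s?sslmode=%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Usage (commented out; needs psql and a reachable database):
# psql "$(pg_uri postgres pass localhost 5432 postgres disable)" -c 'SELECT 1'
```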

Authenticate with AWS

CloudQuery needs to be authenticated with your AWS account in order to fetch information about your cloud setup.

Note: CloudQuery requires only read permissions (we will never make any changes to your cloud setup). Attaching the ReadOnlyAccess policy to the user or role CloudQuery runs as should work for the most part, but you can narrow it further to grant read-only access to just the resources that you want CloudQuery to fetch. See also this blog post.

There are multiple ways to authenticate with AWS, and CloudQuery respects the AWS credential provider chain. This means that CloudQuery tries the following credential sources, in this order:

  • AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables.
  • The credentials and config files in the ~/.aws folder (in that order of priority).
  • IAM roles for AWS compute resources (including EC2 instances, Fargate and ECS containers).

You can find more info about AWS authentication here and here.
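To see which of these sources is likely to be picked up on your machine, you can check them in the same order. The sketch below is a simplification that mirrors only the three steps listed above; the real AWS credential provider chain has more cases, and the function name aws_credential_source is hypothetical.

```shell
#!/bin/sh
# aws_credential_source: report the first credential source found, checking
# in the order listed above. Hypothetical helper; it only checks that the
# variables or files exist, not that the credentials are valid.
aws_credential_source() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "environment variables"
  elif [ -f "$HOME/.aws/credentials" ]; then
    echo "credentials file"
  elif [ -f "$HOME/.aws/config" ]; then
    echo "config file"
  else
    # Nothing found locally; on AWS compute an attached IAM role would apply.
    echo "IAM role (if running on AWS compute)"
  fi
}

# Usage (commented out):
# aws_credential_source
```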

CloudQuery can use the credentials from the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables (AWS_SESSION_TOKEN can be optional for some accounts). For information about obtaining credentials, see the AWS guide.

Linux and macOS (bash, zsh):

export AWS_ACCESS_KEY_ID={Your AWS Access Key ID}
export AWS_SECRET_ACCESS_KEY={Your AWS secret access key}
export AWS_SESSION_TOKEN={Your AWS session token}

Windows (CMD):

SET AWS_ACCESS_KEY_ID={Your AWS Access Key ID}
SET AWS_SECRET_ACCESS_KEY={Your AWS secret access key}
SET AWS_SESSION_TOKEN={Your AWS session token}

Windows (PowerShell):

$Env:AWS_ACCESS_KEY_ID="{Your AWS Access Key ID}"
$Env:AWS_SECRET_ACCESS_KEY="{Your AWS secret access key}"
$Env:AWS_SESSION_TOKEN="{Your AWS session token}"

CloudQuery can use credentials from your credentials and config files in the .aws directory in your home folder. The contents of these files are practically interchangeable, but CloudQuery will prioritize credentials in the credentials file.

For information about obtaining credentials, see the AWS guide.

Here are example contents for a credentials file:

~/.aws/credentials
[default]
aws_access_key_id = <YOUR_ACCESS_KEY_ID>
aws_secret_access_key = <YOUR_SECRET_ACCESS_KEY>

You can also specify credentials for a different profile, and instruct cloudquery to use the credentials from this profile instead of the default one.

For example:

~/.aws/credentials
[myprofile]
aws_access_key_id = <YOUR_ACCESS_KEY_ID>
aws_secret_access_key = <YOUR_SECRET_ACCESS_KEY>

Then, you can either export the AWS_PROFILE environment variable:

export AWS_PROFILE=myprofile

Or configure the desired profile in the local_profile field of cloudquery.yml:

cloudquery.yml
providers:
  - name: "aws"
    configuration:
      accounts:
        - name: "<YOUR_ID>"
          local_profile: "myprofile"
    ...
  ...

CloudQuery can use IAM roles for AWS compute resources (including EC2 instances, Fargate and ECS containers). If you configured your AWS compute resources with IAM roles, CloudQuery uses them automatically, so you don't need to specify additional credentials manually. For more information on configuring IAM, see the AWS docs here and here.

Multi Account/Organization Access

If you have multiple AWS accounts/organizations, you can follow the steps set in the cq-provider-aws README.

Fetch Command

Once cloudquery.yml is generated and you are authenticated with AWS, run the following command to fetch the resources.

cloudquery fetch
# cloudquery fetch --help # Show all possible fetch flags

Exploring and Running Queries

Once CloudQuery has fetched the resources, you can explore your cloud infrastructure with SQL.

You can use psql to connect to your PostgreSQL instance (change the connection string to match the location and credentials of your database):

psql "postgres://postgres:pass@localhost:5432/postgres?sslmode=disable"

If you opted to run the PostgreSQL server in a Docker container as described above, you can also run psql directly inside the container instead of installing it on your machine:

docker exec -it cloudquery_postgres psql -U postgres

Schema and tables for AWS are available here.

A few example queries for AWS:

List EC2 images:

SELECT * FROM aws_ec2_images;

Find all public-facing AWS load balancers:

SELECT * FROM aws_elbv2_load_balancers WHERE scheme = 'internet-facing';
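Queries like these can also be run without opening an interactive session, which is handy in scripts. The sketch below wraps psql in a small function. The names cq_query and CQ_DSN are hypothetical, and the default connection string is the local one used earlier in this guide.

```shell
#!/bin/sh
# Connection string from this guide. Set CQ_DSN (a hypothetical variable
# name) beforehand to point at your own database.
CQ_DSN="${CQ_DSN:-postgres://postgres:pass@localhost:5432/postgres?sslmode=disable}"

# cq_query: run a single SQL statement with psql and print the result.
# ON_ERROR_STOP makes psql exit with a non-zero status if the statement fails.
cq_query() {
  psql "$CQ_DSN" -v ON_ERROR_STOP=1 -c "$1"
}

# Usage (commented out; needs psql and a database populated by cloudquery fetch):
# cq_query "SELECT * FROM aws_elbv2_load_balancers WHERE scheme = 'internet-facing';"
```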

CloudQuery Policies

CloudQuery Policies allow users to write security, governance, cost, and compliance rules with SQL, and run them with psql. You can read more about policies here.

Next Steps

Visit the AWS plugin documentation to read more about it, explore the supported tables and learn about advanced configurations.