AWS Cost Transformation is Now Available!

Kevin Rheinheimer Feb 29, 2024

In need of a one-stop solution for tracking your AWS costs and usage?

Finding valuable, actionable data across the many available AWS sources can be a daunting task. Each source is valuable on its own, but because they are disparate, it is difficult to find and house your usage, cost, and utilization data in one place.
That is why we have developed a new AWS Cost Policy. This policy contains transformations that leverage both CloudQuery-provided tables and your readily available Cost and Usage Report from AWS. When generating this report, please ensure that the "Include resource IDs" option is selected for compatibility.
We have simplified this process by using CloudQuery's AWS source, File source, and Postgres destination plugins.
With this new set of transformations, you can check that your usage and any costs incurred are optimized to your needs.

What are some use cases for this policy?

  • Monitoring costs - quickly get a clear snapshot of the costs incurred on different AWS resources, so you can decide how to optimize and configure those resources to your needs
  • Monitoring usage - ensure that the services you have activated in your AWS accounts are being utilized properly
  • Compute optimization - gain actionable recommendations at a glance from the AWS cost optimizer data made available in your CloudQuery tables
  • Usage optimization - gain actionable insights at a glance from the tables created by the policy
 

Let's walk through a few use case examples together

Set up

For this policy to run successfully, you must sync your AWS metadata with the CloudQuery AWS source plugin and your Cost and Usage data with the File source plugin, both into a PostgreSQL destination, as mentioned previously.
Download the CloudQuery CLI and use the configs below to sync the data required for this policy. Don't forget to set the path to your report files and the connection string for your destination database. Note: The files_dir setting in the file source below should be a directory on your machine or instance.
kind: source
spec:
  name: file
  version: v1.2.1
  destinations: ["postgresql"]
  path: cloudquery/file
  registry: cloudquery
  tables: ["*"]
  spec:
    files_dir: "<path-to-your-aws-cost-and-usage-reports>"
---
kind: source
spec:
  name: aws
  version: v24.3.3
  destinations: ["postgresql"]
  path: cloudquery/aws
  tables: ["*"]
---
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  version: "v7.3.6"
  spec:
    connection_string: postgresql://postgres:pass@localhost:5432/postgres
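Before syncing, it can help to confirm that the value you plan to use for files_dir is a real directory that contains report files. The following is a minimal sketch, not part of CloudQuery: the find_report_files helper is hypothetical, and it assumes your Cost and Usage Reports are stored as CSV or Parquet files.

```python
import tempfile
from pathlib import Path

# Assumption: the reports are stored as CSV or Parquet files.
REPORT_SUFFIXES = (".csv", ".parquet")


def find_report_files(files_dir):
    """Return report files under files_dir, or raise if it is not a directory."""
    root = Path(files_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"files_dir is not a directory: {root}")
    # Search recursively and keep only files with a recognized suffix.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in REPORT_SUFFIXES
    )


# Demo against a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cur-00001.csv").write_text("placeholder")
    (Path(tmp) / "notes.txt").write_text("placeholder")
    found = [p.name for p in find_report_files(tmp)]
    print(found)
```

In practice you would call find_report_files with the same path you put in files_dir and check that the returned list is not empty.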
Run the sync with the cloudquery sync config.yaml command.
Now that everything is configured and synced, run the transformation from the cost policy directory using the command:
dbt run --vars '{"cost_usage_table": "<your table name here>"}'
Once that command has run successfully, you should see views starting with aws_cost__ or aws_usage in your database. For the full list of views and what each one contains, see the policy documentation.
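If you launch dbt from a script, quoting the JSON passed to --vars by hand is easy to get wrong. Here is a small sketch that builds the same dbt run invocation with Python's standard library; build_dbt_command is a hypothetical helper and my_cur_table is a placeholder table name, not something defined by the policy.

```python
import json
import shlex


def build_dbt_command(cost_usage_table):
    """Return the argument list for dbt run with cost_usage_table set."""
    # json.dumps produces the JSON object that --vars expects.
    dbt_vars = json.dumps({"cost_usage_table": cost_usage_table})
    return ["dbt", "run", "--vars", dbt_vars]


argv = build_dbt_command("my_cur_table")  # placeholder table name
# shlex.join shows the equivalent, safely quoted shell command.
print(shlex.join(argv))
```

The argument list can be handed to subprocess.run(argv, check=True) from the cost policy directory, which avoids shell quoting altogether.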

Example queries

Top 5 highest spending accounts
Let's say you have a multi-tenant setup and you want to see which accounts are incurring the most cost. The view aws_cost__by_account houses this information. Here is how you can get the 5 accounts that have incurred the most cost:
select * from aws_cost__by_account
order by cost desc
limit 5;
This query will show you the identifier for each of the top spending accounts alongside how much those accounts have spent. Check the image below to view a sample result from this query:
aws-cost-by-account
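To see the shape of this query without a database at hand, here is a self-contained sketch that runs it against an in-memory SQLite table standing in for the view. Everything in it is illustrative: only the cost column is confirmed by the query above, the account_id column name is an assumption, and the rows are made-up sample data. Against your real destination you would run the same SQL through a PostgreSQL client.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the aws_cost__by_account view.
# Assumption: account_id is a guessed column name; cost comes from the query.
conn.execute("CREATE TABLE aws_cost__by_account (account_id TEXT, cost REAL)")
conn.executemany(
    "INSERT INTO aws_cost__by_account VALUES (?, ?)",
    [
        ("111111111111", 1200.50),  # made-up sample rows
        ("222222222222", 310.00),
        ("333333333333", 2750.25),
        ("444444444444", 45.10),
        ("555555555555", 980.00),
        ("666666666666", 12.75),
    ],
)


def top_accounts(conn, n=5):
    """Return the n accounts with the highest cost, most expensive first."""
    query = (
        "SELECT account_id, cost FROM aws_cost__by_account "
        "ORDER BY cost DESC LIMIT ?"
    )
    return conn.execute(query, (n,)).fetchall()


for account_id, cost in top_accounts(conn):
    print(f"{account_id}  {cost:.2f}")
```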
Finding under-utilized resources
Another common stumbling block to gaining clear insight into your AWS costs is finding under-utilized resources in your AWS accounts that may be incurring costs.
EC2 is among the most widely used AWS services. However, it can be difficult to keep track of all your EC2 instances and confirm that each one is optimized and performing as expected. With this new policy, you will have access to utilization data so you can be confident that your instances are utilized properly.
Here is a ready-made query to unearth potentially under-utilized instances:
SELECT *
FROM aws_cost__by_under_utilized_resources
WHERE service = 'EC2'
ORDER BY cost DESC
LIMIT 10;
aws-cost-by-under-utilized
This query is slightly more granular than our previous example. You will receive some specific information about your EC2 instances:
  • service name - defaulted to 'EC2' for this example
  • resource id - this is the identifier for each specific EC2 instance you have
  • metric - this is the actual statistic being tracked; for EC2 instances, that statistic is CPU utilization
  • mean usage - this is the average CPU utilization for each EC2 instance
  • max usage - this is the highest CPU utilization for each EC2 instance
  • cost - this is the total cost incurred by the EC2 instance
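One way to act on these fields is to keep only the instances whose peak CPU utilization stays below a cutoff you choose and rank them by cost. The sketch below does this against an in-memory SQLite table standing in for the view. The column names (resource_id, mean_usage, max_usage) are assumptions based on the field descriptions above, the rows are made-up sample data, and the 20 percent cutoff is an arbitrary example rather than a recommendation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for aws_cost__by_under_utilized_resources; column names are assumed.
conn.execute(
    """CREATE TABLE aws_cost__by_under_utilized_resources (
           service TEXT, resource_id TEXT, metric TEXT,
           mean_usage REAL, max_usage REAL, cost REAL)"""
)
conn.executemany(
    "INSERT INTO aws_cost__by_under_utilized_resources VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("EC2", "i-example-a", "CPUUtilization", 3.0, 12.0, 410.50),  # sample data
        ("EC2", "i-example-b", "CPUUtilization", 55.0, 97.0, 900.00),
        ("EC2", "i-example-c", "CPUUtilization", 1.5, 8.0, 120.25),
        ("RDS", "db-example", "CPUUtilization", 2.0, 5.0, 300.00),
    ],
)


def costly_idle_instances(conn, max_usage_below, limit=10):
    """Return EC2 rows whose peak usage is under the cutoff, costliest first."""
    query = """
        SELECT resource_id, mean_usage, max_usage, cost
        FROM aws_cost__by_under_utilized_resources
        WHERE service = 'EC2' AND max_usage < ?
        ORDER BY cost DESC
        LIMIT ?
    """
    return conn.execute(query, (max_usage_below, limit)).fetchall()


for row in costly_idle_instances(conn, max_usage_below=20.0):
    print(row)
```

Filtering on max usage rather than mean usage is a deliberately conservative choice here: an instance that never peaks above the cutoff is a safer candidate for downsizing than one with a low average but occasional spikes.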
Visualizing your data
Along with the transformations, we have included a suite of visualizations using Grafana that leverage your newly transformed AWS costs. You can find documentation on how to set up your own visualization here. For now, let's check out some examples below:
  • General cost dashboard (image: general-cost-dashboard)
  • Cost per service (image: cost-per-service)
  • Cost trends (image: cost_trend_over_time)
 

Questions? Feedback?

We are always eager to hear feedback. Let us know what you think. File feature requests, bugs, and issues at github.com/cloudquery/cloudquery or join our Discord.