How to Optimize Resource Management with AWS ElastiCache Cluster Monitoring
Introduction #
- Resource Optimization: Monitoring node counts and types helps you identify underutilized or over provisioned clusters, allowing you to adjust resources and optimize costs.
- Performance Management: Changes in node count and types can impact the performance of your applications. Tracking these changes helps ensure your clusters are properly scaled to meet demand.
- Historical Analysis: By tracking node count and type changes over time, you can analyze trends and predict future resource requirements.
aws_elasticache_clusters
table is included in your AWS source configuration. Set the write_mode
to append
in your PostgreSQL destination configuration to track historical states and changes in the number of nodes and types over time.How to Track ElasticCache Node Count Changes #
-- CTE: timeseriesdata
with timeseriesdata as (
select
_cq_sync_time as time,
replication_group_id,
region,
account_id,
count(*) as node_count
from
aws_elasticache_clusters
where
replication_group_id != '' or replication_group_id is not null
group by
time,
account_id,
region,
replication_group_id
),
-- CTE: sum_changes
sum_changes as (
select
sum(timeseriesdata.node_count) as count,
account_id,
region,
replication_group_id,
time
from
timeseriesdata
group by
time,
account_id,
region,
replication_group_id
),
-- CTE: identify_changes
identify_changes as (
select
replication_group_id
from
sum_changes
group by account_id, region, replication_group_id
having
count(distinct count) > 1
)
-- Final Selection
select
timeseriesdata.*
from
timeseriesdata
where
timeseriesdata.replication_group_id in (select distinct replication_group_id from identify_changes)
order by
time;
Steps #
- CTE:
timeseriesdata
- This CTE aggregates the number of nodes in each cluster over time.
- It counts the number of nodes (
node_count
) in each cluster and groups the data bytime
,account_id
,region
, andreplication_group_id
. - The where clause ensures that only clusters with a valid
replication_group_id
are considered, ensuring that we focus on clusters that are part of a replication group.
- CTE:
sum_changes
- This CTE calculates the total node count for each cluster at each point in time.
- It sums the
node_count
from thetimeseriesdata
CTE and groups the results bytime
,account_id
,region
, andreplication_group_id
.
- CTE:
identify_changes
- This CTE identifies clusters that have experienced changes in node count.
- It groups the data by
account_id
,region
, andreplication_group_id
. - The having clause ensures that only clusters with more than one distinct node count over time are selected, indicating a change in node count.
- Final Selection
- The final select statement retrieves all columns from the
timeseriesdata
CTE. - It filters the results to include only the clusters identified in the
identify_changes
CTE. - The order by clause sorts the results by time to provide a chronological view of the node count changes.
Example Result #
replication_group_id
at a given time. This allows you to see how the node count has varied over time for each replication group, helping you identify trends and make informed decisions about resource allocation.How to Track New ElasticCache Node Types Each Week #
write_mode: append
to save the historical states.-- CTE: cluster_weeks
with cluster_weeks as (
select
date_trunc('week', _cq_sync_time) as week,
cache_node_type,
region,
account_id
from
aws_elasticache_clusters
where
cache_node_type != ''
group by
week,
cache_node_type,
region,
account_id
),
-- CTE: new_types
new_types as (
select
current_week.week as current_week,
current_week.cache_node_type,
current_week.region,
current_week.account_id
from
cluster_weeks current_week
left join
cluster_weeks previous_week
on
previous_week.week = current_week.week - interval '1 week'
and previous_week.cache_node_type = current_week.cache_node_type
and previous_week.region = current_week.region
and previous_week.account_id = current_week.account_id
where
previous_week.cache_node_type is null
)
-- Final Selection
select
current_week,
cache_node_type,
region,
account_id
from
new_types
order by
current_week, cache_node_type, region, account_id;
- CTE:
cluster_weeks
- This CTE aggregates the node types for each cluster by week.
- It selects the synchronization time (
_cq_sync_time
) truncated to the start of - the week (week), along withcache_node_type
,region
, andaccount_id
. - It
groups
the data byweek
,cache_node_type
,region
, andaccount_id
.
- CTE:
new_types
- This CTE identifies the new node types introduced each week by comparing the current week to the previous week.
- It selects data from
cluster_weeks
for the current week (current_week
) and performs a left join withcluster_weeks
for the previous week (previous_week) based on week,cache_node_type
, region, andaccount_id
. - The where clause ensures that only node types not present in the previous week are included.
- Final Selection
- The final select statement retrieves the
current_week
,cache_node_type
,region
, andaccount_id
from thenew_types
CTE. - The order by clause sorts the results by
current_week
,cache_node_type
,region
, andaccount_id
.
Example Result #
How to Track New ElastiCache Node Types by Day #
WITH cluster_days AS (
SELECT
date_trunc('day', _cq_sync_time) AS day,
cache_node_type,
region,
account_id
FROM
aws_elasticache_clusters
WHERE
cache_node_type != '' -- Filter out empty cluster types
GROUP BY
day,
cache_node_type,
region,
account_id
),
new_types AS (
SELECT
current_day.day AS current_day,
current_day.cache_node_type,
current_day.region,
current_day.account_id
FROM
cluster_days current_day
LEFT JOIN
cluster_days previous_day
ON
previous_day.day = current_day.day - interval '1 day'
AND previous_day.cache_node_type = current_day.cache_node_type
AND previous_day.region = current_day.region
AND previous_day.account_id = current_day.account_id
WHERE
previous_day.cache_node_type IS NULL -- Identify new types
)
SELECT
current_day,
cache_node_type,
region,
account_id
FROM
new_types
ORDER BY
current_day, cache_node_type, region, account_id;
Summary #

Written by Ron Shemesh
Ron is a Senior Software Engineer at CloudQuery and has a background in data science and now works as a senior software engineer at CloudQuery. He has experience working in Python, SQL, React, Java and R. At CloudQuery, Ron has worked on our range of integrations and several projects foundational to platform performance. He loves taking on a challenge and using it to improve his skills.