aws
solutions
transformations
How to Optimize Resource Management with AWS ElastiCache Cluster Monitoring
Ron Shemesh •
Introduction #
Managing your cloud infrastructure efficiently is crucial for ensuring optimal performance and cost-effectiveness. AWS ElastiCache clusters, which provide in-memory data caching services, are a key component of many cloud architectures. Tracking changes in your ElastiCache node count and types within these clusters is essential for several reasons:
- Resource Optimization: Monitoring node counts and types helps you identify underutilized or overprovisioned clusters, allowing you to adjust resources and optimize costs.
- Performance Management: Changes in node count and types can impact the performance of your applications. Tracking these changes helps ensure your clusters are properly scaled to meet demand.
- Historical Analysis: By tracking node count and type changes over time, you can analyze trends and predict future resource requirements.
With CloudQuery, you can easily export your AWS data to a PostgreSQL database (or any other data destination), enabling comprehensive analysis and visualization. Ensure that the
aws_elasticache_clusters
table is included in your AWS source configuration. Set the write_mode
to append
in your PostgreSQL destination configuration to track historical states and changes in the number of nodes and types over time.How to Track ElasticCache Node Count Changes #
The SQL query below tracks changes in ElasticCachenode count over time for clusters that are part of a replication group, excluding clusters with constant node counts. By focusing on clusters with varying node counts, it helps identify those that require attention due to resource adjustments. The query uses Common Table Expressions (CTEs) to organize the data and identify clusters with significant changes.
-- CTE: timeseriesdata
with timeseriesdata as (
select
_cq_sync_time as time,
replication_group_id,
region,
account_id,
count(*) as node_count
from
aws_elasticache_clusters
where
replication_group_id != '' or replication_group_id is not null
group by
time,
account_id,
region,
replication_group_id
),
-- CTE: sum_changes
sum_changes as (
select
sum(timeseriesdata.node_count) as count,
account_id,
region,
replication_group_id,
time
from
timeseriesdata
group by
time,
account_id,
region,
replication_group_id
),
-- CTE: identify_changes
identify_changes as (
select
replication_group_id
from
sum_changes
group by account_id, region, replication_group_id
having
count(distinct count) > 1
)
-- Final Selection
select
timeseriesdata.*
from
timeseriesdata
where
timeseriesdata.replication_group_id in (select distinct replication_group_id from identify_changes)
order by
time;
This query identifies AWS ElastiCache clusters with node count changes over time, excluding those with constant counts. By tracking these changes, users can optimize resources, manage costs, and ensure their clusters are properly scaled for performance.
Steps:
- CTE: timeseriesdata
- This CTE aggregates the number of nodes in each cluster over time.
- It counts the number of nodes (node_count) in each cluster and groups the data by time, account_id, region, and replication_group_id.
- The where clause ensures that only clusters with a valid replication_group_id are considered, ensuring that we focus on clusters that are part of a replication group.
- CTE: sum_changes
- This CTE calculates the total node count for each cluster at each point in time.
- It sums the node_count from the timeseriesdata CTE and groups the results by time, account_id, region, and replication_group_id.
- CTE: identify_changes
- This CTE identifies clusters that have experienced changes in node count.
- It groups the data by account_id, region, and replication_group_id.
- The having clause ensures that only clusters with more than one distinct node count over time are selected, indicating a change in node count.
- Final Selection
- The final select statement retrieves all columns from the timeseriesdata CTE.
- It filters the results to include only the clusters identified in the identify_changes CTE.
- The order by clause sorts the results by time to provide a chronological view of the node count changes.
Example Result #
time | replication_group_id | region | account_id | node_count |
---|---|---|---|---|
2024-07-01 00:00:00 | rep_group_3 | us-east-1 | account_1 | 2 |
2024-07-01 00:00:00 | rep_group_2 | us-east-1 | account_1 | 1 |
2024-07-01 00:00:00 | rep_group_1 | us-east-1 | account_1 | 2 |
2024-07-01 01:00:00 | rep_group_2 | us-east-1 | account_1 | 2 |
2024-07-01 01:00:00 | rep_group_1 | us-east-1 | account_1 | 2 |
2024-07-01 02:00:00 | rep_group_1 | us-east-1 | account_1 | 1 |
2024-07-02 00:00:00 | rep_group_3 | us-east-1 | account_1 | 1 |
2024-07-02 00:00:00 | rep_group_1 | us-east-1 | account_1 | 1 |
2024-07-02 00:00:00 | rep_group_2 | us-east-1 | account_1 | 1 |
2024-07-02 01:00:00 | rep_group_1 | us-east-1 | account_1 | 1 |
2024-07-02 01:00:00 | rep_group_3 | us-east-1 | account_1 | 1 |
2024-07-02 01:00:00 | rep_group_2 | us-east-1 | account_1 | 1 |
2024-07-02 02:00:00 | rep_group_2 | us-east-1 | account_1 | 1 |
2024-07-02 02:00:00 | rep_group_1 | us-east-1 | account_1 | 1 |
The table above shows the historical data for clusters with changes in node count. Each row represents a record of the node count for a specific
replication_group_id
at a given time. This allows you to see how the node count has varied over time for each replication group, helping you identify trends and make informed decisions about resource allocation.How to Track New ElasticCache Node Types Each Week #
Another useful query tracks changes in the types of nodes each week, showing only the new types of nodes introduced each week compared to the previous week. Again, use
write_mode: append
to save the historical states.-- CTE: cluster_weeks
with cluster_weeks as (
select
date_trunc('week', _cq_sync_time) as week,
cache_node_type,
region,
account_id
from
aws_elasticache_clusters
where
cache_node_type != ''
group by
week,
cache_node_type,
region,
account_id
),
-- CTE: new_types
new_types as (
select
current_week.week as current_week,
current_week.cache_node_type,
current_week.region,
current_week.account_id
from
cluster_weeks current_week
left join
cluster_weeks previous_week
on
previous_week.week = current_week.week - interval '1 week'
and previous_week.cache_node_type = current_week.cache_node_type
and previous_week.region = current_week.region
and previous_week.account_id = current_week.account_id
where
previous_week.cache_node_type is null
)
-- Final Selection
select
current_week,
cache_node_type,
region,
account_id
from
new_types
order by
current_week, cache_node_type, region, account_id;
This query identifies new node types introduced each week for AWS ElastiCache clusters. By tracking these changes, users can identify newly introduced node types and manage resources more effectively.
- CTE: cluster_weeks
- This CTE aggregates the node types for each cluster by week.
- It selects the synchronization time (_cq_sync_time) truncated to the start of - the week (week), along with cache_node_type, region, and account_id.
- It groups the data by week, cache_node_type, region, and account_id.
- CTE: new_types
- This CTE identifies the new node types introduced each week by comparing the current week to the previous week.
- It selects data from cluster_weeks for the current week (current_week) and performs a left join with cluster_weeks for the previous week (previous_week) based on week, cache_node_type, region, and account_id.
- The where clause ensures that only node types not present in the previous week are included.
- Final Selection
- The final select statement retrieves the current_week, cache_node_type, region, and account_id from the new_types CTE.
- The order by clause sorts the results by current_week, cache_node_type, region, and account_id.
Example Result #
current_week | cache_node_type | region | account_id |
---|---|---|---|
2024-01-01 00:00:00 | cache.m5.large | us-east-1 | account2 |
2024-01-01 00:00:00 | cache.t2.micro | us-west-1 | account1 |
2024-01-08 00:00:00 | cache.r5.large | us-west-2 | account3 |
2024-01-15 00:00:00 | cache.t3.medium | us-west-1 | account1 |
The table above displays the new types of cache nodes introduced each week. Each row represents a new node type detected for a specific week, region, and account. This helps you identify changes in the types of resources being used, allowing for better tracking and management of resource updates and optimizations.
How to Track New ElastiCache Node Types by Day #
If you want to check the introduction of new node types on a daily basis rather than weekly, you can use the following query. This provides a more granular view of changes in your ElastiCache clusters, helping you monitor and react to changes more promptly.
WITH cluster_days AS (
SELECT
date_trunc('day', _cq_sync_time) AS day,
cache_node_type,
region,
account_id
FROM
aws_elasticache_clusters
WHERE
cache_node_type != '' -- Filter out empty cluster types
GROUP BY
day,
cache_node_type,
region,
account_id
),
new_types AS (
SELECT
current_day.day AS current_day,
current_day.cache_node_type,
current_day.region,
current_day.account_id
FROM
cluster_days current_day
LEFT JOIN
cluster_days previous_day
ON
previous_day.day = current_day.day - interval '1 day'
AND previous_day.cache_node_type = current_day.cache_node_type
AND previous_day.region = current_day.region
AND previous_day.account_id = current_day.account_id
WHERE
previous_day.cache_node_type IS NULL -- Identify new types
)
SELECT
current_day,
cache_node_type,
region,
account_id
FROM
new_types
ORDER BY
current_day, cache_node_type, region, account_id;
Summary #
Tracking ElastiCache node count and type changes in AWS ElastiCache clusters is crucial for optimizing resource management, improving performance, and conducting historical analysis. By using CloudQuery to export your AWS data to a PostgreSQL database, you can efficiently analyze and visualize these changes. If you want to gain immediate insights into your AWS environment, try CloudQuery for free) or connect with our engineering team for further assistance).