AWS Compute Optimizer

A service that uses machine learning to analyze CloudWatch metrics and provide rightsizing recommendations for EC2, EBS, Lambda, and ECS resources

Overview

AWS Compute Optimizer applies machine learning models to CloudWatch metrics such as CPU utilization, memory utilization, and network I/O, and provides rightsizing recommendations for EC2 instances, EBS volumes, Lambda functions, and ECS tasks. It flags over-provisioned resources as cost reduction opportunities and also warns about the risk of performance degradation on under-provisioned ones. It additionally provides Savings Plans and Reserved Instances purchase recommendations, with estimated savings relative to On-Demand pricing.

How ML Models Derive Optimal Configurations from Historical Metrics

Compute Optimizer's recommendations come from machine learning models that take CloudWatch metric time series as input. By default it analyzes the past 14 days of metrics. Enabling Enhanced Infrastructure Metrics extends the lookback window to up to 93 days, so recommendations can account for long-term load patterns such as month-end batch processing or quarterly financial spikes. For EC2 instances, it evaluates not just CPU utilization but also memory utilization (available only when the CloudWatch Agent is publishing it), disk I/O, and network bandwidth, then suggests instance types that maintain equivalent or better performance at lower cost. For example, if an m5.xlarge consistently shows CPU utilization below 15% and its other metrics are similarly low, it may recommend downsizing to m5.large and display the estimated savings. Each resource is labeled with one of three findings (Over-provisioned, Under-provisioned, or Optimized), which makes it easy to prioritize which resources to address first.
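The prioritization and savings arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not Compute Optimizer's API or its internal logic: the `Recommendation` class, the resource IDs, and the triage order (under-provisioned first, then over-provisioned by savings) are hypothetical, and the hourly prices are assumed On-Demand rates that should be checked against current pricing for your region.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365  # 8760; ignores leap years


@dataclass
class Recommendation:
    """Hypothetical record; not the shape of any AWS API response."""
    resource_id: str
    finding: str                   # "Over-provisioned", "Under-provisioned", or "Optimized"
    current_type: str
    recommended_type: str
    current_hourly_usd: float      # assumed On-Demand price
    recommended_hourly_usd: float  # assumed On-Demand price


def annual_savings(rec: Recommendation) -> float:
    """Yearly cost difference if the instance runs 24/7 at the assumed prices."""
    delta = rec.current_hourly_usd - rec.recommended_hourly_usd
    return round(delta * HOURS_PER_YEAR, 2)


def triage(recs: list[Recommendation]) -> list[Recommendation]:
    """Order work items: performance risks first, then largest savings.

    Optimized resources need no action and are dropped.
    """
    under = [r for r in recs if r.finding == "Under-provisioned"]
    over = sorted(
        (r for r in recs if r.finding == "Over-provisioned"),
        key=annual_savings,
        reverse=True,
    )
    return under + over


# Hypothetical sample data; prices are assumptions for illustration.
recs = [
    Recommendation("i-web", "Over-provisioned", "m5.xlarge", "m5.large", 0.192, 0.096),
    Recommendation("i-db", "Under-provisioned", "m5.large", "m5.xlarge", 0.096, 0.192),
    Recommendation("i-batch", "Optimized", "m5.large", "m5.large", 0.096, 0.096),
]

for rec in triage(recs):
    print(rec.resource_id, rec.finding, annual_savings(rec))
```

Under these assumed prices, halving an m5.xlarge to an m5.large saves the hourly difference multiplied by 8,760 hours. A negative figure for an under-provisioned resource is the added cost of fixing the performance risk.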

Often-Overlooked Points in Lambda and ECS Rightsizing

Compute Optimizer provides recommendations not only for EC2 but also for Lambda function memory settings and ECS on Fargate task sizes. Lambda allocates CPU power in proportion to configured memory, so for CPU-bound workloads, raising the memory setting can shorten execution time even when memory usage is low. If the duration drops by more than the per-millisecond price rises, both latency and cost improve. Compute Optimizer accounts for this characteristic and suggests memory sizes based on both execution duration and memory usage. For ECS tasks, if actual utilization of the vCPU and memory combination configured for Fargate is low, it recommends switching to a smaller task size. Azure offers similar resource optimization through Azure Advisor's cost recommendations, though Azure Advisor relies more on threshold-based assessments than on the ML-based analysis of long-term metrics that Compute Optimizer uses.
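The Lambda trade-off can be made concrete with a small cost model. The sketch below assumes that duration charges are memory (GB) times billed duration (seconds) times a per-GB-second price, and it ignores the per-request charge because that is the same for every memory setting. The price constant is an assumed x86 rate, and the measured durations are hypothetical numbers, not output from Compute Optimizer.

```python
# Assumed x86 price in USD per GB-second; check current pricing for your region.
PRICE_PER_GB_SECOND = 0.0000166667


def invocation_cost(memory_mb: int, duration_ms: float,
                    price: float = PRICE_PER_GB_SECOND) -> float:
    """Duration charge for one invocation (request charge excluded)."""
    return (memory_mb / 1024) * (duration_ms / 1000) * price


def cheapest(configs: list[tuple[int, float]]) -> tuple[int, float]:
    """Return the (memory_mb, duration_ms) pair with the lowest duration charge."""
    return min(configs, key=lambda c: invocation_cost(*c))


# Hypothetical measurements for a CPU-bound function: (memory in MB, average duration in ms).
measurements = [(512, 800.0), (1024, 350.0), (2048, 300.0)]

for memory_mb, duration_ms in measurements:
    cost = invocation_cost(memory_mb, duration_ms)
    print(f"{memory_mb} MB, {duration_ms} ms: ${cost:.10f} per invocation")

print("cheapest:", cheapest(measurements))
```

In this made-up data, going from 512 MB to 1024 MB cuts the duration by more than half, so the invocation is both faster and cheaper. Going on to 2048 MB barely shortens it further, so the cost rises again.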

Savings Plans Recommendations and Organization-Wide Rollout

Compute Optimizer provides not only individual resource sizing recommendations but also Savings Plans purchase recommendations. It calculates a Savings Plans commitment amount for 1-year or 3-year terms based on historical usage patterns and displays the discount rate compared to On-Demand pricing. When integrated with AWS Organizations, you can view recommendations for all member accounts from the management account, eliminating the need to opt in account by account. A practical caveat is that recommendations are predictions based on historical metrics. They do not account for future workload changes such as traffic increases from new feature launches or resource reductions from service decommissioning. Before applying a recommendation, have the team review upcoming roadmap items and planned architecture changes to confirm that it is still appropriate. After applying it, the safest practice is a staged approach: monitor performance metrics on CloudWatch dashboards for 1-2 weeks before proceeding to the next resource.
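The staged approach can be backed by a simple gate that decides whether to move on to the next resource. The following is a minimal sketch under stated assumptions: the samples stand in for CPU utilization values exported from CloudWatch during the observation window, and the 80% ceiling on the 95th percentile is an arbitrary illustrative threshold rather than an AWS recommendation.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def safe_to_proceed(cpu_samples: list[float], p95_ceiling: float = 80.0) -> bool:
    """True if p95 CPU after the change stays at or below the ceiling.

    The default ceiling is an assumption; choose one that fits the workload.
    """
    return percentile(cpu_samples, 95) <= p95_ceiling


# Hypothetical CPU utilization samples (%) collected after downsizing.
healthy = [30, 35, 42, 38, 55, 61, 47, 33, 29, 58]
strained = [70, 85, 92, 88, 75]

print("healthy:", safe_to_proceed(healthy))
print("strained:", safe_to_proceed(strained))
```

In practice the same check would cover memory, latency, and error-rate metrics as well. A failed check is a signal to roll back or pick a larger size before touching the next resource.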
