EC2 Autoscaling – AWS Technologies Blog

EC2 Auto Scaling is a service in Amazon Web Services (AWS) that automatically adjusts the number of EC2 instances in an Auto Scaling group based on your specified policies. The goal of EC2 Auto Scaling is to ensure that you have the right number of EC2 instances running to handle the load for your application, while also minimizing costs by scaling down when demand is low.

Auto Scaling Group

A group of EC2 instances that Auto Scaling manages. When you create an Auto Scaling group, you must specify a launch configuration or a launch template.

Launch Configurations

An instance configuration template that an Auto Scaling group (ASG) uses to launch instances. It specifies the AMI ID, instance type, key pair, and one or more security groups.

It cannot be modified after creation, and cannot be used to launch instances manually.

Launch Templates

Similar to launch configurations, but versioned: instead of editing a template in place, you create a new version, so the configuration can evolve after creation.

It includes all the same parameters as Launch Configurations.

A launch template can also be used to launch individual EC2 instances or a Spot Fleet.
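
To make the relationship between a group and its launch template concrete, here is a minimal sketch that builds the request parameters for creating an Auto Scaling group. The group name, template name, and subnet IDs are hypothetical placeholders, and the size defaults are arbitrary assumptions chosen for illustration.

```python
def build_asg_request(group_name, template_name, subnet_ids,
                      min_size=1, max_size=4, desired=2, version="$Latest"):
    """Build keyword arguments for creating an Auto Scaling group from a launch template."""
    # The desired capacity has to sit inside the group's size limits.
    if not (min_size <= desired <= max_size):
        raise ValueError("desired capacity must lie between min_size and max_size")
    return {
        "AutoScalingGroupName": group_name,
        # "$Latest" follows the newest template version; a fixed number pins one.
        "LaunchTemplate": {"LaunchTemplateName": template_name, "Version": version},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets are passed as a single comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


# Hypothetical names, for illustration only.
request = build_asg_request("web-asg", "web-template", ["subnet-aaa", "subnet-bbb"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("autoscaling").create_auto_scaling_group(**request)`.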

Configuring Scaling Policies

Scaling policies

Manual Scaling

You can manually adjust the number of EC2 instances in your Auto Scaling group at any time. This process of changing the instance count manually is referred to as manual scaling. Manual scaling is an alternative to auto scaling, especially if you want to make one-time capacity changes.

Scheduled scaling

With scheduled scaling, you can set up automatic scaling for your application based on predictable load changes. You create scheduled actions that increase or decrease your group’s desired capacity at specific times.

For example, suppose you experience a regular weekly traffic pattern where load increases midweek and declines toward the end of the week. You can configure a scaling schedule in Amazon EC2 Auto Scaling that aligns with this pattern:

On Wednesday morning, one scheduled action increases capacity by increasing the previously set desired capacity of the Auto Scaling group.
On Friday evening, another scheduled action decreases capacity by decreasing the previously set desired capacity of the Auto Scaling group.
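
The weekly pattern above could be expressed as two recurring scheduled actions. This is a sketch only: the group name, action names, times of day, and capacity values are assumptions made for illustration. The `Recurrence` field uses Unix cron syntax (minute, hour, day of month, month, day of week), evaluated in UTC unless a time zone is set.

```python
def scheduled_action(group_name, action_name, cron, desired):
    """Build keyword arguments for one recurring scheduled action."""
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": action_name,
        "Recurrence": cron,          # Unix cron expression
        "DesiredCapacity": desired,  # absolute capacity to set at that time
    }


# Hypothetical schedule: scale out Wednesday 08:00, scale in Friday 18:00.
weekly_schedule = [
    scheduled_action("web-asg", "midweek-scale-out", "0 8 * * 3", 6),
    scheduled_action("web-asg", "weekend-scale-in", "0 18 * * 5", 2),
]
```

Each dictionary could be passed to boto3's `put_scheduled_update_group_action(**action)` on an Auto Scaling client. Note that a scheduled action sets an absolute desired capacity, so "increasing" here means choosing a value higher than the one previously in effect.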

Dynamic scaling

a) Simple scaling

Whenever the metric rises above the threshold, Auto Scaling simply increases the desired capacity by the configured adjustment.

Adjustment types for simple (and step) scaling:

PercentChangeInCapacity — Increment or decrement the current capacity of the group by the specified percentage. A positive value increases the capacity and a negative value decreases the capacity. For example: If the current capacity is 10 and the adjustment is 10 percent, then when this policy is performed, we add 1 capacity unit to the capacity for a total of 11 capacity units.

ChangeInCapacity — Increment or decrement the current capacity of the group by the specified value. A positive value increases the capacity and a negative adjustment value decreases the capacity. For example: If the current capacity of the group is 3 and the adjustment is 5, then when this policy is performed, we add 5 capacity units to the capacity for a total of 8 capacity units.

ExactCapacity — Change the current capacity of the group to the specified value. Specify a non-negative value with this adjustment type. For example: If the current capacity of the group is 3 and the adjustment is 5, then when this policy is performed, we change the capacity to 5 capacity units.
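
The three adjustment types can be sketched as a small function that reproduces the worked examples above. The function name and the `min_size`/`max_size` defaults are hypothetical. The rounding of fractional percentage changes (truncate toward zero, but move by at least one unit for any nonzero change) reflects my reading of the documented behavior and should be treated as an assumption.

```python
import math


def apply_adjustment(adjustment_type, current, adjustment, min_size=0, max_size=100):
    """Return the new capacity after applying one scaling adjustment."""
    if adjustment_type == "ChangeInCapacity":
        target = current + adjustment
    elif adjustment_type == "ExactCapacity":
        if adjustment < 0:
            raise ValueError("ExactCapacity requires a non-negative value")
        target = adjustment
    elif adjustment_type == "PercentChangeInCapacity":
        delta = current * adjustment / 100
        if delta == 0:
            change = 0
        elif abs(delta) < 1:
            # Assumed rule: a small nonzero change still moves one unit.
            change = 1 if delta > 0 else -1
        else:
            # Assumed rule: larger changes are truncated toward zero.
            change = math.trunc(delta)
        target = current + change
    else:
        raise ValueError(f"unknown adjustment type: {adjustment_type}")
    # The result never leaves the group's size limits.
    return max(min_size, min(max_size, target))
```

For instance, `apply_adjustment("PercentChangeInCapacity", 10, 10)` gives 11, matching the first example, and `apply_adjustment("ChangeInCapacity", 3, 5)` gives 8.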

b) Step scaling

Use step scaling when you want the size of the adjustment to depend on how far the aggregate metric exceeds the threshold.

Each step adjustment consists of the following:

a lower bound
an upper bound
the adjustment type
the amount to increase/decrease the desired capacity.

For step scaling, you can optionally specify the number of seconds that it takes for a newly launched instance to warm up. Until its specified warmup time has expired, an instance is not counted toward the aggregated EC2 instance metrics of the Auto Scaling group.
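
As an illustration of how step adjustments are selected, the sketch below picks the adjustment whose interval contains the size of the breach. It covers the scale-out case only, treats bounds as offsets from the alarm threshold, and assumes the lower bound is inclusive and the upper bound exclusive. The function name, the tuple layout, and the example steps are all hypothetical.

```python
def select_step_adjustment(metric_value, threshold, steps):
    """Pick the adjustment for the step whose interval contains the breach size.

    steps: list of (lower_bound, upper_bound, adjustment) tuples, with bounds
    given as offsets from the threshold; None means unbounded.
    """
    breach = metric_value - threshold
    for lower, upper, adjustment in steps:
        above_lower = lower is None or breach >= lower
        below_upper = upper is None or breach < upper
        if above_lower and below_upper:
            return adjustment
    # No step matched, so the capacity is left unchanged.
    return 0


# Hypothetical policy with a CPU threshold of 50 percent:
# 50-60 adds 1 instance, 60-70 adds 2, 70 and above adds 3.
scale_out_steps = [(0, 10, 1), (10, 20, 2), (20, None, 3)]
```

With these steps, a metric value of 55 selects an adjustment of 1, 60 selects 2, and 75 selects 3. How the selected number is applied depends on the policy's adjustment type.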

c) Target tracking scaling

Select a metric and a target value. Auto Scaling creates and manages the CloudWatch alarms for the policy and adjusts the number of instances to keep the metric near the target.
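
A policy of this kind needs little more than the metric and the target value. The sketch below builds the request parameters for one that tracks average CPU utilization. The group name, policy name, and the 50 percent target are assumptions for illustration.

```python
def target_tracking_policy(group_name, policy_name, target_value,
                           metric_type="ASGAverageCPUUtilization"):
    """Build keyword arguments for a target tracking scaling policy."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": policy_name,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            # A predefined metric: average CPU across the group's instances.
            "PredefinedMetricSpecification": {"PredefinedMetricType": metric_type},
            "TargetValue": float(target_value),
        },
    }


# Hypothetical policy: keep average CPU near 50 percent.
cpu_policy = target_tracking_policy("web-asg", "keep-cpu-near-50", 50)
```

The resulting dictionary could be passed to boto3's `put_scaling_policy(**cpu_policy)` on an Auto Scaling client. No adjustment type or step list is needed, because the service works out the size of each change itself.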

d) Scaling cooldowns

After your Auto Scaling group launches or terminates instances, it waits for a cooldown period to end before any further scaling activities initiated by simple scaling policies can start. The intention of the cooldown period is to let your Auto Scaling group stabilize and prevent it from launching or terminating additional instances before the effects of the previous scaling activity are visible.

The default cooldown period for an Auto Scaling group is 300 seconds (5 minutes).
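
The effect of a cooldown can be illustrated with a toy model that decides which alarm triggers actually start a scaling activity. It makes a simplifying assumption: every scaling activity completes instantly, so the cooldown starts at the moment of the trigger. Times are plain seconds, and the function name is hypothetical.

```python
DEFAULT_COOLDOWN_SECONDS = 300


def accepted_triggers(alarm_times, cooldown=DEFAULT_COOLDOWN_SECONDS):
    """Return the alarm times that would start a scaling activity.

    A trigger is ignored if it arrives before the cooldown that followed
    the previously accepted trigger has ended.
    """
    accepted = []
    for t in sorted(alarm_times):
        if not accepted or t - accepted[-1] >= cooldown:
            accepted.append(t)
    return accepted
```

For alarms at 0, 60, 200, 300, 450, and 700 seconds, only those at 0, 300, and 700 lead to scaling with the default cooldown. The others arrive while the group is still stabilizing.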

Predictive scaling

Predictive scaling works by analyzing historical load data to detect daily or weekly patterns in traffic flows. It uses this information to forecast future capacity needs so Amazon EC2 Auto Scaling can proactively increase the capacity of your Auto Scaling group to match the anticipated load.

Predictive scaling is well suited for situations where you have:

Cyclical traffic, such as high use of resources during regular business hours and low use of resources during evenings and weekends
Recurring on-and-off workload patterns, such as batch processing, testing, or periodic data analysis
Applications that take a long time to initialize, causing a noticeable latency impact on application performance during scale-out events
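
To give a feel for forecasting from a recurring pattern, here is a deliberately simple sketch that averages historical load for the same hour of the week and converts it to an instance count. This is not how the AWS service computes its forecasts. The function name, the sample data format, and the per-instance load figure are all assumptions for illustration.

```python
import math
from collections import defaultdict


def forecast_capacity(history, hour_of_week, load_per_instance):
    """Estimate the instances needed for a given hour of the week.

    history: list of (hour_of_week, observed_load) samples.
    Returns None when there is no history for that hour.
    """
    by_hour = defaultdict(list)
    for hour, load in history:
        by_hour[hour].append(load)
    samples = by_hour.get(hour_of_week)
    if not samples:
        return None
    expected_load = sum(samples) / len(samples)
    # Round up so the forecast load fits, and keep at least one instance.
    return max(1, math.ceil(expected_load / load_per_instance))


# Hypothetical samples: busy at hour 9, quiet at hour 22.
history = [(9, 900), (9, 1100), (22, 150), (22, 250)]
```

With each instance assumed to handle 250 units of load, the sketch suggests 4 instances for hour 9 and 1 for hour 22. Capacity could then be raised ahead of the busy hour rather than in reaction to it.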