Step scaling policy for ECS service based on CPU consumption

Nathan Peck
Senior Developer Advocate at AWS

About

Auto scaling helps your services stay online when traffic increases unexpectedly. On both EC2 and AWS Fargate you can configure Amazon ECS to automatically increase and decrease the number of copies of your application container that are running in the cluster.

Architecture

This is how auto scaling works:

[Diagram: Your Container → ECS Agent → Telemetry → Amazon CloudWatch → AWS Application Auto Scaling → Amazon Elastic Container Service (Amazon ECS) → More copies of your container launched]

  1. Your application container uses CPU, memory, and other computing resources
  2. An ECS agent running on the same EC2 instance or AWS Fargate task gathers telemetry about your application container's resource usage
  3. Telemetry is stored in Amazon CloudWatch metrics
  4. AWS Application Auto Scaling triggers scaling rules based on CloudWatch metrics
  5. Amazon ECS receives an UpdateService call from AWS Application Auto Scaling, which adjusts the desired count for the service
  6. Amazon ECS launches additional copies of your application container on EC2 or AWS Fargate, or scales in the service to reduce the number of copies when utilization is low
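
Steps 5 and 6 come down to simple arithmetic on the service's desired count. The following is a minimal sketch, not the actual AWS implementation: the function name `next_desired_count` is hypothetical, and the default bounds of 2 and 10 mirror the `MinCapacity` and `MaxCapacity` values used in the template on this page. It assumes a `ChangeInCapacity` adjustment whose result is kept within the scalable target's bounds.

```python
def next_desired_count(current: int, adjustment: int,
                       min_capacity: int = 2, max_capacity: int = 10) -> int:
    """Apply a ChangeInCapacity adjustment, then clamp the result
    to the scalable target's minimum and maximum capacity."""
    return max(min_capacity, min(max_capacity, current + adjustment))


# Scale out by 3 tasks from 4 running tasks.
print(next_desired_count(4, 3))
# Scale out by 3 tasks from 9: the result is capped at max_capacity.
print(next_desired_count(9, 3))
# Scale in by 1 task from 2: the result cannot drop below min_capacity.
print(next_desired_count(2, -1))
```

The clamping is why a service at its maximum capacity stays there even while the high CPU alarm keeps firing.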

CloudFormation Template

The following template automatically sets up CloudWatch alarms and auto scaling policies, and attaches them to an ECS service.

File: scale-service-by-cpu.yml
Language: yml
AWSTemplateFormatVersion: '2010-09-09'
Description: Add autoscaling rules that scale an ECS service based on CPU utilization
Parameters:
  ClusterName:
    Type: String
    Default: default
    Description: The cluster that is running the service you want to scale
  ServiceName:
    Type: String
    Default: nginx
    Description: The name of the service to scale

Resources:

  # Role that Application Auto Scaling will use to interact with
  # CloudWatch and Amazon ECS
  AutoscalingRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
        - Effect: Allow
          Principal:
            Service: [application-autoscaling.amazonaws.com]
          Action: ['sts:AssumeRole']
      Path: /
      Policies:
      - PolicyName: service-autoscaling
        PolicyDocument:
          Statement:
          - Effect: Allow
            Action:
              - 'application-autoscaling:*'
              - 'cloudwatch:DescribeAlarms'
              - 'cloudwatch:PutMetricAlarm'
              - 'ecs:DescribeServices'
              - 'ecs:UpdateService'
            Resource: '*'

  # Enable autoscaling for the service
  ScalableTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    Properties:
      ServiceNamespace: 'ecs'
      ScalableDimension: 'ecs:service:DesiredCount'
      ResourceId: !Sub 'service/${ClusterName}/${ServiceName}'
      MinCapacity: 2
      MaxCapacity: 10
      RoleARN: !GetAtt AutoscalingRole.Arn

  # Create scaling policies that describe how to scale the service up and down.
  ScaleDownPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    DependsOn: ScalableTarget
    Properties:
      PolicyName: !Sub scale-${ClusterName}-${ServiceName}-down
      PolicyType: StepScaling
      ResourceId: !Sub 'service/${ClusterName}/${ServiceName}'
      ScalableDimension: 'ecs:service:DesiredCount'
      ServiceNamespace: 'ecs'
      StepScalingPolicyConfiguration:
        AdjustmentType: 'ChangeInCapacity'
        StepAdjustments:
          - MetricIntervalUpperBound: 0
            ScalingAdjustment: -1
        MetricAggregationType: 'Average'
        Cooldown: 60

  ScaleUpPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    DependsOn: ScalableTarget
    Properties:
      PolicyName: !Sub scale-${ClusterName}-${ServiceName}-up
      PolicyType: StepScaling
      ResourceId: !Sub 'service/${ClusterName}/${ServiceName}'
      ScalableDimension: 'ecs:service:DesiredCount'
      ServiceNamespace: 'ecs'
      StepScalingPolicyConfiguration:
        AdjustmentType: 'ChangeInCapacity'
        StepAdjustments:
          - MetricIntervalLowerBound: 0
            MetricIntervalUpperBound: 15
            ScalingAdjustment: 1
          - MetricIntervalLowerBound: 15
            MetricIntervalUpperBound: 25
            ScalingAdjustment: 2
          - MetricIntervalLowerBound: 25
            ScalingAdjustment: 3
        MetricAggregationType: 'Average'
        Cooldown: 60

  # Create alarms to trigger the scaling policies
  LowCpuUsageAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: !Sub low-cpu-${ClusterName}-${ServiceName}
      AlarmDescription: !Sub "Low CPU utilization for service ${ServiceName} in cluster ${ClusterName}"
      MetricName: CPUUtilization
      Namespace: AWS/ECS
      Dimensions:
        - Name: ServiceName
          Value: !Ref 'ServiceName'
        - Name: ClusterName
          Value: !Ref 'ClusterName'
      Statistic: Average
      Period: 60
      EvaluationPeriods: 1
      Threshold: 20
      ComparisonOperator: LessThanOrEqualToThreshold
      AlarmActions:
        - !Ref ScaleDownPolicy

  HighCpuUsageAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: !Sub high-cpu-${ClusterName}-${ServiceName}
      AlarmDescription: !Sub "High CPU utilization for service ${ServiceName} in cluster ${ClusterName}"
      MetricName: CPUUtilization
      Namespace: AWS/ECS
      Dimensions:
        - Name: ServiceName
          Value: !Ref 'ServiceName'
        - Name: ClusterName
          Value: !Ref 'ClusterName'
      Statistic: Average
      Period: 60
      EvaluationPeriods: 1
      Threshold: 70
      ComparisonOperator: GreaterThanOrEqualToThreshold
      AlarmActions:
        - !Ref ScaleUpPolicy

The template requires the following input parameters:

  • ClusterName - The name of the ECS cluster that runs the service you would like to scale
  • ServiceName - The name of the service you want to scale
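
The two parameters are substituted into several names inside the template through `!Sub`. As a rough illustration, the sketch below reproduces those substitutions in Python. The helper `derived_names` is hypothetical and exists only for this example; the name patterns are copied from the template.

```python
def derived_names(cluster_name: str, service_name: str) -> dict:
    """Mirror the !Sub expressions used in scale-service-by-cpu.yml."""
    return {
        # ResourceId of the ScalableTarget and both scaling policies
        "resource_id": f"service/{cluster_name}/{service_name}",
        # PolicyName of ScaleUpPolicy and ScaleDownPolicy
        "scale_up_policy": f"scale-{cluster_name}-{service_name}-up",
        "scale_down_policy": f"scale-{cluster_name}-{service_name}-down",
        # AlarmName of HighCpuUsageAlarm and LowCpuUsageAlarm
        "high_cpu_alarm": f"high-cpu-{cluster_name}-{service_name}",
        "low_cpu_alarm": f"low-cpu-{cluster_name}-{service_name}",
    }


names = derived_names("development", "my-web-service")
for key, value in names.items():
    print(f"{key}: {value}")
```

Because the cluster and service names are part of every policy and alarm name, the same template can be deployed once per service without name collisions.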

Things to note in this template:

  • HighCpuUsageAlarm.Properties.MetricName - The metric name to scale on. This is scaling based on CPU utilization.
  • HighCpuUsageAlarm.Properties.Threshold - The CPU utilization threshold at which to start applying scaling policies. In this case it is set to 70% to provide some headroom for small deployments to absorb spikes of incoming traffic. The larger your service is, the closer you can push this to 100%.
  • ScaleUpPolicy.Properties.StepScalingPolicyConfiguration - This controls how fast to scale up based on how far out of bounds the metric is. The further CPU rises above the target utilization, the faster ECS will launch additional tasks to try to bring the CPU utilization back in bounds.
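
The step bounds in `ScaleUpPolicy` are offsets from the alarm threshold, not absolute CPU percentages. The sketch below works through that mapping with the template's values (threshold 70, steps at +0, +15, and +25). It is an illustration rather than the service's real code: the names `Step` and `scale_up_adjustment` are hypothetical, and it assumes the documented step scaling convention that, above the threshold, the lower bound is inclusive and the upper bound is exclusive.

```python
from typing import NamedTuple, Optional


class Step(NamedTuple):
    lower: Optional[float]  # MetricIntervalLowerBound (None = no lower bound)
    upper: Optional[float]  # MetricIntervalUpperBound (None = no upper bound)
    adjustment: int         # ScalingAdjustment


SCALE_UP_THRESHOLD = 70  # HighCpuUsageAlarm.Properties.Threshold
SCALE_UP_STEPS = [
    Step(0, 15, 1),     # 70% <= CPU < 85%  -> add 1 task
    Step(15, 25, 2),    # 85% <= CPU < 95%  -> add 2 tasks
    Step(25, None, 3),  # CPU >= 95%        -> add 3 tasks
]


def scale_up_adjustment(cpu: float) -> int:
    """Return how many tasks the scale-up policy would add for a CPU value."""
    breach = cpu - SCALE_UP_THRESHOLD  # distance above the alarm threshold
    for step in SCALE_UP_STEPS:
        above_lower = step.lower is None or breach >= step.lower
        below_upper = step.upper is None or breach < step.upper
        if above_lower and below_upper:
            return step.adjustment
    return 0  # below the threshold: this policy does not apply


for cpu in (60, 72, 85, 94, 95):
    print(cpu, scale_up_adjustment(cpu))
```

With these values, a service running at 72% average CPU gains one task per scaling action, while a service pinned at 95% or above gains three.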

TIP

This example CloudFormation template scales based on CPU utilization, which is the right metric for almost all application frameworks. Be careful about scaling based on memory utilization: with most application runtimes, memory usage is not correlated with load. Most applications don't release memory after load decreases. Instead they keep the memory allocated in case they need it again. As a result, a service that scales on memory utilization may scale out but never scale back in.

Usage

You can deploy the template via the AWS CloudFormation web console, or by running an AWS CLI command similar to this:

Language: shell
aws cloudformation deploy \
   --stack-name scale-my-service-name \
   --template-file scale-service-by-cpu.yml \
   --capabilities CAPABILITY_IAM \
   --parameter-overrides ClusterName=development ServiceName=my-web-service

Cleanup

You can delete the auto scaling configuration by tearing down the CloudFormation stack with:

Language: shell
aws cloudformation delete-stack --stack-name scale-my-service-name

See Also