Amazon ECS cluster with isolated VPC and no NAT Gateway

Nathan Peck
Senior Developer Advocate at AWS

Terminology

Amazon Elastic Container Service (ECS) is a serverless orchestrator that manages container deployments on your behalf.

Amazon Virtual Private Cloud (VPC) helps you define and launch AWS resources in a logically isolated virtual network.

AWS PrivateLink helps establish connectivity between VPCs and AWS services without exposing data to the internet.

In this pattern you will learn how to set up a private, isolated container workload, orchestrated by Amazon ECS. Containers will run in an isolated VPC that has no internet access at all. Access to foundational AWS services will be provided via AWS PrivateLink.

Why?

A fully isolated VPC is used in the following situations:

  • You wish to avoid all possibility of dangerous inbound communications from the internet. An isolated VPC does not even have an internet gateway that would allow inbound traffic to reach your workloads.
  • You want to avoid the possibility of data exfiltration. The isolated VPC does not have a NAT gateway or any other route to the public internet. Data can leave the network only through AWS services such as S3, which makes it much easier to lock down the flow of data out of the network.
  • You want to avoid using public IP addresses at all. In the isolated network there is no public IP address usage whatsoever.

Architecture

The following diagram depicts what you will deploy:

[Architecture diagram: a VPC containing two private subnets, each hosting a container on AWS Fargate. AWS PrivateLink connects the VPC to Amazon Elastic Container Service (Amazon ECS), Amazon Simple Storage Service (Amazon S3), Amazon CloudWatch, AWS Systems Manager, AWS Secrets Manager, and Amazon Elastic Container Registry (Amazon ECR).]

  • The deployed VPC is exclusively made up of private subnets. There are no public subnets, therefore there is no public IP address usage, no internet gateway, no NAT gateways, and no inbound or outbound internet access at all.
  • In order to have access to the required AWS services, the VPC has PrivateLink endpoints and an S3 gateway endpoint. The following endpoints are included out of the box:
    • com.amazonaws.<region>.ecr.api - Access to the Elastic Container Registry API, used for downloading container images
    • com.amazonaws.<region>.ecr.dkr - Access to the Docker endpoint for ECR, used for downloading container images
    • com.amazonaws.<region>.secretsmanager - Access to Secrets Manager, if you use secrets in your ECS task definition
    • com.amazonaws.<region>.ssm and com.amazonaws.<region>.ssmmessages - Access to AWS Systems Manager. This allows you to use Amazon ECS Exec to open an interactive shell inside the task.
    • com.amazonaws.<region>.logs - Access to upload the container logs
    • com.amazonaws.<region>.s3 - Gateway endpoint for access to download the container image layers themselves
  • The following optional endpoints are also included, but disabled by default as they are not needed for an AWS Fargate based deployment. You can enable these endpoints if you intend to deploy ECS tasks on EC2 capacity:
    • com.amazonaws.<region>.ecs
    • com.amazonaws.<region>.ecs-agent
    • com.amazonaws.<region>.ecs-telemetry
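As a quick reference, the service names of the default endpoints can be generated for any region with a small shell helper. This is only a sketch: the function name `required_endpoint_services` is hypothetical, and the list of suffixes mirrors the `ServiceName` values used in the isolated-vpc.yml template shown later in this pattern.

```sh
# Print the VPC endpoint service names used by the default (Fargate) setup.
# The function name is illustrative; the suffixes mirror the ServiceName
# values in the isolated-vpc.yml template.
required_endpoint_services() {
  region="$1"
  for suffix in ecr.api ecr.dkr secretsmanager ssm ssmmessages logs s3; do
    printf 'com.amazonaws.%s.%s\n' "$region" "$suffix"
  done
}

# Example: required_endpoint_services us-east-1
```

After deployment, you could compare this list against the output of `aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=<vpc-id> --query 'VpcEndpoints[].ServiceName'`, where `<vpc-id>` is the ID of the deployed VPC.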

Dependencies

This pattern uses AWS SAM CLI for deploying CloudFormation stacks on your account. You should follow the appropriate steps for installing SAM CLI.

Define the isolated VPC

Download the isolated-vpc.yml file which defines the private VPC:

File: isolated-vpc.yml, Language: yml
AWSTemplateFormatVersion: '2010-09-09'
Description: This stack deploys an isolated VPC that has no internet access
             at all. It has additional PrivateLink endpoints designed to allow
             launching an Amazon ECS orchestrated container using ECS and its
             supporting AWS services.
Parameters:
  DeployingToEC2:
    Type: String
    Default: false
    AllowedValues:
      - true
      - false
    Description: Set value to "true" in order to create additional ECS endpoints
                 to enable ECS on EC2 usage.

Conditions:
  CreateEcsOnEc2Resources: !Equals [ !Ref "DeployingToEC2", true ]

Mappings:
  # Hard values for the subnet masks. These masks define
  # the range of internal IP addresses that can be assigned.
  # The VPC can have all IPs from 10.0.0.0 to 10.0.255.255
  # There are two private subnets which cover the ranges:
  #
  # 10.0.128.0 - 10.0.191.255 (16384 IP addresses)
  # 10.0.192.0 - 10.0.255.255 (16384 IP addresses)
  #
  # This template leaves some unutilized IP address space in the following
  # ranges in case you need to add public subnets in the future:
  #
  # 10.0.0.0 - 10.0.63.255 (16384 IP addresses)
  # 10.0.64.0 - 10.0.127.255 (16384 IP addresses)
  SubnetConfig:
    VPC:
      CIDR: '10.0.0.0/16'
    PrivateOne:
      CIDR: '10.0.128.0/18'
    PrivateTwo:
      CIDR: '10.0.192.0/18'
Resources:
  # VPC in which containers will be networked.
  # It has two private subnets, and no public subnets.
  # We distribute the subnets across the first two availability zones
  # for the region, for high availability.
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      EnableDnsSupport: true
      EnableDnsHostnames: true
      CidrBlock: !FindInMap ['SubnetConfig', 'VPC', 'CIDR']

  # Two private subnets where containers will only have private
  # IP addresses, and will only be reachable by other members of the
  # VPC
  PrivateSubnetOne:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
         Fn::Select:
         - 0
         - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref VPC
      CidrBlock: !FindInMap ['SubnetConfig', 'PrivateOne', 'CIDR']
  PrivateSubnetTwo:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
         Fn::Select:
         - 1
         - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref VPC
      CidrBlock: !FindInMap ['SubnetConfig', 'PrivateTwo', 'CIDR']

  # The route table describes how resources in the VPC reach address ranges
  # outside of their own subnet. In this isolated VPC there is no internet
  # route; the S3 gateway endpoint below adds the only non-local route.
  RouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
  PrivateSubnetOneRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref RouteTable
      SubnetId: !Ref PrivateSubnetOne
  PrivateSubnetTwoRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref RouteTable
      SubnetId: !Ref PrivateSubnetTwo

  # PrivateLink security group. Note that we share one security group
  # for all of the PrivateLink endpoints. This is in order to more easily
  # grant ECS managed infrastructure permissions to utilize all of the
  # endpoints.
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Shared security group.
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-shared

  SecurityGroupAccessRule:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      IpProtocol: -1
      SourceSecurityGroupId: !Ref SecurityGroup
      GroupId: !Ref SecurityGroup

  # The PrivateLink endpoints that provide access to required AWS services
  S3Endpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcEndpointType: Gateway
      RouteTableIds:
        - !Ref RouteTable
      ServiceName: !Sub com.amazonaws.${AWS::Region}.s3
      VpcId: !Ref VPC

  CloudWatchLogsEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcEndpointType: Interface
      PrivateDnsEnabled: true
      SubnetIds:
        - !Ref PrivateSubnetOne
        - !Ref PrivateSubnetTwo
      SecurityGroupIds:
        - !Ref SecurityGroup
      ServiceName: !Sub com.amazonaws.${AWS::Region}.logs
      VpcId: !Ref VPC

  SsmEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcEndpointType: Interface
      PrivateDnsEnabled: true
      SubnetIds:
        - !Ref PrivateSubnetOne
        - !Ref PrivateSubnetTwo
      SecurityGroupIds:
        - !Ref SecurityGroup
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ssm
      VpcId: !Ref VPC

  SsmMessagesEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcEndpointType: Interface
      PrivateDnsEnabled: true
      SubnetIds:
        - !Ref PrivateSubnetOne
        - !Ref PrivateSubnetTwo
      SecurityGroupIds:
        - !Ref SecurityGroup
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ssmmessages
      VpcId: !Ref VPC

  EcrApiEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcEndpointType: Interface
      PrivateDnsEnabled: true
      SubnetIds:
        - !Ref PrivateSubnetOne
        - !Ref PrivateSubnetTwo
      SecurityGroupIds:
        - !Ref SecurityGroup
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.api
      VpcId: !Ref VPC

  EcrDkrEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcEndpointType: Interface
      PrivateDnsEnabled: true
      SubnetIds:
        - !Ref PrivateSubnetOne
        - !Ref PrivateSubnetTwo
      SecurityGroupIds:
        - !Ref SecurityGroup
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.dkr
      VpcId: !Ref VPC

  SecretsManagerEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcEndpointType: Interface
      PrivateDnsEnabled: true
      SubnetIds:
        - !Ref PrivateSubnetOne
        - !Ref PrivateSubnetTwo
      SecurityGroupIds:
        - !Ref SecurityGroup
      ServiceName: !Sub com.amazonaws.${AWS::Region}.secretsmanager
      VpcId: !Ref VPC

  # The following endpoints with the Condition: CreateEcsOnEc2Resources
  # are not necessary for ECS on AWS Fargate, but are needed for
  # ECS on EC2
  EcsAgentEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Condition: CreateEcsOnEc2Resources
    Properties:
      VpcEndpointType: Interface
      PrivateDnsEnabled: true
      SubnetIds:
        - !Ref PrivateSubnetOne
        - !Ref PrivateSubnetTwo
      SecurityGroupIds:
        - !Ref SecurityGroup
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ecs-agent
      VpcId: !Ref VPC

  EcsTelemetryEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Condition: CreateEcsOnEc2Resources
    Properties:
      VpcEndpointType: Interface
      PrivateDnsEnabled: true
      SubnetIds:
        - !Ref PrivateSubnetOne
        - !Ref PrivateSubnetTwo
      SecurityGroupIds:
        - !Ref SecurityGroup
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ecs-telemetry
      VpcId: !Ref VPC

  EcsEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Condition: CreateEcsOnEc2Resources
    Properties:
      VpcEndpointType: Interface
      PrivateDnsEnabled: true
      SubnetIds:
        - !Ref PrivateSubnetOne
        - !Ref PrivateSubnetTwo
      SecurityGroupIds:
        - !Ref SecurityGroup
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ecs
      VpcId: !Ref VPC

Outputs:
  VpcId:
    Description: The ID of the VPC that this stack is deployed in
    Value: !Ref VPC
  PrivateSubnetIds:
    Description: Comma separated list of private subnets with no internet access
    Value: !Sub '${PrivateSubnetOne},${PrivateSubnetTwo}'
  PrivateLinkEndpointSecurityGroup:
    Description: The shared security group for all of the PrivateLink
                 endpoints. The ECS services and/or EC2 instances that host
                 those services must have permission to talk to this security group
    Value: !Ref SecurityGroup
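The subnet sizes quoted in the template comments follow directly from the prefix length: a /18 leaves 32 - 18 = 14 host bits, which is 2^14 = 16384 addresses. A minimal shell sketch of that arithmetic, where `cidr_size` is a hypothetical helper name and not part of the pattern:

```sh
# Number of IPv4 addresses in a CIDR block, derived from its prefix length.
cidr_size() {
  prefix="${1#*/}"                 # "10.0.128.0/18" -> "18"
  echo $(( 1 << (32 - prefix) ))
}

# Example: cidr_size 10.0.128.0/18
```

Keep in mind that AWS reserves five addresses in every subnet, so the number of addresses actually available to tasks is slightly lower than the raw block size.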

Note that the following resources are not created by default:

  • EcsAgentEndpoint
  • EcsTelemetryEndpoint
  • EcsEndpoint

These endpoints are not necessary for an AWS Fargate based deployment. If you plan to deploy to EC2 capacity, you can enable these endpoints by modifying the DeployingToEC2 parameter on this template.
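One way to set the parameter, assuming you deploy isolated-vpc.yml on its own rather than through a parent stack, is to pass it as a parameter override. The sketch below wraps `aws cloudformation deploy` in a function; the function name `deploy_isolated_vpc` and the stack name `isolated-vpc-standalone` are placeholders chosen for illustration.

```sh
# Deploy isolated-vpc.yml as a standalone stack, toggling the optional
# ECS on EC2 endpoints. The stack name is a placeholder.
deploy_isolated_vpc() {
  # "true" creates the ecs, ecs-agent and ecs-telemetry endpoints
  deploying_to_ec2="${1:-false}"
  aws cloudformation deploy \
    --template-file isolated-vpc.yml \
    --stack-name isolated-vpc-standalone \
    --parameter-overrides "DeployingToEC2=${deploying_to_ec2}"
}

# Example: deploy_isolated_vpc true
```

If the VPC template is deployed as a nested application instead, the same value would be supplied through a Parameters entry on the resource that references isolated-vpc.yml.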

Define the cluster

Download the following cluster.yml to define the cluster that will host the container tasks:

File: cluster.yml, Language: yml
AWSTemplateFormatVersion: '2010-09-09'
Description: Empty ECS cluster that has no EC2 instances. It is designed
             to be used with AWS Fargate serverless capacity

Resources:
  # Cluster that keeps track of container deployments
  ECSCluster:
    Type: AWS::ECS::Cluster
    Properties:
      ClusterSettings:
        - Name: containerInsights
          Value: enabled

  # This is a role which is used within Fargate to allow the Fargate agent
  # to download images, and upload logs.
  ECSTaskExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: [ecs-tasks.amazonaws.com]
            Action: ['sts:AssumeRole']
            Condition:
              ArnLike:
                aws:SourceArn: !Sub arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:*
              StringEquals:
                aws:SourceAccount: !Ref AWS::AccountId
      Path: /

      # This role enables basic features of ECS. See reference:
      # https://docs.aws.amazon.com/AmazonECS/latest/developerguide/security-iam-awsmanpol.html#security-iam-awsmanpol-AmazonECSTaskExecutionRolePolicy
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy

Outputs:
  ClusterName:
    Description: The ECS cluster into which to launch resources
    Value: !Ref ECSCluster
  ECSTaskExecutionRole:
    Description: The role used to start up a task
    Value: !Ref ECSTaskExecutionRole

Define the container workload

Download the following private-service.yml to define an ECS service deployed on AWS Fargate, with tasks hosted in a private VPC subnet.

File: private-service.yml, Language: yml
AWSTemplateFormatVersion: '2010-09-09'
Description: An example service that deploys in AWS VPC networking mode
             on AWS Fargate. Service runs with networking in private subnets
             and with private IP addresses only.

Parameters:
  VpcId:
    Type: String
    Description: The VPC that the service is running inside of
  PrivateSubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
    Description: List of private subnet IDs to put the tasks in
  ClusterName:
    Type: String
    Description: The name of the ECS cluster into which to launch capacity.
  ECSTaskExecutionRole:
    Type: String
    Description: The role used to start up an ECS task
  ServiceName:
    Type: String
    Default: sample-service
    Description: A name for the service
  ImageUri:
    Type: String
    Description: The url of a container image that contains the application process
  ContainerCpu:
    Type: Number
    Default: 256
    Description: How much CPU to give the container. 1024 is 1 CPU
  ContainerMemory:
    Type: Number
    Default: 512
    Description: How much memory in megabytes to give the container
  DesiredCount:
    Type: Number
    Default: 2
    Description: How many copies of the service task to run
  PrivateLinkEndpointSecurityGroup:
    Type: String
    Description: The security group on the PrivateLink endpoints. It must accept traffic from the service's SG.

Resources:

  # The task definition. This is a simple metadata description of what
  # container to run, and what resource requirements it has.
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: !Ref ServiceName
      Cpu: !Ref ContainerCpu
      Memory: !Ref ContainerMemory
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      ExecutionRoleArn: !Ref ECSTaskExecutionRole
      ContainerDefinitions:
        - Name: !Ref ServiceName
          Cpu: !Ref ContainerCpu
          Memory: !Ref ContainerMemory
          Image: !Ref ImageUri
          LogConfiguration:
            LogDriver: 'awslogs'
            Options:
              mode: non-blocking
              max-buffer-size: 25m
              awslogs-group: !Ref LogGroup
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: !Ref ServiceName

  # The service. The service is a resource which allows you to run multiple
  # copies of a type of task, and gather up their logs and metrics, as well
  # as monitor the number of running tasks and replace any that have crashed
  Service:
    Type: AWS::ECS::Service
    Properties:
      ServiceName: !Ref ServiceName
      Cluster: !Ref ClusterName
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: DISABLED
          SecurityGroups:
            - !Ref ServiceSecurityGroup
          Subnets: !Ref PrivateSubnetIds
      DeploymentConfiguration:
        MaximumPercent: 200
        MinimumHealthyPercent: 75
      DesiredCount: !Ref DesiredCount
      TaskDefinition: !Ref TaskDefinition

  # Security group that limits network access
  # to the task
  ServiceSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for service
      VpcId: !Ref VpcId

  # Open up the PrivateLink endpoints to accept inbound traffic
  # from the service deploying in AWS Fargate.
  PrivateLinkIngressFromService:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      Description: Ingress from the services deployed in AWS Fargate
      GroupId: !Ref PrivateLinkEndpointSecurityGroup
      IpProtocol: -1
      SourceSecurityGroupId: !Ref ServiceSecurityGroup

  # This log group stores the stdout logs from this service's containers
  LogGroup:
    Type: AWS::Logs::LogGroup

Note that the AssignPublicIp setting for the AWS::ECS::Service must be set to DISABLED, as the private subnets used for deployment have no path to the internet and cannot make use of public IP addresses.

Build and Push a Sample Container

When running a private service in an isolated VPC, it is not possible to pull sample images from a public registry on the public internet. Therefore, you must build and push your own private container image to run. The following instructions will guide you through this process.

Start by downloading the following Dockerfile that defines the container image:

File: Dockerfile, Language: Dockerfile
FROM public.ecr.aws/docker/library/busybox

# Just sleep for an hour
CMD ["busybox", "sleep", "3600"]

Then use the following commands to create a private ECR repository, build the container image, and then push the container image to the private repository:

INFO

The following script assumes that you already have the Amazon ECR credential helper installed in your dev environment. This credential helper will automatically obtain credentials for uploading private container images when needed, using your environment's AWS credentials or role.

Language: sh
REPO_URI=$(aws ecr create-repository --repository-name sample-app-repo --query 'repository.repositoryUri' --output text)
if [ -z "${REPO_URI}" ]; then
  REPO_URI=$(aws ecr describe-repositories --repository-names sample-app-repo --query 'repositories[0].repositoryUri' --output text)
fi
docker build -t ${REPO_URI}:sample .
docker push ${REPO_URI}:sample
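If the credential helper is not installed, a common alternative is to log Docker in to the registry explicitly before pushing. The following is a sketch under that assumption; the helper names `ecr_registry_host` and `ecr_docker_login` are hypothetical, and the registry host is derived from the repository URI by dropping everything after the first slash.

```sh
# Extract the registry host from a repository URI, for example
# "<account>.dkr.ecr.<region>.amazonaws.com/sample-app-repo"
# becomes "<account>.dkr.ecr.<region>.amazonaws.com".
ecr_registry_host() {
  echo "${1%%/*}"
}

# Log Docker in to the private registry using a short-lived ECR token.
ecr_docker_login() {
  aws ecr get-login-password \
    | docker login --username AWS --password-stdin "$(ecr_registry_host "$1")"
}

# Example: ecr_docker_login "${REPO_URI}"
```

The login token is temporary, so this step would need to be repeated if a later push fails with an authentication error.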

Deploy it All

Download the following parent.yml which deploys the other reference templates:

File: parent.yml, Language: yml
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: Parent stack that deploys an isolated VPC and a private
             Amazon ECS service in that isolated VPC.
Parameters:
  ImageUri:
    Type: String
    Description: The URI of the private container image to deploy

Resources:

  # The networking configuration. This creates an isolated
  # network specific to this particular environment
  VpcStack:
    Type: AWS::Serverless::Application
    Properties:
      Location: isolated-vpc.yml

  # This stack defines the Amazon ECS cluster itself
  ClusterStack:
    Type: AWS::Serverless::Application
    Properties:
      Location: cluster.yml

  # This stack defines the container deployment
  ServiceStack:
    Type: AWS::Serverless::Application
    Properties:
      Location: private-service.yml
      Parameters:
        ImageUri: !Ref ImageUri
        VpcId: !GetAtt VpcStack.Outputs.VpcId
        PrivateSubnetIds: !GetAtt VpcStack.Outputs.PrivateSubnetIds
        PrivateLinkEndpointSecurityGroup: !GetAtt VpcStack.Outputs.PrivateLinkEndpointSecurityGroup
        ClusterName: !GetAtt ClusterStack.Outputs.ClusterName
        ECSTaskExecutionRole: !GetAtt ClusterStack.Outputs.ECSTaskExecutionRole

You should now have the following files locally:

  • parent.yml - Top level stack that deploys the child stacks
  • isolated-vpc.yml - Creates the isolated VPC with PrivateLink endpoints
  • cluster.yml - Creates the Amazon ECS cluster
  • private-service.yml - Creates a private service hosted in the isolated VPC.

Use the following command to deploy the entire infrastructure:

Language: sh
sam deploy \
  --template-file parent.yml \
  --stack-name isolated-vpc-environment \
  --capabilities CAPABILITY_IAM \
  --parameter-overrides ImageUri=${REPO_URI}:sample \
  --resolve-s3

After the stack deploys you can open the Amazon ECS console to verify that you are running two copies of a simple busybox based container.
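The same check can be done from the command line. This sketch wraps `aws ecs describe-services` in a function; `running_task_count` is a hypothetical name, and the cluster name is something you need to look up yourself (for example with `aws ecs list-clusters`), because it is generated by the nested cluster stack.

```sh
# Print the number of running tasks for a service.
# $1: cluster name (generated by the cluster stack, supplied by you)
# $2: service name, defaulting to the template default "sample-service"
running_task_count() {
  aws ecs describe-services \
    --cluster "$1" \
    --services "${2:-sample-service}" \
    --query 'services[0].runningCount' \
    --output text
}

# Example: running_task_count "<cluster-name>"
```

With the default DesiredCount of 2, the count should settle at 2 once both tasks have pulled the image and started.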

Tear it Down

When you are done you can use the following command to tear down the reference architecture:

Language: sh
sam delete --stack-name isolated-vpc-environment --no-prompts

Next Steps

This architecture deliberately excludes ingress from the public internet. If you have a workload that needs both network isolation and a limited amount of internet traffic ingress, consider deploying an API Gateway using the approach from the pattern "Serverless API Gateway Ingress for AWS Fargate, in CloudFormation". That approach can be used to get serverless internet ingress without any public subnets at all, by creating an AWS::ApiGatewayV2::VpcLink to the private subnets.

If you require access to additional AWS services you may need to add additional PrivateLink endpoints. This reference is designed to include only the most minimal set of AWS services required to have a functional Amazon ECS based deployment.
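To find the service name to use for an additional endpoint, you can list the endpoint services offered in your region and filter by keyword. A minimal sketch, where `find_endpoint_services` is a hypothetical helper name and the keyword is whatever service you are looking for:

```sh
# List VPC endpoint service names in the current region whose name
# contains the given keyword (for example "sqs" or "dynamodb").
find_endpoint_services() {
  aws ec2 describe-vpc-endpoint-services \
    --query 'ServiceNames' \
    --output text \
    | tr '\t' '\n' \
    | grep -- "$1"
}

# Example: find_endpoint_services sqs
```

Each name it prints can then be used as the ServiceName of a new AWS::EC2::VPCEndpoint resource, following the shape of the existing interface endpoints in isolated-vpc.yml.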

Alternative Patterns

Not quite right for you? Try another way to do this:

Infrastructure Pattern  Large sized AWS VPC for an Amazon ECS cluster

A VPC that provides access to the internet via AWS managed NAT Gateway.

Infrastructure Pattern  Dual-stack IPv6 networking for Amazon ECS and AWS Fargate

A dual-stack VPC that has support for both IPv4 and IPv6.