Large sized AWS VPC for an Amazon ECS cluster

Nathan Peck
Senior Developer Advocate at AWS

About

Amazon Virtual Private Cloud (Amazon VPC) gives you full control over your virtual networking environment, including resource placement, connectivity, and security.

The recommended way to configure networking for containers in an Amazon ECS cluster is to use AWS VPC networking mode (awsvpc). In this mode ECS gives each task that you start its own unique private IP address in your VPC. There are significant benefits to this, such as the ability to give your tasks VPC security groups that allow you granular control over container to container communication, even when tasks are running colocated on the same EC2 instance. Additionally, when deploying containers using AWS Fargate you are required to use the VPC networking mode.

One challenge of deploying containers in VPC networking mode is that you must provision a large enough VPC to hold all your containers. Otherwise when you attempt to scale up you will run out of IP address space in the VPC and network interface provisioning will fail.

Additionally, containers that run in VPC mode on EC2 will only be given private IP addresses. Therefore the VPC requires additional networking configuration for containerized tasks to be able to communicate with the public internet.

This pattern creates a large VPC with room to host tens of thousands of containers on AWS Fargate or EC2 instances. It also configures the VPC with NAT gateways for the private subnets to ensure that you have full outbound internet access in all subnets of the VPC.

Architecture

The following diagram shows the architecture of what will be created:

[Diagram: a VPC with two public subnets and two private subnets, an internet gateway, two NAT gateways, and a route table, with arrows marking outbound internet traffic.]

  • The VPC that is created spans two availability zones. This gives you increased availability.
  • Each AZ gets a public subnet and a private subnet.
  • The VPC has an internet gateway that can be used from the public subnets by any container or compute that has a public IP address.
  • This pattern creates two NAT gateways that provide internet access to resources launching in the private subnets.

INFO

If you would prefer a fully isolated VPC, with no inbound or outbound internet access, you should use the VPC pattern "Amazon ECS cluster with isolated VPC and no NAT Gateway". This alternative pattern utilizes AWS PrivateLink endpoints to provide secure access to the AWS services that are required for Amazon ECS functionality.

Subnet Compatibility

For this example VPC the following table shows subnet support for internet access and networking across each capacity and networking mode.

Configuration   | Private Subnets | Public Subnets
EC2 Bridge mode | ✅ | ✅
EC2 Host mode   | ✅ | ✅
EC2 AWS VPC     | ✅ | ❌ (not supported, EC2 tasks don't have public IPs)
Fargate AWS VPC | ✅ | ❗ (requires assign public IP)

DANGER

Note that when using AWS VPC networking mode on EC2, placing tasks in the public subnets is not supported, because the task ENI only has a private IP address. In a public subnet, outbound network traffic goes directly to the internet gateway, but because the task has no public IP address there is no return path to the task.

WARNING

AWS Fargate tasks can be launched with "assign public IP" turned on. This allows tasks to be launched in a public subnet and use the internet gateway directly. If you launch a task in a public subnet without turning on public IP assignment, the task will not have working internet access.
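As a minimal sketch of the warning above, the following shell functions build the network configuration for a Fargate task that runs in the public subnets with public IP assignment enabled. The function names and the cluster, task definition, and security group arguments are hypothetical placeholders that are not part of this pattern. The subnet list is assumed to be in the comma separated form of the PublicSubnetIds output of the template.

```shell
# Build the value for --network-configuration.
# $1: comma separated subnet IDs
# $2: security group ID
# $3: ENABLED for public subnets, DISABLED for private subnets
awsvpc_config() {
  printf 'awsvpcConfiguration={subnets=[%s],securityGroups=[%s],assignPublicIp=%s}' \
    "$1" "$2" "$3"
}

# Launch a Fargate task in the public subnets (hypothetical helper).
# $1: cluster name, $2: task definition,
# $3: comma separated public subnet IDs, $4: security group ID
run_fargate_task_in_public_subnets() {
  aws ecs run-task \
    --cluster "$1" \
    --task-definition "$2" \
    --launch-type FARGATE \
    --network-configuration "$(awsvpc_config "$3" "$4" ENABLED)"
}
```

For tasks placed in the private subnets, the same helper can be called with DISABLED as the third argument, because outbound traffic from those subnets goes through the NAT gateways instead.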

VPC Configuration

Deploy the following CloudFormation template to create the VPC:

File: vpc.yml
Language: yml
AWSTemplateFormatVersion: '2010-09-09'
Description: This stack deploys a large AWS VPC with internet access
Mappings:
  # Hard values for the subnet masks. These masks define
  # the range of internal IP addresses that can be assigned.
  # The VPC can have all IPs from 10.0.0.0 to 10.0.255.255
  # There are four subnets which cover the ranges:
  #
  # 10.0.0.0 - 10.0.63.255 (16384 IP addresses)
  # 10.0.64.0 - 10.0.127.255 (16384 IP addresses)
  # 10.0.128.0 - 10.0.191.255 (16384 IP addresses)
  # 10.0.192.0 - 10.0.255.255 (16384 IP addresses)
  #
  SubnetConfig:
    VPC:
      CIDR: '10.0.0.0/16'
    PublicOne:
      CIDR: '10.0.0.0/18'
    PublicTwo:
      CIDR: '10.0.64.0/18'
    PrivateOne:
      CIDR: '10.0.128.0/18'
    PrivateTwo:
      CIDR: '10.0.192.0/18'
Resources:
  # VPC in which containers will be networked.
  # It has two public subnets, and two private subnets.
  # We distribute the subnets across the first two availability zones
  # for the region, for high availability.
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      EnableDnsSupport: true
      EnableDnsHostnames: true
      CidrBlock: !FindInMap ['SubnetConfig', 'VPC', 'CIDR']

  # Two public subnets, where containers can have public IP addresses
  PublicSubnetOne:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
         Fn::Select:
         - 0
         - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref 'VPC'
      CidrBlock: !FindInMap ['SubnetConfig', 'PublicOne', 'CIDR']
      MapPublicIpOnLaunch: true
  PublicSubnetTwo:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
         Fn::Select:
         - 1
         - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref 'VPC'
      CidrBlock: !FindInMap ['SubnetConfig', 'PublicTwo', 'CIDR']
      MapPublicIpOnLaunch: true

  # Two private subnets where containers will only have private
  # IP addresses, and will only be reachable by other members of the
  # VPC
  PrivateSubnetOne:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
         Fn::Select:
         - 0
         - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref 'VPC'
      CidrBlock: !FindInMap ['SubnetConfig', 'PrivateOne', 'CIDR']
  PrivateSubnetTwo:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
         Fn::Select:
         - 1
         - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref 'VPC'
      CidrBlock: !FindInMap ['SubnetConfig', 'PrivateTwo', 'CIDR']

  # Setup networking resources for the public subnets. Containers
  # in the public subnets have public IP addresses and the routing table
  # sends network traffic via the internet gateway.
  InternetGateway:
    Type: AWS::EC2::InternetGateway
  GatewayAttachement:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref 'VPC'
      InternetGatewayId: !Ref 'InternetGateway'
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref 'VPC'
  PublicRoute:
    Type: AWS::EC2::Route
    DependsOn: GatewayAttachement
    Properties:
      RouteTableId: !Ref 'PublicRouteTable'
      DestinationCidrBlock: '0.0.0.0/0'
      GatewayId: !Ref 'InternetGateway'
  PublicSubnetOneRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnetOne
      RouteTableId: !Ref PublicRouteTable
  PublicSubnetTwoRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnetTwo
      RouteTableId: !Ref PublicRouteTable

  # Setup networking resources for the private subnets. Containers
  # in these subnets have only private IP addresses, and must use a NAT
  # gateway to talk to the internet. We launch two NAT gateways, one for
  # each private subnet.
  NatGatewayOneAttachment:
    Type: AWS::EC2::EIP
    DependsOn: GatewayAttachement
    Properties:
        Domain: vpc
  NatGatewayTwoAttachment:
    Type: AWS::EC2::EIP
    DependsOn: GatewayAttachement
    Properties:
        Domain: vpc
  NatGatewayOne:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt NatGatewayOneAttachment.AllocationId
      SubnetId: !Ref PublicSubnetOne
  NatGatewayTwo:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt NatGatewayTwoAttachment.AllocationId
      SubnetId: !Ref PublicSubnetTwo
  PrivateRouteTableOne:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref 'VPC'
  PrivateRouteOne:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTableOne
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGatewayOne
  PrivateRouteTableOneAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTableOne
      SubnetId: !Ref PrivateSubnetOne
  PrivateRouteTableTwo:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref 'VPC'
  PrivateRouteTwo:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTableTwo
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGatewayTwo
  PrivateRouteTableTwoAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTableTwo
      SubnetId: !Ref PrivateSubnetTwo

Outputs:
  VpcId:
    Description: The ID of the VPC that this stack is deployed in
    Value: !Ref 'VPC'
  PublicSubnetIds:
    Description: Comma separated list of public facing subnets that have
                 a direct internet connection as long as you assign a public IP
    Value: !Sub '${PublicSubnetOne},${PublicSubnetTwo}'
  PrivateSubnetIds:
    Description: Comma separated list of private subnets that use a NAT
                 gateway for internet access.
    Value: !Sub '${PrivateSubnetOne},${PrivateSubnetTwo}'

Some things to note:

This pattern VPC has two public subnets, each with 16,384 addresses. These subnets should be used to host public facing load balancers, or other similar resources that are intended to accept direct inbound traffic from the internet.

This pattern VPC has two private subnets, each with 16,384 addresses. These subnets should host the underlying EC2 instances and containers that you wish to protect from direct internet access. All of their outbound internet communications will be proxied through two NAT gateways that are hosted in the public subnets.

Amazon Virtual Private Cloud reserves the first four IP addresses and the last IP address in each subnet CIDR block for its own use. The other 65,516 IP addresses in the VPC are available for your containers and/or EC2 instances.
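The arithmetic behind that figure can be checked with a few lines of shell. This sketch assumes the four /18 subnets defined in the template and the five reserved addresses per subnet. The variable names are only for illustration.

```shell
# Values taken from the template: four subnets, each a /18 CIDR block.
prefix_length=18
subnet_count=4
# The first four addresses plus the last address of every subnet are reserved.
reserved_per_subnet=5

# A /18 leaves 32 - 18 = 14 host bits, so 2^14 addresses per subnet.
addresses_per_subnet=$(( 1 << (32 - prefix_length) ))
usable_per_subnet=$(( addresses_per_subnet - reserved_per_subnet ))
usable_total=$(( subnet_count * usable_per_subnet ))

echo "Addresses per subnet: ${addresses_per_subnet}"
echo "Usable per subnet:    ${usable_per_subnet}"
echo "Usable in the VPC:    ${usable_total}"
```

This prints 16384 addresses per subnet, 16379 usable addresses per subnet, and 65516 usable addresses across the VPC.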

If you are planning to run an incredibly large workload then keep an eye on the Network Address Usage (NAU) metric and quota for your AWS account. You may need to request an increase to your NAU quota.

Usage

Deploy the template via the AWS CloudFormation console, or with a CLI command like this:

Language: shell
aws cloudformation deploy \
   --stack-name big-vpc \
   --template-file vpc.yml

The deployed template has Outputs that you can pass into other stacks:

  • VpcId - Many other AWS resources will need to know the ID of the VPC that they are placed in.
  • PublicSubnetIds - A comma separated list of the subnet IDs that have direct internet access.
  • PrivateSubnetIds - A comma separated list of the subnet IDs that have internet access via a NAT gateway.
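One way to read these outputs from a script is sketched below. It assumes the stack name big-vpc from the deploy command above. The helper function names are hypothetical, and the subnet IDs in the usage comment are made-up examples.

```shell
# Fetch a single output value from the deployed stack.
# $1: output key, for example VpcId or PrivateSubnetIds
get_stack_output() {
  aws cloudformation describe-stacks \
    --stack-name big-vpc \
    --query "Stacks[0].Outputs[?OutputKey=='$1'].OutputValue" \
    --output text
}

# Turn a comma separated list such as "subnet-aaa,subnet-bbb"
# into one subnet ID per line.
split_subnet_ids() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Usage, once AWS credentials are configured:
#   private_subnets="$(get_stack_output PrivateSubnetIds)"
#   split_subnet_ids "$private_subnets"
```

Splitting the list is only needed for tools that expect one subnet ID per argument. A CloudFormation parameter of a list type can take the comma separated value as it is.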

Next Steps

  • This template only provisions subnets in two availability zones. For even greater availability consider adding a third public and private subnet and NAT gateway.
  • If your private subnet hosted resources make heavy use of AWS services such as DynamoDB, S3, or other services, then consider adding VPC endpoints for those services. This will remove the need for that traffic to go through the NAT gateway, freeing up its capacity for other usage, and potentially reducing your networking costs.
  • If this VPC still looks too small for your workload then consider splitting the workload across multiple VPCs.
  • If you do not wish to pay for NAT gateways then consider the low cost VPC for Amazon ECS. Note that this VPC choice does limit your capabilities when running on EC2 instances, although you can still use AWS Fargate capacity without a NAT gateway.
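The VPC endpoint suggestion above could be sketched with the AWS CLI as follows, using a gateway endpoint for S3 as the example. The function names are hypothetical. The template in this pattern does not output the private route table IDs, so this sketch assumes that you look them up yourself or add them as stack outputs.

```shell
# Build an endpoint service name such as "com.amazonaws.us-east-1.s3".
# $1: region, $2: short service name
endpoint_service_name() {
  printf 'com.amazonaws.%s.%s' "$1" "$2"
}

# Create an S3 gateway endpoint and attach it to both private route tables,
# so that S3 traffic from the private subnets bypasses the NAT gateways.
# $1: region, $2: VPC ID, $3 and $4: IDs of the two private route tables
create_s3_gateway_endpoint() {
  aws ec2 create-vpc-endpoint \
    --region "$1" \
    --vpc-id "$2" \
    --vpc-endpoint-type Gateway \
    --service-name "$(endpoint_service_name "$1" s3)" \
    --route-table-ids "$3" "$4"
}
```

DynamoDB can be handled the same way by passing dynamodb as the short service name. Most other AWS services use interface endpoints, which take subnets and security groups rather than route tables.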

Alternative Patterns

Not quite right for you? Try another way to do this:

Infrastructure Pattern  Amazon ECS cluster with isolated VPC and no NAT Gateway

A completely isolated VPC network, with no access to the internet.

Infrastructure Pattern  Dual-stack IPv6 networking for Amazon ECS and AWS Fargate

A dual-stack VPC that has support for both IPv4 and IPv6.