Large sized AWS VPC for an Amazon ECS cluster
About
Amazon Virtual Private Cloud (Amazon VPC) gives you full control over your virtual networking environment, including resource placement, connectivity, and security.
The recommended way to configure networking for containers in an Amazon ECS cluster is the AWS VPC networking mode (awsvpc). In this mode, ECS gives each task that you start its own unique private IP address in your VPC. This has significant benefits: for example, you can attach VPC security groups to individual tasks, which gives you granular control over container-to-container communication even when tasks are colocated on the same EC2 instance. Additionally, containers deployed on AWS Fargate are required to use the AWS VPC networking mode.
One challenge of deploying containers in AWS VPC networking mode is that you must provision a VPC large enough to hold all your containers. Otherwise, when you attempt to scale up, you will run out of IP address space in the VPC and network interface provisioning will fail.
Additionally, containers that run in AWS VPC mode on EC2 are only given private IP addresses. Therefore, the VPC requires additional networking configuration for containerized tasks to be able to communicate with the public internet.
This pattern creates a large VPC with room to host tens of thousands of containers on AWS Fargate or EC2 instances. It also configures the VPC with NAT gateways for the private subnets, so that resources in every subnet of the VPC have outbound internet access.
Architecture
The following diagram shows the architecture of what will be created:
- The VPC that is created spans two availability zones. This gives you increased availability.
- Each AZ gets a public subnet and a private subnet.
- The VPC has an internet gateway that can be used from the public subnets by any container or compute that has a public IP address.
- This pattern creates two NAT gateways that provide internet access to resources launching in the private subnets.
INFO
If you would prefer a fully isolated VPC, with no inbound or outbound internet access, you should use the VPC pattern "Amazon ECS cluster with isolated VPC and no NAT Gateway". This alternative pattern utilizes AWS PrivateLink endpoints to provide secure access to the AWS services that are required for Amazon ECS functionality.
Subnet Compatibility
For this example VPC the following table shows subnet support for internet access and networking across each capacity and networking mode.
Configuration | Private Subnets | Public Subnets |
---|---|---|
EC2 Bridge mode | ✅ | ✅ |
EC2 Host mode | ✅ | ✅ |
EC2 AWS VPC | ✅ | ❌ (not supported, EC2 tasks don't have public IPs) |
Fargate AWS VPC | ✅ | ❗ (requires assign public IP) |
DANGER
Note that placing tasks in a public subnet is not supported when using AWS VPC networking mode on EC2, because the task ENI only has a private IP address. In a public subnet, outbound traffic goes directly to the internet gateway, but because the task has no public IP address there is no return path to the task.
WARNING
AWS Fargate tasks can be launched with "assign public IP" turned on. This allows tasks to be launched in a public subnet and use the internet gateway directly. If you launch a Fargate task in a public subnet without turning on public IP assignment, the task will not have working internet access.
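The compatibility table and the two notes above can be restated as a small decision function. This is only an illustrative sketch: the `Placement` type and the `has_internet_access` function are hypothetical names invented here, and the logic covers this example VPC only, not ECS networking in general.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    """Hypothetical description of where and how a task is launched."""
    capacity: str          # "EC2" or "Fargate"
    network_mode: str      # "bridge", "host", or "awsvpc"
    subnet: str            # "private" or "public"
    assign_public_ip: bool = False


def has_internet_access(p: Placement) -> bool:
    """Return True if a task placed this way can reach the internet in this VPC."""
    if p.capacity == "Fargate" and p.network_mode != "awsvpc":
        raise ValueError("AWS Fargate tasks must use the awsvpc network mode")
    if p.subnet == "private":
        # Private subnets route outbound traffic through a NAT gateway.
        return True
    if p.network_mode in ("bridge", "host"):
        # The task uses the EC2 instance's networking, and instances in the
        # public subnets receive a public IP (MapPublicIpOnLaunch: true).
        return True
    if p.capacity == "Fargate":
        # A Fargate task in a public subnet needs "assign public IP" turned on.
        return p.assign_public_ip
    # EC2 with awsvpc: the task ENI only has a private IP address.
    return False


print(has_internet_access(Placement("EC2", "awsvpc", "public")))
print(has_internet_access(Placement("Fargate", "awsvpc", "public", assign_public_ip=True)))
```

Encoding the rules this way makes the one unsupported combination (EC2 tasks in AWS VPC mode in a public subnet) easy to catch in a pre-deployment check.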
VPC Configuration
Deploy the following CloudFormation template to create the VPC:
AWSTemplateFormatVersion: '2010-09-09'
Description: This stack deploys a large AWS VPC with internet access
Mappings:
  # Hard values for the subnet masks. These masks define
  # the range of internal IP addresses that can be assigned.
  # The VPC can have all IPs from 10.0.0.0 to 10.0.255.255
  # There are four subnets which cover the ranges:
  #
  # 10.0.0.0 - 10.0.63.255 (16384 IP addresses)
  # 10.0.64.0 - 10.0.127.255 (16384 IP addresses)
  # 10.0.128.0 - 10.0.191.255 (16384 IP addresses)
  # 10.0.192.0 - 10.0.255.255 (16384 IP addresses)
  #
  SubnetConfig:
    VPC:
      CIDR: '10.0.0.0/16'
    PublicOne:
      CIDR: '10.0.0.0/18'
    PublicTwo:
      CIDR: '10.0.64.0/18'
    PrivateOne:
      CIDR: '10.0.128.0/18'
    PrivateTwo:
      CIDR: '10.0.192.0/18'
Resources:
  # VPC in which containers will be networked.
  # It has two public subnets, and two private subnets.
  # We distribute the subnets across the first two Availability Zones
  # in the region, for high availability.
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      EnableDnsSupport: true
      EnableDnsHostnames: true
      CidrBlock: !FindInMap ['SubnetConfig', 'VPC', 'CIDR']
  # Two public subnets, where containers can have public IP addresses
  PublicSubnetOne:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
        Fn::Select:
          - 0
          - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref 'VPC'
      CidrBlock: !FindInMap ['SubnetConfig', 'PublicOne', 'CIDR']
      MapPublicIpOnLaunch: true
  PublicSubnetTwo:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
        Fn::Select:
          - 1
          - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref 'VPC'
      CidrBlock: !FindInMap ['SubnetConfig', 'PublicTwo', 'CIDR']
      MapPublicIpOnLaunch: true
  # Two private subnets where containers will only have private
  # IP addresses, and will only be reachable by other members of the
  # VPC
  PrivateSubnetOne:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
        Fn::Select:
          - 0
          - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref 'VPC'
      CidrBlock: !FindInMap ['SubnetConfig', 'PrivateOne', 'CIDR']
  PrivateSubnetTwo:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
        Fn::Select:
          - 1
          - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref 'VPC'
      CidrBlock: !FindInMap ['SubnetConfig', 'PrivateTwo', 'CIDR']
  # Set up networking resources for the public subnets. Containers
  # in the public subnets have public IP addresses and the routing table
  # sends network traffic via the internet gateway.
  InternetGateway:
    Type: AWS::EC2::InternetGateway
  GatewayAttachement:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref 'VPC'
      InternetGatewayId: !Ref 'InternetGateway'
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref 'VPC'
  PublicRoute:
    Type: AWS::EC2::Route
    DependsOn: GatewayAttachement
    Properties:
      RouteTableId: !Ref 'PublicRouteTable'
      DestinationCidrBlock: '0.0.0.0/0'
      GatewayId: !Ref 'InternetGateway'
  PublicSubnetOneRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnetOne
      RouteTableId: !Ref PublicRouteTable
  PublicSubnetTwoRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnetTwo
      RouteTableId: !Ref PublicRouteTable
  # Set up networking resources for the private subnets. Containers
  # in these subnets have only private IP addresses, and must use a NAT
  # gateway to talk to the internet. We launch two NAT gateways, one for
  # each private subnet. Each NAT gateway needs an Elastic IP address.
  NatGatewayOneAttachment:
    Type: AWS::EC2::EIP
    DependsOn: GatewayAttachement
    Properties:
      Domain: vpc
  NatGatewayTwoAttachment:
    Type: AWS::EC2::EIP
    DependsOn: GatewayAttachement
    Properties:
      Domain: vpc
  NatGatewayOne:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt NatGatewayOneAttachment.AllocationId
      SubnetId: !Ref PublicSubnetOne
  NatGatewayTwo:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt NatGatewayTwoAttachment.AllocationId
      SubnetId: !Ref PublicSubnetTwo
  PrivateRouteTableOne:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref 'VPC'
  PrivateRouteOne:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTableOne
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGatewayOne
  PrivateRouteTableOneAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTableOne
      SubnetId: !Ref PrivateSubnetOne
  PrivateRouteTableTwo:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref 'VPC'
  PrivateRouteTwo:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTableTwo
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGatewayTwo
  PrivateRouteTableTwoAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTableTwo
      SubnetId: !Ref PrivateSubnetTwo
Outputs:
  VpcId:
    Description: The ID of the VPC that this stack is deployed in
    Value: !Ref 'VPC'
  PublicSubnetIds:
    Description: Comma separated list of public facing subnets that have
      a direct internet connection as long as you assign a public IP
    Value: !Sub '${PublicSubnetOne},${PublicSubnetTwo}'
  PrivateSubnetIds:
    Description: Comma separated list of private subnets that use a NAT
      gateway for internet access.
    Value: !Sub '${PrivateSubnetOne},${PrivateSubnetTwo}'
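As a quick sanity check on the `SubnetConfig` mapping, the following sketch uses Python's standard `ipaddress` module to split the VPC CIDR into four /18 blocks and print the range each one covers. The `layout` variable is a name chosen here for illustration; the subnet names and CIDR values are the ones from the template.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
names = ["PublicOne", "PublicTwo", "PrivateOne", "PrivateTwo"]

# Extending a /16 prefix by two bits yields exactly four /18 blocks,
# in the same order as the SubnetConfig mapping in the template.
layout = dict(zip(names, vpc.subnets(new_prefix=18)))

for name, net in layout.items():
    # net[0] and net[-1] are the first and last addresses in the block.
    print(f"{name}: {net} covers {net[0]} - {net[-1]} ({net.num_addresses} addresses)")
```

Because the four blocks tile the /16 exactly, there is no unallocated address space left in the VPC for additional subnets.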
Some things to note:
- This pattern VPC has two public subnets, each with 16,384 addresses. These subnets should be used for hosting public-facing load balancers, or other similar resources that are intended to accept direct inbound traffic from the internet.
- This pattern VPC has two private subnets, each with 16,384 addresses. These subnets should host the underlying EC2 instances and containers that you wish to protect from direct internet access. All of their outbound internet communications will be proxied through the two NAT gateways that are hosted in the public subnets.
- Amazon Virtual Private Cloud reserves the first four IP addresses and the last IP address in each subnet CIDR block for its own use. The other 65,516 IP addresses in the VPC are available for your containers and/or EC2 instances.
- If you are planning to run an incredibly large workload then keep an eye on the Network Address Usage (NAU) metric and quota for your AWS account. You may need to request an increase to your NAU quota.
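The 65,516 figure follows from the five addresses that Amazon VPC reserves in each of the four subnets. A minimal sketch of that arithmetic, using Python's standard `ipaddress` module (the constant and function names are chosen here for illustration):

```python
import ipaddress

# Amazon VPC reserves the first four addresses and the last address per subnet.
AWS_RESERVED_PER_SUBNET = 5

subnet_cidrs = ["10.0.0.0/18", "10.0.64.0/18", "10.0.128.0/18", "10.0.192.0/18"]


def usable_addresses(cidr: str) -> int:
    """Addresses left in a subnet after the AWS reserved addresses are removed."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


per_subnet = {cidr: usable_addresses(cidr) for cidr in subnet_cidrs}
total_usable = sum(per_subnet.values())

print(per_subnet)
print(total_usable)
```

Each subnet therefore offers 16,379 assignable addresses. In AWS VPC networking mode every task consumes one of them, so the two private subnets together hold roughly 32,000 tasks and instances.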
Usage
Deploy the template via the AWS CloudFormation console, or with a CLI command like this:
aws cloudformation deploy \
--stack-name big-vpc \
--template-file vpc.yml
The deployed template has Outputs that you can pass into other stacks:
- VpcId - Many other AWS resources will need to know the ID of the VPC that they are placed in.
- PublicSubnetIds - A comma-separated list of the subnet IDs that have direct internet access.
- PrivateSubnetIds - A comma-separated list of the subnet IDs that have internet access via a NAT gateway.
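If you consume these outputs from a script rather than from another stack, the comma-separated values need to be split into lists. The sketch below assumes a response shaped like the CloudFormation `DescribeStacks` API result (`Stacks[0].Outputs` with `OutputKey` and `OutputValue`). The helper names and all resource IDs are hypothetical placeholders, and the sample response is hard-coded so the example runs without AWS credentials.

```python
def stack_outputs(describe_stacks_response: dict) -> dict:
    """Flatten the Outputs of the first stack into {OutputKey: OutputValue}."""
    stack = describe_stacks_response["Stacks"][0]
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


def subnet_ids(outputs: dict, key: str) -> list:
    """Split a comma-separated subnet output into a list of subnet IDs."""
    return [s.strip() for s in outputs[key].split(",") if s.strip()]


# Hypothetical response, trimmed to the fields used here. With boto3 installed
# it could come from:
#   boto3.client("cloudformation").describe_stacks(StackName="big-vpc")
sample_response = {
    "Stacks": [
        {
            "StackName": "big-vpc",
            "Outputs": [
                {"OutputKey": "VpcId", "OutputValue": "vpc-0123456789abcdef0"},
                {"OutputKey": "PublicSubnetIds", "OutputValue": "subnet-0aaa1111,subnet-0bbb2222"},
                {"OutputKey": "PrivateSubnetIds", "OutputValue": "subnet-0ccc3333,subnet-0ddd4444"},
            ],
        }
    ]
}

outputs = stack_outputs(sample_response)
private_subnets = subnet_ids(outputs, "PrivateSubnetIds")
print(outputs["VpcId"], private_subnets)
```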
Next Steps
- This template only spans two Availability Zones, with one public and one private subnet in each. For even greater availability, consider adding a third public subnet, private subnet, and NAT gateway in a third Availability Zone.
- If your private subnet hosted resources make heavy use of AWS services such as DynamoDB, S3, or other services, then consider adding VPC endpoints for those services. This will remove the need for that traffic to go through the NAT gateway, freeing up its capacity for other usage, and potentially reducing your networking costs.
- If this VPC still looks too small for your workload then consider splitting the workload up across multiple smaller VPCs.
- If you do not wish to pay for NAT gateways then consider the low cost VPC for Amazon ECS. Note that this VPC choice does limit your capabilities when running on EC2 instances, although you can still use AWS Fargate capacity without a NAT gateway.
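One practical consequence of the first suggestion: the four /18 subnets already consume the entire 10.0.0.0/16 range, so a third pair of subnets does not fit without changing the layout. The sketch below shows one possible re-carving for a fresh deployment, splitting the VPC into /19 blocks instead. This layout is an assumption made for illustration and is not part of the pattern; the names `PublicThree` and `PrivateThree` are hypothetical.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")

# A /16 splits into eight /19 blocks of 8,192 addresses each.
blocks = list(vpc.subnets(new_prefix=19))

names = [
    "PublicOne", "PublicTwo", "PublicThree",
    "PrivateOne", "PrivateTwo", "PrivateThree",
]
three_az_layout = dict(zip(names, blocks))

# Two /19 blocks remain unallocated for future use.
spare_blocks = blocks[len(names):]

for name, net in three_az_layout.items():
    print(f"{name}: {net} ({net.num_addresses} addresses)")
print("spare:", [str(net) for net in spare_blocks])
```

The trade-off is that each subnet is half the size of the /18 subnets in the template, so check the expected task count per Availability Zone before adopting a layout like this.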