Create an Amazon ECS Cluster with Terraform

Arvind Soni
WW Head of GTM for ECS at AWS

About

Terraform by HashiCorp is an infrastructure automation tool that can be used to provision and manage resources on AWS.

This pattern will demonstrate how to use the community terraform-aws-modules to deploy a VPC and an ECS cluster. These form the core infrastructure that can be used to deploy containerized services using Amazon ECS.

Dependencies

  • Terraform (tested version v1.2.5 on darwin_amd64)
  • Git (tested version 2.27.0)
  • AWS CLI
  • AWS test account with administrator role access
  • Configure AWS credentials (see the example below)
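
If you have not configured credentials yet, one common approach is the aws configure wizard, or exporting the standard AWS environment variables. The values below are placeholders for your own credentials:

Language: shell
# Interactively set the access key, secret key, and default region:
aws configure

# Or export the standard AWS environment variables directly:
export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
export AWS_DEFAULT_REGION="us-east-2"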

Architecture

This pattern will create the following AWS resources:

Architecture diagram: a VPC spanning three Availability Zones, each containing one public subnet and one private subnet. An internet gateway serves the public subnets, and a single NAT gateway provides outbound internet access for the private subnets.

  • Networking
    • VPC
      • 3 public subnets, 1 per AZ. If the region has fewer than 3 AZs, it will create the same number of public subnets as AZs.
      • 3 private subnets, 1 per AZ. If the region has fewer than 3 AZs, it will create the same number of private subnets as AZs.
      • 1 NAT Gateway (see warning below)
      • 1 Internet Gateway
      • Associated Route Tables
  • 1 ECS Cluster with AWS CloudWatch Container Insights enabled.
  • Task execution IAM role
  • CloudWatch log groups
  • An AWS Cloud Map service discovery namespace (default.<cluster name>.local)

WARNING

This pattern deploys a single shared NAT gateway in a single AZ. This saves cost in most cases but has the following downsides:

  • The shared NAT gateway is a single point of failure: if its AZ has an outage, resources in every AZ lose outbound internet access.
  • If your application makes heavy use of the NAT gateway because it opens many outbound connections to the public internet, you may accrue additional cross-AZ data transfer charges, because resources in the other two AZs send their traffic to a NAT gateway hosted in a different AZ.

In these cases, consider adding a NAT gateway for each AZ, as shown in the sketch below. This raises the baseline cost but is safer and limits cross-AZ traffic.
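
With the terraform-aws-modules/vpc module used in main.tf below, switching to one NAT gateway per AZ is a small change to the module arguments. A minimal sketch, keeping every other argument the same as in main.tf:

Language: tf
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.0"

  # ... same name, cidr, azs, and subnet arguments as in main.tf ...

  enable_nat_gateway     = true
  single_nat_gateway     = false # do not share a single NAT gateway
  one_nat_gateway_per_az = true  # create one NAT gateway in each AZ
}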

Define the infrastructure

Download the following three files that define the core infrastructure:

  • main.tf
  • outputs.tf
  • versions.tf
File: main.tf
Language: tf
provider "aws" {
  region = local.region
}

data "aws_availability_zones" "available" {}
data "aws_caller_identity" "current" {}

locals {
  name   = "core-infra"
  region = "us-east-2"

  vpc_cidr = "10.0.0.0/16"
  azs      = slice(data.aws_availability_zones.available.names, 0, 3)

  tags = {
    Blueprint  = local.name
    GithubRepo = "github.com/aws-ia/ecs-blueprints"
  }
}

################################################################################
# ECS Blueprint
################################################################################

module "ecs" {
  source  = "terraform-aws-modules/ecs/aws"
  version = "~> 5.0"

  cluster_name = local.name

  cluster_service_connect_defaults = {
    namespace = aws_service_discovery_private_dns_namespace.this.arn
  }

  fargate_capacity_providers = {
    FARGATE      = {}
    FARGATE_SPOT = {}
  }

  # Shared task execution role
  create_task_exec_iam_role = true
  # Allow read access to all SSM params in current account for demo
  task_exec_ssm_param_arns = ["arn:aws:ssm:${local.region}:${data.aws_caller_identity.current.account_id}:parameter/*"]
  # Allow read access to all secrets in current account for demo
  task_exec_secret_arns = ["arn:aws:secretsmanager:${local.region}:${data.aws_caller_identity.current.account_id}:secret:*"]

  tags = local.tags
}

################################################################################
# Supporting Resources
################################################################################

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.0"

  name = local.name
  cidr = local.vpc_cidr

  azs             = local.azs
  public_subnets  = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k)]
  private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 10)]

  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  # Manage the default resources so we can name them
  manage_default_network_acl    = true
  default_network_acl_tags      = { Name = "${local.name}-default" }
  manage_default_route_table    = true
  default_route_table_tags      = { Name = "${local.name}-default" }
  manage_default_security_group = true
  default_security_group_tags   = { Name = "${local.name}-default" }

  tags = local.tags
}

################################################################################
# Service discovery namespaces
################################################################################

resource "aws_service_discovery_private_dns_namespace" "this" {
  name        = "default.${local.name}.local"
  description = "Service discovery namespace.clustername.local"
  vpc         = module.vpc.vpc_id

  tags = local.tags
}

You should have three files:

  • main.tf - Main file that defines the core infrastructure to create
  • outputs.tf - Output values that will be passed to other Terraform modules you may wish to deploy (a sketch of its shape follows below)
  • versions.tf - Defines the Terraform and provider version requirements for this module
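
The downloaded outputs.tf is the authoritative version; as a rough sketch of its shape, each output simply re-exports a value from the vpc or ecs module (the exact module output names here are assumptions):

Language: tf
# Illustrative sketch only; the downloaded outputs.tf is authoritative.
output "ecs_cluster_id" {
  description = "The ARN of the ECS cluster"
  value       = module.ecs.cluster_id
}

output "ecs_cluster_name" {
  description = "The name of the ECS cluster"
  value       = module.ecs.cluster_name
}

output "vpc_id" {
  description = "The ID of the VPC"
  value       = module.vpc.vpc_id
}

output "private_subnets" {
  description = "IDs of the private subnets"
  value       = module.vpc.private_subnets
}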

TIP

For a production environment, it is highly recommended to create a backend.tf file that configures an S3 bucket for state storage and a DynamoDB table for state locking, or Terraform Cloud for state management.

The default setup only tracks Terraform state locally. If you lose the state files, Terraform will no longer be able to manage the infrastructure it created, and you will have to manually track down and delete every resource that Terraform had created.
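
A minimal sketch of such a backend.tf, assuming an S3 bucket and DynamoDB table that you have already created (the names below are placeholders):

File: backend.tf
Language: tf
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"  # placeholder: your S3 bucket
    key            = "core-infra/terraform.tfstate"
    region         = "us-east-2"
    dynamodb_table = "my-terraform-locks"         # placeholder: your lock table
    encrypt        = true
  }
}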

Deploy the Terraform definition

First we need to initialize the working directory, which downloads the providers and dependency modules that this pattern relies on (version requirements are pinned in versions.tf):

Language: shell
terraform init

Next we can review the deployment plan and then deploy it:

Language: shell
terraform plan
terraform apply --auto-approve

When the Terraform apply is complete, you will see a list of outputs similar to the following:

Language: txt
Outputs:

ecs_cluster_id = "arn:aws:ecs:us-west-2:209640446841:cluster/files"
ecs_cluster_name = "files"
ecs_task_execution_role_arn = "arn:aws:iam::209640446841:role/files-20230518161249649200000002"
ecs_task_execution_role_name = "files-20230518161249649200000002"
private_subnets = [
  "subnet-0dcca4267e8b9894c",
  "subnet-047d27c28c7891a90",
  "subnet-0ab5512d135ce43cb",
]
private_subnets_cidr_blocks = tolist([
  "10.0.10.0/24",
  "10.0.11.0/24",
  "10.0.12.0/24",
])
public_subnets = [
  "subnet-0d976733da1d6dd08",
  "subnet-013db9ca920c24554",
  "subnet-0b9e7e3e45bb6a743",
]
service_discovery_namespaces = {
  "arn" = "arn:aws:servicediscovery:us-west-2:209640446841:namespace/ns-aliplookapjwmjgo"
  "description" = "Service discovery namespace.clustername.local"
  "hosted_zone" = "Z0609025HGBC4TU4U285"
  "id" = "ns-aliplookapjwmjgo"
  "name" = "default.files.local"
  "tags" = tomap({
    "Blueprint" = "files"
    "GithubRepo" = "github.com/aws-ia/ecs-blueprints"
  })
  "tags_all" = tomap({
    "Blueprint" = "files"
    "GithubRepo" = "github.com/aws-ia/ecs-blueprints"
  })
  "vpc" = "vpc-0c7ae3da22686c9cd"
}
vpc_id = "vpc-0c7ae3da22686c9cd"
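
Other stacks or scripts can consume these values with the terraform output command, for example:

Language: shell
terraform output -raw ecs_cluster_name   # print the bare cluster name
terraform output -json private_subnets   # print the subnet IDs as a JSON array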

Tear it Down

You can use the following command to tear down the infrastructure that was created:

Language: shell
terraform destroy
