Okay, I confess: I got lazy and wanted to push out my previous AWS migration post as fast as possible. In my haste, I completely neglected to implement redundancy. If I lose the ECS instance or the ghost blog container running on it, this site will be unreachable until I can fix the issue.

That is not ideal if this site is a mission-critical business component. This afternoon I'll be correcting that, and I'll show you how to tweak the application load balancer to balance traffic between two ECS instances and six containers.


As mentioned above, I will be adding another ECS instance to my cluster. Each instance will run three ghost containers. If I lose an instance or a container, there will be enough redundancy in place to keep my site running smoothly.

Let's review. Below is the previous diagram of my setup:

I only had a single ECS instance, and I never factored in using multiple containers to provide redundancy at the container layer. This was because only one container on an instance can use a given host port at a time.


The new diagram below makes the ECS segment more robust, with not only redundant ECS instances but also multiple containers to handle requests. With this setup, I can rest assured that if I lose a container or an instance, my blog will remain accessible.


Creating a New Cluster

The first step is to create a new cluster. This new cluster will have two instances instead of one, but the rest of the settings will be the same (VPC, subnets, and security group).


Updating the Task Definition

Open up the latest task definition for your ghost blog and create a new revision. In this revision, scroll down to the port mappings and set the host port to 0. This tells ECS to dynamically assign a host port to the container and map it to port 2368 inside the container.
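For readers who prefer to see the setting outside the console, here is a rough sketch of the port mapping as a Python dictionary in the shape an ECS container definition uses. Only host port 0 and container port 2368 come from the steps above; the container name, image, and memory value are assumptions for illustration.

```python
import json


def ghost_container_definition(host_port=0, container_port=2368):
    """Build a container definition with a single TCP port mapping.

    A host port of 0 asks ECS to pick a free port on the instance.
    """
    return {
        "name": "ghost",    # assumed container name
        "image": "ghost",   # assumed image
        "memory": 300,      # assumed memory limit in MiB
        "essential": True,
        "portMappings": [
            {
                "hostPort": host_port,
                "containerPort": container_port,
                "protocol": "tcp",
            }
        ],
    }


definition = ghost_container_definition()
print(json.dumps(definition["portMappings"], indent=2))
```

The only part that matters for this post is the `portMappings` entry; everything else would be whatever your existing task definition already uses.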


Previously, I was statically mapping host port 80 to container port 2368. With that setup, no matter how many tasks I ran, only one container would be created per instance, because only a single container can use port 80 at any given time. Definitely not ideal for scalability and redundancy!
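The difference between a static and a dynamic host port can be illustrated with a toy model. This is not how ECS is implemented; it is a minimal sketch that assumes a single instance (the hypothetical `instance-a`) and an arbitrary starting port for the dynamic case.

```python
def place_tasks(desired, instances, host_port):
    """Toy model of host port allocation, not the real ECS scheduler.

    host_port > 0: static mapping, each instance can bind that port once.
    host_port == 0: dynamic mapping, each task gets its own free port.
    Returns a list of (instance, port) pairs for tasks that could start.
    """
    used = {name: set() for name in instances}
    # Arbitrary starting port chosen for illustration only.
    next_dynamic = {name: 32768 for name in instances}
    started = []
    for i in range(desired):
        name = instances[i % len(instances)]
        if host_port == 0:
            port = next_dynamic[name]
            next_dynamic[name] += 1
        else:
            port = host_port
            if port in used[name]:
                continue  # port already bound on this instance
        used[name].add(port)
        started.append((name, port))
    return started


static = place_tasks(6, ["instance-a"], host_port=80)
dynamic = place_tasks(6, ["instance-a"], host_port=0)
print(len(static), "task(s) started with static port 80")
print(len(dynamic), "task(s) started with dynamic ports")
```

In this model, asking for six tasks on port 80 starts only one, while host port 0 lets all six start on the same instance.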

Creating a New Service

Once the new task definition is in place, it's time to create a new service! Navigate to your cluster in ECS and create a new service.

On the page that appears, select your new revision in the "Task Definition" section. After that, enter six in the "Number of tasks" field, since I want six containers running in my cluster.

Based on the memory and CPU requirements specified in the task definition, ECS will distribute the containers evenly between the two ECS instances. At the end of this demo, three containers will be housed on each instance at any given time.
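To make the "three per instance" outcome concrete, here is a small simulation that places each task on the instance currently running the fewest tasks, as long as it still has memory left. It is a simplified stand-in for the scheduler's behavior, not the actual ECS algorithm, and the instance names and memory figures are assumptions.

```python
def spread_tasks(desired, instance_memory, task_memory):
    """Place tasks one at a time on the least-loaded instance with room.

    instance_memory maps instance name -> total memory in MiB.
    Returns a dict of instance name -> number of tasks placed.
    """
    placed = {name: 0 for name in instance_memory}
    for _ in range(desired):
        # Only instances that can fit one more task are candidates.
        candidates = [
            name for name in placed
            if (placed[name] + 1) * task_memory <= instance_memory[name]
        ]
        if not candidates:
            break  # no capacity left anywhere
        target = min(candidates, key=lambda name: placed[name])
        placed[target] += 1
    return placed


placement = spread_tasks(
    desired=6,
    instance_memory={"instance-a": 1024, "instance-b": 1024},  # assumed sizes
    task_memory=300,  # assumed per-task memory in MiB
)
print(placement)
```

With these assumed numbers each instance ends up with three tasks; a larger per-task memory reservation would leave some of the six tasks unplaced.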

Keep the rest of the settings at their defaults and click the "Next step" button in the bottom-right corner.


On the next page (Configure network), I will configure the load balancer.

Select "Application Load Balancer" for the load balancer type and use the same service IAM role and load balancer you created for the previous cluster. In my case, I am using ecsServiceRole and ghost-elb, respectively.

Notice how it says "Allows containers to use dynamic host port mapping..." in the description of the application load balancer. This is important because it does the magic of load balancing traffic between containers and their random ports. Without this feature, I'd be limited to using static ports.


Next, click the "Add to load balancer" button. On the page that appears, change the "Target group name" and select the target group for your ECS instance. Mine is named ghost-tg. This should be the only setting you need to edit within the "Container to load balance" section; the rest will be filled in automatically.
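The console choices from the last few steps can be summarized as a single request dictionary, shaped like the keyword arguments that boto3's ECS `create_service` call accepts. This is a sketch under assumptions: the cluster name, service name, container name, task definition revision, and the target group ARN are placeholders, not values from my account.

```python
def ghost_service_request(target_group_arn, task_definition, desired_count=6):
    """Collect the service settings chosen in the console into one dict."""
    return {
        "cluster": "ghost-cluster",      # hypothetical cluster name
        "serviceName": "ghost-service",  # hypothetical service name
        "taskDefinition": task_definition,
        "desiredCount": desired_count,   # six containers in total
        "role": "ecsServiceRole",
        "loadBalancers": [
            {
                "targetGroupArn": target_group_arn,
                "containerName": "ghost",  # assumed container name
                "containerPort": 2368,     # port Ghost listens on in the container
            }
        ],
    }


request = ghost_service_request(
    # Placeholder ARN; substitute the real ARN of your ghost-tg target group.
    target_group_arn="arn:aws:elasticloadbalancing:REGION:ACCOUNT_ID:targetgroup/ghost-tg/EXAMPLE",
    task_definition="ghost:2",  # hypothetical family:revision
)
print(request["desiredCount"], "tasks behind", request["loadBalancers"][0]["targetGroupArn"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("ecs").create_service(**request)`. That call is not exercised here, so treat it as a starting point rather than a tested script.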



Give ECS a few seconds to provision the tasks (containers) after creating the service. Next, verify that all six tasks are up and running and that your site's front end is accessible.


Hooray, it's working fine! Time to test the redundancy and resilience!

Go into the cluster's main page and open up the tasks tab. Select one of the tasks and stop it.


If you configured your service properly, you should see the stopped task. Remember, tasks are the actual containers running on the ECS instances. Check the site's front end again in a browser and confirm it is still accessible. Refresh a few times to verify you don't get a dreaded 503 error.
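If refreshing by hand gets tedious, the same check can be scripted. The sketch below uses only Python's standard library; the function names are my own, and the URL is whatever address your blog is served from, passed on the command line.

```python
import sys
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status code for a GET request to url.

    Connection failures (urllib.error.URLError) are not caught and will
    propagate to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses arrive as exceptions


def count_503s(url, attempts=10, fetch=http_status):
    """Request url several times and count how many responses were 503."""
    return sum(1 for _ in range(attempts) if fetch(url) == 503)


if __name__ == "__main__" and len(sys.argv) > 1:
    failures = count_503s(sys.argv[1])
    print(f"{failures} of 10 requests returned 503")
```

Running it with the site's URL as the only argument while a task is stopped should report zero 503 responses if the load balancer is routing around the missing container.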

Next, check whether another container has taken the place of the stopped one and is running smoothly. If you see your replacement container running, you can breathe a big sigh of relief! Your site is now highly redundant and can quickly recover if it runs into an issue.