A blind spot in my overall monitoring and observability scheme is tracking instance and container logs. Yes, I can grab metrics at the instance and resource level, but without reading actual logs I'm left wondering how the application is functioning. Metrics may show a system as green and healthy, yet customers or readers may still be unable to retrieve data on a web page.

Rest assured, I will be fixing this today! I will be utilizing the Cloudwatch Agent, with my instance and containers managed by ECS.


Overview

When completed, the Cloudwatch Agent will send the logs from the instance and containers up to Cloudwatch. Once in Cloudwatch, I will be able to centrally monitor log output without having to log into each instance or connect to each container.



Setting Up IAM Policy

First up is creating a policy that will allow logs to be written to Cloudwatch. I will attach the policy below to an existing role attached to my ECS instances.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

The policy is rather simple: it allows log groups and log streams to be created within Cloudwatch, and it allows log events to be uploaded and listed.
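If you prefer the command line over the console, the policy can be attached as an inline role policy with the AWS CLI. The sketch below saves the policy to a file and validates the JSON locally; the role name myECSinstanceRole is a placeholder, and the aws iam call is left commented out because it requires credentials.

```shell
# Save the policy shown above to a file and sanity-check the JSON.
cat <<'EOF' > cw-logs-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": ["*"]
    }
  ]
}
EOF
python3 -m json.tool cw-logs-policy.json > /dev/null && echo "policy JSON is valid"

# Attach it as an inline policy to the instance role (requires AWS
# credentials; "myECSinstanceRole" is a placeholder for your role name):
# aws iam put-role-policy \
#   --role-name myECSinstanceRole \
#   --policy-name cloudwatch-logs-write \
#   --policy-document file://cw-logs-policy.json
```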


Installing Cloudwatch Logs Agent on Instances

There are two ways to install the agent on instances. The preferred way is to use Systems Manager Run Command and the other is to manually download the binary and install it.

I will cover both today! First up is Run Command.

Run Command

Log into the AWS Console and navigate to Systems Manager. Once you're in Systems Manager, select Run Command in the left-hand column. Within Run Command, search for the AWS-ConfigureAWSPackage command document. This will install the Cloudwatch Agent package on our behalf without logging into the instance.


Scroll down to the command parameters and ensure the following options are set.

Action: Install

Installation Type: Uninstall and reinstall

Name: AmazonCloudWatchAgent

Version: latest

Under targets, you may choose to use tags, but I went with Choose instances manually. If you do not see your instance, confirm that you have the Systems Manager agent installed and running on your chosen instance.


When you're satisfied with the parameters, click on the Run button. It should take a few seconds to complete; once it does, go ahead and SSH into your instance.
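The same installation can also be triggered without the console at all, via aws ssm send-command. Below is a hedged sketch of the equivalent call; the instance ID is a placeholder, and since the command needs AWS credentials, it is only written out for reference rather than executed.

```shell
# CLI equivalent of the Run Command console steps above. The instance ID
# is a placeholder; the aws call needs credentials, so it is written to a
# file for reference instead of being run.
cat <<'EOF' > ssm-install-cwagent.txt
aws ssm send-command \
  --document-name "AWS-ConfigureAWSPackage" \
  --targets "Key=InstanceIds,Values=i-0123456789abcdef0" \
  --parameters 'action=Install,installationType=Uninstall and reinstall,name=AmazonCloudWatchAgent,version=latest' \
  --region us-east-2
EOF
cat ssm-install-cwagent.txt
```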

The next section describes manually installing the agent. If you used the Systems Manager method, you can skip ahead to creating the agent configuration file.

CLI

The second method to install the agent is to use the good old CLI.

You will need to use either wget or curl to download the rpm file for the Cloudwatch Agent. The download link is below (replace "region" with the region in which your instance resides):

https://s3.region.amazonaws.com/amazoncloudwatch-agent-region/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm

US-EAST-2:

https://s3.us-east-2.amazonaws.com/amazoncloudwatch-agent-us-east-2/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
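The pattern above is easy to script so the correct URL is built from a single region variable. A quick sketch, with the actual download left commented out since it needs network access:

```shell
# Build the region-specific download URL for the Cloudwatch Agent rpm.
REGION="us-east-2"
URL="https://s3.${REGION}.amazonaws.com/amazoncloudwatch-agent-${REGION}/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm"
echo "${URL}"

# Download it (uncomment one; requires network access):
# curl -O "${URL}"
# wget "${URL}"
```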

As soon as you have the rpm file, go ahead and install it with the following command:

rpm -U ./amazon-cloudwatch-agent.rpm

Now that you have the agent installed, you can proceed to creating the agent configuration file.

Creating the Configuration File

To check on the agent's status, execute this command: systemctl status amazon-cloudwatch-agent. You should notice that the agent is not running. This is because we have not yet created the Cloudwatch Agent configuration file.

Navigate to the /opt/aws/amazon-cloudwatch-agent/bin directory, and you should see a script named amazon-cloudwatch-agent-config-wizard.

[root@host bin]# pwd
/opt/aws/amazon-cloudwatch-agent/bin

[root@host bin]# ll
total 162608
-rwxr-xr-x 1 root root 76277783 Jan 22 17:04 amazon-cloudwatch-agent
-rwxr-xr-x 1 root root 16037067 Jan 22 17:04 amazon-cloudwatch-agent-config-wizard
-rwxr-xr-x 1 root root     9890 Jan 22 17:04 amazon-cloudwatch-agent-ctl
-rwxr-xr-x 1 root root 13908934 Jan 22 17:04 config-downloader
-rwxr-xr-x 1 root root      282 Mar 22 15:49 config.json
-rwxr-xr-x 1 root root 30217083 Jan 22 17:04 config-translator
-rw-r--r-- 1 root root       11 Jan 22 17:04 CWAGENT_VERSION
-rwxr-xr-x 1 root root 30036943 Jan 22 17:04 start-amazon-cloudwatch-agent

Execute that script now; it will walk you through setting up the configuration file and automatically start the agent. I chose not to enable metric gathering because I am already handling that with Prometheus. The wizard will eventually prompt you to monitor specific logs. You can choose multiple log files, but I went with just the ECS agent log located at /var/log/ecs/ecs-agent.log. I also highly recommend uploading the configuration file to Parameter Store for future use with other instances.
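For reference, here is a trimmed sketch of the kind of logs-only configuration the wizard produces for tailing /var/log/ecs/ecs-agent.log. The log group name matches my setup, so adjust to taste; the file is written and validated locally here, whereas on the instance it would be the agent's config.json.

```shell
# A minimal, logs-only agent configuration similar to what the wizard
# generates. The log group name "ghost-blog-ecs-agent" is specific to my
# setup; "{instance_id}" is expanded by the agent at runtime.
cat <<'EOF' > cwagent-config.json
{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/ecs/ecs-agent.log",
            "log_group_name": "ghost-blog-ecs-agent",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}
EOF
python3 -m json.tool cwagent-config.json > /dev/null && echo "agent config is valid JSON"

# To stash it in Parameter Store for reuse (requires credentials; the
# parameter name is a placeholder):
# aws ssm put-parameter --name AmazonCloudWatch-ghost-config --type String \
#   --value file://cwagent-config.json
```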

Verifying in Cloudwatch

Now that the agent is up and running, let's check whether it sent anything to Cloudwatch!

Head into Cloudwatch and click on Log groups. Here you should find the logs from your instance. I named this log group ghost-blog-ecs-agent. In this structure, logs are broken down per instance.



Using Awslogs Driver with ECS

Now it's time to configure container logs to be sent to Cloudwatch. If you thought installing the Cloudwatch Agent was easy, you'll be pleased to know that the awslogs driver for ECS is just as simple, if not simpler. First, a review of the task definition from my ECS article.

Okay, before I get knee deep into the task definition, let me quickly explain what logs I'm trying to capture. If you're familiar with Docker, the docker logs command allows you to scan the logs of your container. These details are incredibly valuable when you need to troubleshoot issues with your container. For Ghost, it logs details such as which pages are being accessed by users. Alright, let's get started by editing the task definition.

Below is the original task definition for my ECS cluster. I will be adding several lines to it for logConfiguration.

{
    "family": "ghost-blog-task-definition",
    "taskRoleArn": "arn:aws:iam::999999999999:role/myECStaskRole",
    "executionRoleArn": "arn:aws:iam::999999999999:role/ecsTaskExecutionRole",
    "containerDefinitions": [
        {
            "name": "ghost-container",
            "image": "999999999999.dkr.ecr.us-east-2.amazonaws.com/pafableblog:3.0.5",
            "cpu": 0,
            "memory": 200,
            "portMappings": [
                {
                    "containerPort": 2368,
                    "hostPort": 0,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "environment": [
                {
                    "name": "database__client",
                    "value": "mysql"
                },
                {
                    "name": "db__conn",
                    "value": "gh_01"
                },
                {
                    "name": "db__conn__01",
                    "value": "test-1.us-east-2.rds.amazonaws.com"
                },
                {
                    "name": "NODE_ENV",
                    "value": "production"
                },
                {
                    "name": "url",
                    "value": "http://pafable.com"
                }
            ],
            "mountPoints": [
                {
                    "sourceVolume": "Images",
                    "containerPath": "/var/lib/ghost/content/images",
                    "readOnly": false
                }
            ],
            "secrets": [
                {
                    "name": "db__conn__pw",
                    "valueFrom": "arn:aws:ssm:eu-west-1:999999999999:parameter/pw"
                },
                {
                    "name": "db-conn-ur",
                    "valueFrom": "arn:aws:ssm:eu-west-1:999999999999:parameter/ur"
                }
            ],
            "startTimeout": 2,
            "stopTimeout": 2,
            "disableNetworking": false,
            "privileged": true,
            "readonlyRootFilesystem": false,
            "interactive": false,
            "pseudoTerminal": true
        }
    ],
    "volumes": [
        {
            "name": "Images",
            "host": {
                "sourcePath": "/efs/ghost/content/images"
            }
        }
    ],
    "requiresCompatibilities": [
        "EC2"
    ],
    "cpu": "256",
    "memory": "256",
    "tags": [
        {
            "key": "owner",
            "value": "pafable"
        },
        {
            "key": "env",
            "value": "prod"
        }
    ],
    "ipcMode": "none"
}

Here is the logConfiguration block. Make sure to use awslogs as the log driver. I will also set awslogs-group, awslogs-region, and awslogs-stream-prefix. Of these options only the region is required, but the others help arrange the log streams within Cloudwatch.

"logConfiguration": {
    "logDriver": "awslogs",
    "options": {
        "awslogs-group": "/ecs/ghost-blog-task-definition",
        "awslogs-region": "us-east-2",
        "awslogs-stream-prefix": "ecs"
    }
}

The full task definition should look something like below.

{
    "family": "ghost-blog-task-definition",
    "taskRoleArn": "arn:aws:iam::999999999999999:role/myECStaskRole",
    "executionRoleArn": "arn:aws:iam::999999999999999:role/ecsTaskExecutionRole",
    "containerDefinitions": [
        {
            "name": "ghost-container",
            "image": "999999999999999.dkr.ecr.us-east-2.amazonaws.com/pafableblog:3.11.0",
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/ghost-blog-task-definition",
                    "awslogs-region": "us-east-2",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "cpu": 0,
            "memory": 200,
            "portMappings": [
                {
                    "containerPort": 2368,
                    "hostPort": 0,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "environment": [
                {
                    "name": "database__client",
                    "value": "mysql"
                },
                {
                    "name": "database__connection__database",
                    "value": "ghost_container_001"
                },
                {
                    "name": "database__connection__host",
                    "value": "test-db09-instance-1.quaedasfsqwe.us-east-2.rds.amazonaws.com"
                },
                {
                    "name": "NODE_ENV",
                    "value": "production"
                },
                {
                    "name": "url",
                    "value": "http://pafable.com"
                }
            ],
            "mountPoints": [
                {
                    "sourceVolume": "Images",
                    "containerPath": "/var/lib/ghost/content/images",
                    "readOnly": false
                }
            ],
            "secrets": [
                {
                    "name": "database__connection__password",
                    "valueFrom": "arn:aws:ssm:us-east-2:999999999999999:parameter/pw"
                },
                {
                    "name": "database__connection__user",
                    "valueFrom": "arn:aws:ssm:us-east-2:999999999999999:parameter/ur"
                }
            ],
            "startTimeout": 2,
            "stopTimeout": 2,
            "disableNetworking": false,
            "privileged": true,
            "readonlyRootFilesystem": false,
            "interactive": false,
            "pseudoTerminal": true
        }
    ],
    "volumes": [
        {
            "name": "Images",
            "host": {
                "sourcePath": "/efs/ghost/content/images"
            }
        }
    ],
    "requiresCompatibilities": [
        "EC2"
    ],
    "cpu": "256",
    "memory": "256",
    "tags": [
        {
            "key": "owner",
            "value": "pafable"
        },
        {
            "key": "env",
            "value": "prod"
        }
    ],
    "ipcMode": "none"
}

When you're satisfied with the changes, you can either use the AWS CLI to push the new task definition to ECS or commit and push it to your code repository. To push the task definition to ECS directly, use the command below.

aws ecs register-task-definition --cli-input-json file://task_definition_filename.json
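One caveat worth noting: registering a new revision does not by itself redeploy running tasks; the service needs to be pointed at the task definition family so ECS rolls out the latest revision. A sketch of the two-step flow is below. The cluster and service names are placeholders, and since both commands require AWS credentials they are only written out for reference.

```shell
# Two-step direct deploy: register the revision, then update the service.
# "ghost-cluster" and "ghost-blog-service" are placeholder names; both
# commands need AWS credentials, so they are written to a file here
# rather than executed.
cat <<'EOF' > deploy-steps.txt
aws ecs register-task-definition --cli-input-json file://task_definition_filename.json
aws ecs update-service --cluster ghost-cluster --service ghost-blog-service --task-definition ghost-blog-task-definition
EOF
cat deploy-steps.txt
```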

I will be pushing my code to my repository in CodeCommit because I have an automated pipeline that builds my container images and updates ECS with the new task definition.


Testing Time

Alright, now comes the nerve-wracking part of the process... testing! I have pushed the code changes to my repository, so now I can release the change. Click on the "Release change" button and let it rip!


Cool, if you didn't get any failures in your stages, your Ghost containers should now be using the new task definition.

Head into Cloudwatch and go to "Log groups". The first log group I'll check is the instance log group. I named this ghost-blog-ecs-agent because I wanted to track the ecs-agent.log on the ECS instance.


Awesome! It is there and reporting the log entries from the instance as expected!


Now search for the logs coming from the containers. I named these ecs/ghost-container/{TASK_ID}.


Perfect, both tasks are reporting in as expected.
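The same verification can be done from the CLI instead of the console. A sketch, assuming AWS CLI v2 (for aws logs tail) and valid credentials; the commands are only written out here for reference since neither is available in every environment.

```shell
# CLI equivalents of the console checks above. Requires AWS CLI v2 and
# credentials, so the commands are written to a file for reference
# rather than executed.
cat <<'EOF' > verify-logs.txt
aws logs describe-log-streams --log-group-name /ecs/ghost-blog-task-definition
aws logs tail ghost-blog-ecs-agent --since 15m
EOF
cat verify-logs.txt
```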


Conclusion

That wraps up this article! I showed you all today how to gain insight into how the application is functioning through proper log management. Using the Cloudwatch Agent and the awslogs driver gives your organization a single pane of glass to track and correlate issues within Cloudwatch.

You save yourself from having to set up lifecycle hooks to gather logs or gathering them manually yourself. Believe me, from past experience, manually gathering logs from multiple hosts is not fun and is incredibly time consuming. Leveraging Cloudwatch lets you read logs in near real time.