So for the past three weeks, while studying for the AWS DevOps Professional certification, I've been punch-drunk on setting up automated pipelines. I've already automated my container image creation process; the next piece is to automate ECS AMI creation!

Why do I need to automate this when I'm using ECS? The main reason is that AWS updates the ECS agent frequently, and one of the easiest ways to pick up the new agent is to terminate the instance and let the auto-scaling group recreate it. That would be ideal, except the default ECS Amazon Machine Image (AMI) is missing a few things I require: node-exporter so Prometheus can scrape metrics, the SSM agent for automation, and EFS mount support.

The solution is to bake my very own ECS AMI that contains all the dependencies I need, and to rebuild it routinely. I will bake the new AMI using HashiCorp's Packer tool within CodeBuild! Packer will create an AMI based on the default ECS AMI, then download and configure the dependencies I need. CodePipeline will provide the "glue", i.e. the automation aspect.

I will be basing my pipeline on this AWS blog post. I'll make some edits to the code used in both Lambda functions.


Overview

The new feature in this pipeline is incorporating Slack into the approval stage, or as the "cool kids" like to call it: ChatOps! Instead of sending an approval email, the message will be relayed to Slack via Lambda; the response will be captured by AWS API Gateway, which will invoke a second Lambda function that sends the approval back to the pipeline.

The flow will go as follows:

  1. CloudWatch will be configured with a cron to trigger once a week.
  2. CodePipeline will be triggered by CloudWatch to execute the build.
  3. The code from CodeCommit will be collected and sent to the next stage.
  4. Before the code can be built in the CodeBuild stage, it needs to go through an approval stage. In this stage a specific group or individual can approve or reject the build. This acts as a failsafe to ensure only desired builds are pushed out into production.
  5. The approval stage publishes to an SNS topic, which triggers the first Lambda function to deliver a message to a specified Slack channel.
  6. Slack will send a message using a POST method to a specified API gateway endpoint.
  7. The API gateway will accept the message from Slack and invoke a second lambda function.
  8. The second Lambda function will send the approval response to CodePipeline which will signal the CodeBuild stage to build the AMI.
  9. The final stage is CodeBuild, which will use Packer to build a new AMI with all of my specifications.

diagram


The Code

Before creating the pipeline, make sure you have a CodeCommit repository configured with all of your code loaded into it. Below are the four files needed for the build: the JSON template telling Packer what to do, a shell script for Packer to execute, the CodeBuild buildspec, and the node-exporter systemd service.

Packer file:

{
  "variables": {
    "aws_access_key": "{{env `aws_access_key`}}",
    "aws_secret_key": "{{env `aws_secret_key`}}",
    "vpc_id": "{{env `vpc_id`}}",
    "subnet_id": "{{env `subnet_id`}}"
  },
  "builders": [
    {
      "access_key": "{{user `aws_access_key`}}",
      "secret_key": "{{user `aws_secret_key`}}",
      "ami_name": "pafable-ecs-ami-2020-02-02-{{timestamp}}",
      "instance_type": "t2.micro",
      "region": "us-east-2",
      "associate_public_ip_address": false,
      "ami_regions": ["us-east-2"],
      "ssh_username": "ec2-user",
      "type": "amazon-ebs",
      "vpc_id": "{{user `vpc_id`}}",
      "subnet_id": "{{user `subnet_id`}}",
      "ami_description": "AMI made by pafable; consisting of ECS, ssm-agent, node-exporter, efs",
      "security_group_ids": [
          "sg-0cf4a7ff13a3409a8", 
          "sg-0bc8e9c21c36eb55d", 
          "sg-09f80cdbe2b8a3c77", 
          "sg-00003655baab43f72",
          "sg-01883efb80a16926f"
        ],
      "tags": {
        "owner": "pafable",
        "env": "prod",
        "version": "2020.02.02"
      },
      "source_ami_filter": {
        "filters": {
          "virtualization-type": "hvm",
          "name": "amzn2-ami-ecs-hvm-2.0.*-x86_64-ebs",
          "root-device-type": "ebs"
        },
        "owners": ["591542846629"],
        "most_recent": true
      }
    }
  ],
  "provisioners": [
    {
      "type": "file",
      "source": "node-exporter.service",
      "destination": "/tmp/node-exporter.service"
    },
    {
      "script": "goldenecsami.sh",
      "type": "shell"
    }
  ]
}
goldenami.json

Script for packer to execute:

# Download the latest updates
sudo yum update -y

# Install EFS and wget. Create efs directory
sudo yum install -y amazon-efs-utils wget
sudo mkdir /efs
df -h 

# Install SSM-Agent. Start and enable amazon-ssm-agent
sudo yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
sudo systemctl start amazon-ssm-agent 
sudo systemctl enable amazon-ssm-agent

# Install node exporter
sudo wget https://github.com/prometheus/node_exporter/releases/download/v0.18.1/node_exporter-0.18.1.linux-amd64.tar.gz
sudo tar xvf node_exporter-0.18.1.linux-amd64.tar.gz
sudo useradd --no-create-home --shell /bin/false node-exporter-user
sudo mv node_exporter-0.18.1.linux-amd64 /usr/local/bin
sudo mv /tmp/node-exporter.service /etc/systemd/system
sudo chown -R node-exporter-user: /usr/local/bin/node_exporter-0.18.1.linux-amd64
sudo systemctl daemon-reload
sudo systemctl start node-exporter
sudo systemctl enable node-exporter
goldenami.sh

CodeBuild file:

version: 0.2

phases:
  install:
    runtime-versions:
        python: 3.8

  pre_build:
    commands:
        - apt-get update && apt-get install -y wget unzip
        - wget https://releases.hashicorp.com/packer/1.5.1/packer_1.5.1_linux_amd64.zip
        - unzip packer_1.5.1_linux_amd64.zip
        - mv packer /usr/bin
        - ls -lh /usr/bin | grep packer

  build:
    commands:
        - packer version
        - packer build -var "aws_access_key=${AWS_ACCESS}" -var "aws_secret_key=${AWS_SECRET}" -var "subnet_id=subnet-06dcbfad104b4a783" -var "vpc_id=vpc-008841edcbac65ca1" goldenami.json
        
  post_build:
    commands:
        - echo "foo bar"
buildspec.yml

Node-exporter service:

[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node-exporter-user
Group=node-exporter-user
Type=simple
ExecStart=/usr/local/bin/node_exporter-0.18.1.linux-amd64/node_exporter

[Install]
WantedBy=multi-user.target
node-exporter.service

Building the Pipeline

Go ahead and create a new pipeline. I'll name my pipeline pafable-ecs-ami and select the default options for it.

CrPipeline1

In the source stage, select AWS CodeCommit as the source provider. Select the repository you created earlier and make sure to select master for the branch name.

source

In the build stage select CodeBuild as the provider and the region you wish to deploy to. Next click on the "Create project" button; I'll call my project pafable-ecs-ami.

pipeline5

In the environment section click on "Managed image" and select Ubuntu for the operating system. For the time being, Amazon Linux cannot run Packer. You can leave the rest as default.

pipeline5

On the buildspec section choose "Use a buildspec file".

pipeline7

Along with the settings mentioned above, I'll be utilizing environment variables. However, these hold sensitive credentials: my AWS access and secret keys.

I'll be storing the access and secret keys in Parameter Store as SecureStrings. My access and secret key parameters will be called packer_access_key and packer_secret_key respectively, since they will be used by Packer only.

I'll give the environment variables the names AWS_ACCESS and AWS_SECRET. Next, I'll enter the name of the corresponding Parameter Store entry in the value box and select "Parameter" for the type. Now when I reference AWS_ACCESS and AWS_SECRET within the buildspec file, they will be replaced by the credentials from Parameter Store. This ensures my secrets are kept secure and not written in my code!
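If you'd rather create those Parameter Store entries from code instead of the console, here's a minimal boto3 sketch. The parameter names match the ones above; the key values are obvious placeholders, so swap in your Packer-only IAM credentials:

import boto3

ssm = boto3.client('ssm', region_name='us-east-2')

# Store the Packer-only credentials as SecureString parameters.
# These are the names the CodeBuild environment variables point at.
for name, value in [
    ('packer_access_key', 'AKIA-PLACEHOLDER'),
    ('packer_secret_key', 'PLACEHOLDER-SECRET'),
]:
    ssm.put_parameter(
        Name=name,
        Value=value,
        Type='SecureString',
        Overwrite=True,
    )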

pipe4

You can skip the CodeDeploy stage and finish your pipeline; I won't be using it in my pipeline.

Your pipeline should look something like below:

pipeline8


Creating the First Lambda Function

The first Lambda function will be triggered by an SNS topic (which still needs to be created) and will send a message to my Slack channel. As mentioned in the beginning, the Python code for Lambda can be found here.

I made some modifications to the code and I'll walk you through them now. Let's start with the libraries needed: os, json, logging, urllib.parse, boto3, and b64decode. The next section is the environment variables; I tweaked this section slightly so that it pulls the Slack webhook and channel from Parameter Store.

The logger is set to INFO. After the logging section come the interesting bits! The first function (lambda_handler) is required by Lambda; it takes the input sent from SNS in JSON format. I then parse the JSON and pull out the Message and Subject. Hopefully your dictionary parsing skills are on point!

Next it parses out the token and codepipeline_name. Quickly following that is the Slack message. It will be posted to my specified Slack channel and ask for input; the user may click either "Deploy" or "No". Finally the message is packaged up into a request req and sent off into the "wild blue yonder" of Slack land.

# This function is invoked via SNS when the CodePipeline manual approval action starts.
# It will take the details from this approval notification and send an interactive message to Slack that allows users to approve or cancel the deployment.

import os
import json
import logging
import urllib.parse
import boto3

from base64 import b64decode
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

# The environment variables hold the names of the Parameter Store entries
# for the Slack webhook URL and channel; the secret values themselves stay in Parameter Store.
SLACK_WEBHOOK_URL = os.environ['SLACK_WEBHOOK_URL']
SLACK_CHANNEL = os.environ['SLACK_CHANNEL']

# Connect to SSM and pull the decrypted values from Parameter Store.
client = boto3.client('ssm')
SLACK_WEBHOOK_URL = client.get_parameter(Name=SLACK_WEBHOOK_URL, WithDecryption=True)
SLACK_CHANNEL = client.get_parameter(Name=SLACK_CHANNEL, WithDecryption=True)

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))
    message = event["Records"][0]["Sns"]["Message"]
    subject = event["Records"][0]["Sns"]["Subject"]
    
    URL = SLACK_WEBHOOK_URL['Parameter']['Value']
    CHANNEL = SLACK_CHANNEL['Parameter']['Value']
    
    data = json.loads(message) 
    token = data["approval"]["token"]
    codepipeline_name = data["approval"]["pipelineName"]
    
    slack_message = {
        "channel": CHANNEL,
        "text": subject,
        "attachments": [
            {
                "text": "Deploy your build to production",
                "fallback": "You are unable to deploy this build",
                "callback_id": "wopr_game",
                "color": "#FFAC31",
                "attachment_type": "default",
                "actions": [
                    {
                        "name": "deployment",
                        "text": "Deploy",
                        "style": "danger",
                        "type": "button",
                        "value": json.dumps({"approve": True, "codePipelineToken": token, "codePipelineName": codepipeline_name}),
                        "confirm": {
                            "title": "Aye mate, are you sure?",
                            "text": "This will deploy the build to production",
                            "ok_text": "Yes",
                            "dismiss_text": "No"
                        }
                    },
                    {
                        "name": "deployment",
                        "text": "No",
                        "type": "button",
                        "value": json.dumps({"approve": False, "codePipelineToken": token, "codePipelineName": codepipeline_name})
                    }  
                ]
            }
        ]
    }

    req = Request(URL, json.dumps(slack_message).encode('utf-8'))

    response = urlopen(req)
    response.read()
    
    return None
1st lambda function

Before I close and save this Lambda code, I need to configure the environment variables and then make sure the entries for the Slack webhook and channel are in Parameter Store.

At the bottom of Lambda, I'll configure the environment variables like so:

pipeline9

In Parameter Store, I'll create two securestring entries with the names SLACK_CHANNEL and SLACK_WEBHOOK_URL.


Configuring SNS

Navigate to the SNS service and create a new topic. In the new topic, create a new subscription: select AWS Lambda as the protocol and your first Lambda function as the endpoint.

pipeline10

Quickly hop back into your Lambda function and make sure the trigger is set to your newly created SNS topic. Also make sure your pipeline's manual approval action publishes to this topic; the second Lambda function below expects the approval stage to be named Approval and the action ApprovalOrDeny.
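If you prefer to do this wiring from code, the sketch below shows roughly the same steps with boto3. The topic and function names are placeholders for illustration; substitute your own:

import boto3

sns = boto3.client('sns', region_name='us-east-2')
lam = boto3.client('lambda', region_name='us-east-2')

# Hypothetical names; use your own topic and first Lambda function.
topic_arn = sns.create_topic(Name='ami-pipeline-approvals')['TopicArn']
function_name = 'slack_approval_requester'

# Allow SNS to invoke the function, then subscribe it to the topic.
lam.add_permission(
    FunctionName=function_name,
    StatementId='AllowSNSInvoke',
    Action='lambda:InvokeFunction',
    Principal='sns.amazonaws.com',
    SourceArn=topic_arn,
)
function_arn = lam.get_function(FunctionName=function_name)['Configuration']['FunctionArn']
sns.subscribe(TopicArn=topic_arn, Protocol='lambda', Endpoint=function_arn)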


Creating the Second Lambda Function

This second Lambda function will be triggered by API Gateway (which will be created in the next section). Like before, I'll be importing the json, os, and boto3 libraries, with the addition of parse_qs.

Just like before, I'll be utilizing Parameter Store to supply the value for a secret. In this case that secret is SLACK_VERIFICATION_TOKEN. In the lambda_handler function, parse_qs converts the application/x-www-form-urlencoded body into a dictionary, which is then parsed as JSON.

After parsing the data, if the SLACK_VERIFICATION_TOKEN matches the token within the parsed data (payload), it will call the send_slack_message function and return a status code of 200, and a message will be seen in Slack saying "Aye understood, captain!".

The send_slack_message function uses boto3 to send the approval result to my CodePipeline, signaling that the build can continue to the CodeBuild stage!

# This function is triggered via API Gateway when a user acts on the Slack interactive message sent by approval_requester.py.

from urllib.parse import parse_qs
import json
import os
import boto3

SLACK_VERIFICATION_TOKEN = os.environ['SLACK_VERIFICATION_TOKEN']
client = boto3.client('ssm')
SLACK_VERIFICATION_TOKEN = client.get_parameter(Name=SLACK_VERIFICATION_TOKEN, WithDecryption=True)

#Triggered by API Gateway
#It kicks off a particular CodePipeline project
def lambda_handler(event, context):
	print("Received event: " + json.dumps(event, indent=2))
	body = parse_qs(event['body'])
	payload = json.loads(body['payload'][0])
	TOKEN = SLACK_VERIFICATION_TOKEN['Parameter']['Value']

	# Validate Slack token
	if TOKEN == payload['token']:
		send_slack_message(json.loads(payload['actions'][0]['value']))
		
		# This will replace the interactive message with a simple text response.
		# You can implement a more complex message update if you would like.
		return  {
			"isBase64Encoded": "false",
			"statusCode": 200,
			"body": "{\"text\": \"  Aye understood, captain!\"}"
		}
	else:
		return  {
			"isBase64Encoded": "false",
			"statusCode": 403,
			"body": "{\"error\": \"This request does not include a vailid verification token.\"}"
		}


def send_slack_message(action_details):
	codepipeline_status = "Approved" if action_details["approve"] else "Rejected"
	codepipeline_name = action_details["codePipelineName"]
	token = action_details["codePipelineToken"] 

	client = boto3.client('codepipeline')
	response_approval = client.put_approval_result(
							pipelineName=codepipeline_name,
							stageName='Approval',
							actionName='ApprovalOrDeny',
							result={'summary':'','status':codepipeline_status},
							token=token)
	print(response_approval)
2nd lambda function
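Before wiring up API Gateway, you can sanity-check the parsing logic locally by fabricating the kind of event the proxy integration will deliver. This is only a sketch with made-up token and pipeline values; it mirrors the first few lines of lambda_handler:

import json
from urllib.parse import parse_qs, urlencode

# Slack posts application/x-www-form-urlencoded data with a single "payload" field.
fake_payload = {
    'token': 'fake-verification-token',
    'actions': [{
        'value': json.dumps({
            'approve': True,
            'codePipelineToken': 'fake-approval-token',
            'codePipelineName': 'pafable-ecs-ami',
        })
    }],
}
event = {'body': urlencode({'payload': json.dumps(fake_payload)})}

# Same parsing the handler performs.
body = parse_qs(event['body'])
payload = json.loads(body['payload'][0])
print(payload['token'])                            # fake-verification-token
print(json.loads(payload['actions'][0]['value']))  # {'approve': True, ...}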

Just like in the first function, I'll create the environment variable. This second function will only need one.

pipeline11

NOTE: If you're troubleshooting why your Lambda function keeps on failing, take a look at CloudWatch Logs. There should be a log group for your function. Be very careful of printing credentials in your functions because they will appear in the logs!

Configuring the API Gateway

Time to configure something completely new to me: API Gateway. This is required because Slack needs a public HTTPS endpoint to POST the button response to; API Gateway provides that endpoint and invokes the second Lambda function on my behalf.

Open the API Gateway service console and click on the "Create API" button. On the page that appears, select the first option for "REST API". Note the descriptions: the "REST API Private" option is restricted to your VPC only, and you do not want to select that one!

pipeline12

Select REST as the protocol and create a New API. Give your API a name and description, and lastly select Regional for the endpoint type.

pipeline13

Click on the "Actions" button and create a new resource. Give the resource a name and make sure to check off the box for Enable API Gateway CORS.

pipeline14

Once your resource has been created, click on the "Actions" button again and create a method. In the drop down option that appears, select POST.

Integration type should be Lambda Function, and check the "Use Lambda Proxy integration" box since the second function expects a proxy-style event and returns a proxy-style response. For "Lambda Function", select the name of the second Lambda function; mine is called approved_by_slack. You can leave the rest of the options at their defaults.

pipeline15

After creating the method, you should have a diagram like the one below. It shows what will happen and what services will be invoked when a POST request is received. As you can see in mine, when a request is received my second Lambda function is invoked and a response is returned to the client.

pipeline16

Now you can deploy the API. Click on the "Actions" button once more and select "Deploy API". Select [New Stage] for the deployment stage; the rest can be whatever you want.

pipeline17

When the API is deployed you will receive a URL that can be invoked.

pipeline18
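As a quick smoke test you can POST a form-encoded body to that invoke URL yourself. With a made-up token the second Lambda function should answer with its 403 response, which proves the gateway-to-Lambda wiring works. The URL below is a placeholder; use the one API Gateway gave you:

import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen
from urllib.error import HTTPError

# Placeholder invoke URL; replace with your deployed stage and resource path.
url = 'https://abc123.execute-api.us-east-2.amazonaws.com/prod/approval'

body = urlencode({'payload': json.dumps({'token': 'not-the-real-token', 'actions': []})}).encode('utf-8')
req = Request(url, data=body, headers={'Content-Type': 'application/x-www-form-urlencoded'})

try:
    print(urlopen(req).read())
except HTTPError as err:
    print(err.code, err.read())  # expect 403 from the token check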


Creating the Slackbot

Now that the heavy lifting in AWS is out of the way, let's configure Slack. First make sure you're logged into Slack via the web UI. Next, navigate over to Slack's API page and go to your apps.

Click on the green button to create a new Slack App. Give it a name and select your Slack Workspace.

pipeline19

In the basic information page select "Incoming Webhooks".

pipeline20

Navigate down to the "Incoming Webhooks" section and confirm that it's on. Also on the same page, click on the Add New Webhook to Workspace button. This will give you the SLACK_WEBHOOK_URL that you will need to upload to Parameter Store.

pipeline21

Head over to the "Interactive Components" section. Remember the invoke URL from API Gateway? Copy and paste it into the "Request URL" box. Also make sure interactivity is set to On.

pipeline22

Once you have that in place, go back into "Basic Information" and copy the "Signing Secret". Slack has deprecated the "Verification Token", so it considers the token check in my code the legacy approach.

One caveat: the signing secret is not a drop-in replacement for the verification token. The code above compares the token field in the Slack payload, while the signing secret is meant for verifying each request's signature, which requires a small code change (see the sketch below).
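If you do want to adopt the signing secret properly, the check looks roughly like this: rebuild Slack's v0 signature from the request timestamp and raw body, then compare it to the X-Slack-Signature header. A minimal sketch, assuming you pass in the signing secret and the headers from the proxy event (header-name casing can vary, so normalize it in real code):

import hashlib
import hmac

def slack_request_is_valid(signing_secret, headers, raw_body):
    # Slack signs "v0:<timestamp>:<raw request body>" with HMAC-SHA256.
    timestamp = headers['X-Slack-Request-Timestamp']
    slack_signature = headers['X-Slack-Signature']

    basestring = 'v0:{}:{}'.format(timestamp, raw_body).encode('utf-8')
    computed = 'v0=' + hmac.new(signing_secret.encode('utf-8'), basestring, hashlib.sha256).hexdigest()

    return hmac.compare_digest(computed, slack_signature)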


Time to Test!

Now it's time for my favorite part: executing the pipeline and watching everything burn... I mean, work on the first try. Go into CodePipeline, select your AMI pipeline, then click on the "Release change" button to kick off the build.

Wait a few seconds and you should see a message pop up in your Slack channel. It will ask you if you want to Deploy your new AMI.

pipeline24

If you click on the "Deploy" button a pop-up window will appear confirming your deploy... in case you accidentally clicked on the deploy button for any reason.  

pipeline25

After approving the deploy, the interactive message is replaced with the confirmation text. This prevents other users from being confused by approval requests that have already been handled.

pipeline26

Success! What a huge relief, it worked flawlessly. The pipeline alerted me that an approval was needed, and all I had to do was click to deploy.

pipeline23


Wrapping Up

Now it's time to fully automate this pipeline. I don't want to start this build by clicking on the "Release change" button or by pushing an update to the git repository in CodeCommit. I simply want it to run once a week and start the process.

Open up CloudWatch, navigate to Rules, and select the rule for the pipeline.

pipeline27

On the next page, click the "Actions" button at the top right and select Edit. Change the event source from "Event Pattern" to "Schedule". In the cron expression box, enter 0 0 ? * 7 *. This expression will trigger the pipeline every Saturday at 12 AM GMT (7 PM EST on Friday evening).

pipeline28
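For completeness, here's roughly the same schedule expressed with boto3. The rule name, pipeline ARN, and role ARN below are placeholders; in practice you'd reuse the rule CodePipeline already created and its invoke role:

import boto3

events = boto3.client('events', region_name='us-east-2')

# Placeholder names/ARNs for illustration only.
rule_name = 'pafable-ecs-ami-weekly'
pipeline_arn = 'arn:aws:codepipeline:us-east-2:123456789012:pafable-ecs-ami'
role_arn = 'arn:aws:iam::123456789012:role/cwe-codepipeline-invoke-role'

# Fire every Saturday at 00:00 UTC; same cron expression as above.
events.put_rule(Name=rule_name, ScheduleExpression='cron(0 0 ? * 7 *)')
events.put_targets(
    Rule=rule_name,
    Targets=[{'Id': 'codepipeline', 'Arn': pipeline_arn, 'RoleArn': role_arn}],
)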

That is it! Now the pipeline can initiate on its own and check with me whether I want it to proceed with the deploy or not. For those of you who are lazy like me and do not want to log into the AWS console to approve builds, this is a great alternative to just receiving emails from SNS. There are so many things you can build on top of this.

You can even take it a step further from here and literally deploy to ECS or even create new stacks with CloudFormation. This way your entire AWS estate stays fresh.

However, before I let you all go off and wreak havoc on your Slack channels, I need to stress this one more time: DO NOT print out secret environment variables in your Lambda functions (or your buildspec)! If you did, delete the CloudWatch logs immediately and rotate the secret in Parameter Store.