Meet Badger, the Slack bot

As soon as Curated’s stack started to evolve, we realized we needed to expand on the visibility of our deployments. Even with a small engineering team, we didn’t have a consistent way of notifying engineers that their changes were being deployed into production.

Not a problem when we’re hacking away in a loft apartment; very big problem as we continue to scale.

It starts with GitLab

GitLab was our first choice for extensible, customizable software that we could use to manage our continuous integration and deployment pipelines. It offers the proper hooks and flexibility to work with our complicated Kubernetes system and the proper APIs to remotely communicate with it.

Without going into extreme detail about our GitLab build and deployment pipeline, our code goes through a series of stages that contain jobs to build the application, run automated unit and integration tests, and then finally deploy to staging and production environments. The first stages are automated: as one group of jobs completes, the next stage is triggered. Most jobs in each stage run in parallel, but a failed job in any stage prevents progression into the subsequent stages.

The final automated step is deploying to staging. All commits to master are continuously deployed to our staging environment as long as they build successfully and pass all of our automated tests. From here, any member of our engineering team can promote a deployable application to production with a single click of a button. Herein lies the problem that we sought to solve. We wanted to create a low-friction, high-visibility sign-off on deployments.

Enter Slack Bot

Badger, the bot 

Enter Slack Bot, also known as Badger, one of the many animal mascots at Curated. We chose to manage deployments via Slack because it was the most active place for our engineering team. Before we get into the details, the main question is: why not just use something off the shelf? The answer is security and flexibility. There are plenty of solutions for managing the deployment lifecycle of applications via Slack, but none offered the peace of mind of an application we host ourselves or the flexibility we’d need to add custom features.

Slack’s slash command integrations are flexible and easy to work with (most of the time). They offer rich building block tools to style messages and commands, and their interactive action layer makes it simple to communicate with niche integrations like ephemeral messages.

Our Slack Bot is a Node app built on Express. It works off a series of listeners, one for each slash command, plus a configuration for each of our deployable apps. Since each command runs its own code path, the app is easy to extend if we want to add other commands or interactions down the line. To start with, we built a deployment summary command.
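To make the one-listener-per-command idea concrete, here is a minimal sketch of a command registry. It is not Badger's actual code: the `registerCommand` and `dispatch` helpers and the reply text are hypothetical, and the Express wiring is left out so the example runs with plain Node. The `command` and `text` fields are the ones Slack includes in a slash command payload.

```javascript
// Hypothetical registry: each slash command gets its own handler,
// so adding a command never touches the others.
const handlers = new Map();

function registerCommand(name, handler) {
  handlers.set(name, handler);
}

// `payload` mirrors the parsed body of a Slack slash command request.
async function dispatch(payload) {
  const handler = handlers.get(payload.command);
  if (!handler) {
    return { text: `Unknown command: ${payload.command}` };
  }
  return handler(payload);
}

registerCommand('/deploy', async ({ text }) => ({
  text: `Deployment summary requested for ${text}`,
}));

registerCommand('/preview', async ({ text }) => ({
  text: `Preview requested for ${text}`,
}));
```

In an Express app, a single POST route could parse the request body and pass it to `dispatch`, sending the returned object back as JSON.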

The first iteration of the deployment summary worked via a series of promises chained from GitLab’s API. There are a couple of nuances to how we summarize the diff, but these are the main steps.

  1. Via a slash command, issue a deployment request to our Express app
  2. Fetch the latest commit on staging for the passed-in app
  3. Fetch the latest commit on production
  4. Summarize the diff in master pipeline builds between these commit SHAs
  5. Display each built pipeline to the user in our engineering channel
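The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the original implementation: the `gitlab` client and its `latestDeployedSha` and `pipelinesBetween` methods are hypothetical stand-ins for whatever GitLab API calls are actually made, and the sketch uses async/await rather than an explicit promise chain.

```javascript
// `gitlab` is a hypothetical client injected by the caller. Its two methods
// stand in for the real GitLab API requests, which are not shown here.
async function summarizeDeployment(app, gitlab) {
  // Steps 2 and 3: latest commit on staging and on production.
  const [stagingSha, productionSha] = await Promise.all([
    gitlab.latestDeployedSha(app, 'staging'),
    gitlab.latestDeployedSha(app, 'production'),
  ]);

  // Step 4: master pipelines built between the two SHAs.
  const pipelines =
    stagingSha === productionSha
      ? []
      : await gitlab.pipelinesBetween(app, productionSha, stagingSha);

  return { app, stagingSha, productionSha, pipelines };
}

// Step 5: turn the summary into text that can be posted to the channel.
function formatSummary({ app, pipelines }) {
  if (pipelines.length === 0) {
    return `${app}: nothing new on staging to deploy.`;
  }
  const lines = pipelines.map((p) => `- ${p.sha.slice(0, 8)} ${p.title}`);
  return [`${app}: ${pipelines.length} pipeline(s) ready to deploy`, ...lines].join('\n');
}
```

Injecting the client keeps the summary logic testable without network access, since a fake object with the same two methods can be passed in.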

This is where it stopped. We surfaced a link to the terminal pipeline on GitLab; the user could click deploy from there and away we go. The problem is that there is no feedback mechanism telling the Slack channel that a deployment was actually initiated. Not catastrophic, but it certainly becomes a problem if there are pressing releases. It was also a bit spammy if you just wanted to check whether something was “ready” to deploy, in that it had made it to staging.

This brought on the addition of interactive functionality. The interaction layer in Slack is a little thin in terms of data that can be sent back and forth, but it does offer you the option of passing stringified JSON, which turned out to be more than enough.

To trigger a deployment, a user enters the slash command:
/deploy application-name

We appended two CTAs to the bottom of the deployment summary using Slack’s Block Kit Builder. The deploy action is a double opt-in command that updates the original message to indicate that a deployment has been initiated, cancelled, or has expired. We expire deployments after 10 minutes so that nobody deploys older code by clicking an older message. When you click ‘Deploy’ on an eligible message, the bot calls GitLab’s API with the matching SHA of the terminal commit and triggers the deployment. The SHA is passed via a key on the ‘Deploy’ action.
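As a rough illustration of passing the SHA through the interaction layer, the sketch below builds a Block Kit actions block whose buttons carry stringified JSON in their `value` field, then reads it back and applies the 10-minute expiry. The payload field names (`app`, `sha`, `createdAt`), the `action_id` values, and the choice of Cancel as the second CTA are assumptions made for the example.

```javascript
const EXPIRY_MS = 10 * 60 * 1000; // deployments expire after 10 minutes

// Build the two CTAs. Slack button values are plain strings, so the
// deployment details travel as stringified JSON.
function buildDeployActions(app, sha, createdAt) {
  const value = JSON.stringify({ app, sha, createdAt });
  return {
    type: 'actions',
    elements: [
      {
        type: 'button',
        action_id: 'deploy', // hypothetical identifier
        style: 'primary',
        text: { type: 'plain_text', text: 'Deploy' },
        value,
      },
      {
        type: 'button',
        action_id: 'cancel', // hypothetical identifier
        style: 'danger',
        text: { type: 'plain_text', text: 'Cancel' },
        value,
      },
    ],
  };
}

// Parse the clicked action and decide whether the message is too old to act on.
function readDeployAction(action, now) {
  const { app, sha, createdAt } = JSON.parse(action.value);
  return { app, sha, expired: now - createdAt > EXPIRY_MS };
}
```

Because the SHA is captured when the message is built, a click always refers to the commit that was summarized, and the expiry check guards against acting on a stale message.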

This makes the rest of the channel aware that the deployment has been acted on and that GitLab responded successfully.

Still, we found the messages to come off a bit spammy. So, how can we check what’s staged for deployment without broadcasting this to the channel? Slack’s ephemeral messages.

To preview a deployment so that only they can see it, a user enters:
/preview application-name

Ephemeral messages are only visible to you and are displayed in any channel or conversation you trigger them from (threads not included, Slack limitation). This gives you the same summary you’d see for a deployment, but with CTAs to cancel or broadcast to the engineering channel. No more spam. Internal reflection only.
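The difference between a preview and a broadcast can come down to a single field in the slash command response. The sketch below is a simplified assumption of how that might look: `buildSummaryResponse` is a hypothetical helper, while `response_type` with the values `ephemeral` and `in_channel` is part of Slack's slash command response format.

```javascript
// Hypothetical helper: the same summary text, either visible only to the
// requesting user (preview) or posted to the whole channel (deploy).
function buildSummaryResponse(summaryText, { preview }) {
  return {
    response_type: preview ? 'ephemeral' : 'in_channel',
    text: summaryText,
  };
}

const previewResponse = buildSummaryResponse('consumer: 2 pipeline(s) ready', { preview: true });
const deployResponse = buildSummaryResponse('consumer: 2 pipeline(s) ready', { preview: false });
```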

Signing Requests

One of the main reasons I mentioned for wanting a homegrown solution was security. All of the off-the-shelf options required read access to our internal Slack channels and GitLab account, which we didn’t feel comfortable with. However, since we built the solution ourselves, we need to ensure our own security.

We don't store any specific API keys or OAuth tokens within our code base (thanks Vault), so for starters all of our process variables are injected at service start when our Slack bot deploys. This is a big first step in preventing any wayward keys from hanging out in the source code.

We also verify that every request coming into the command listener carries the proper signature headers, has a recent timestamp, and has a signature that matches the one we construct locally from our signing secret. Slack goes into detail about best practices in their documentation.
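A minimal sketch of that check, following the scheme Slack documents (an HMAC-SHA256 over `v0:timestamp:body`, compared against the `X-Slack-Signature` header, with the timestamp taken from `X-Slack-Request-Timestamp`), might look like this. The function names and the five-minute tolerance constant are choices made for this example, not taken from Badger's code.

```javascript
const crypto = require('crypto');

const MAX_AGE_SECONDS = 60 * 5; // reject requests older than five minutes

// Recreate the signature Slack would have sent for this timestamp and body.
function computeSignature(signingSecret, timestamp, rawBody) {
  const base = `v0:${timestamp}:${rawBody}`;
  const digest = crypto.createHmac('sha256', signingSecret).update(base).digest('hex');
  return `v0=${digest}`;
}

// `timestamp` and `signature` come from the request headers;
// `nowSeconds` is passed in so the check is easy to test.
function verifySlackRequest({ signingSecret, timestamp, rawBody, signature, nowSeconds }) {
  const age = Math.abs(nowSeconds - Number(timestamp));
  if (!Number.isFinite(age) || age > MAX_AGE_SECONDS) {
    return false; // missing, malformed, or stale timestamp
  }
  const expected = Buffer.from(computeSignature(signingSecret, timestamp, rawBody));
  const received = Buffer.from(String(signature));
  if (expected.length !== received.length) {
    return false; // timingSafeEqual requires equal-length buffers
  }
  return crypto.timingSafeEqual(expected, received);
}
```

The constant-time comparison avoids leaking how many leading characters of a forged signature were correct.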

Formatting signatures for slash commands

This is an implementation detail we came across while constructing the signatures. Make sure to URL-encode the request body in RFC1738 format; otherwise the signatures will not match for slash commands containing spaces. The difference is that RFC1738 encodes spaces as plus signs (encoded+strings+like+this), whereas the default uses percent-encoding (the%20string). Failing to do this results in a signature mismatch and an ominous dispatch_failed error, Slack’s default catch-all error message. From what I could find, this is not documented as of the date of this post.

Most libraries have something built in to handle this:

import qs from 'qs'

qs.stringify(req.body, { format: 'RFC1738' })
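To see why the format matters, the sketch below encodes the same parsed body both ways using only Node built-ins and signs each result. The `encodeBody` helper is a simplified stand-in written for this example, not the qs implementation, and the body values and secret are made up.

```javascript
const crypto = require('crypto');

// Simplified encoder: percent-encode each pair, then swap %20 for +
// when the RFC1738 format is requested.
function encodeBody(body, format) {
  return Object.entries(body)
    .map(([key, value]) => {
      const pair = `${encodeURIComponent(key)}=${encodeURIComponent(value)}`;
      return format === 'RFC1738' ? pair.replace(/%20/g, '+') : pair;
    })
    .join('&');
}

function sign(secret, timestamp, encodedBody) {
  return crypto
    .createHmac('sha256', secret)
    .update(`v0:${timestamp}:${encodedBody}`)
    .digest('hex');
}

const body = { command: '/deploy', text: 'consumer app' };
const percentEncoded = encodeBody(body, 'RFC3986');
const plusEncoded = encodeBody(body, 'RFC1738');

// The two strings differ only in how the space is written,
// which is enough to produce completely different signatures.
const signaturesMatch =
  sign('example-secret', '1000', percentEncoded) === sign('example-secret', '1000', plusEncoded);
```

If the framework exposes the raw, unparsed request body, signing that directly avoids the re-encoding step altogether.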

The evolution

As our dependency on our new tool grew, we began to think about how we could expand it into a full-on lifecycle tool for our applications (and beyond).

Rollbacks

Rolling back works like the deployment in reverse, grabbing the last three commits deployed to production and displaying them in the channel with individual CTAs. Useful for a quick revert of the latest deployment.

Locking / Unlocking

We added a production-grade Redis instance to our Slack bot deployment. This is a low-cost solution that gives us simple storage for things like whether an application is safe to deploy. It is very useful when we’re in the process of mutating production data or when a deployment on staging isn’t stable.

Locking deployments of our Consumer app
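A lock like this needs very little storage logic. The sketch below assumes a store with async `get`, `set`, and `del` methods, mirroring the Redis GET, SET, and DEL commands. The in-memory `MemoryStore`, the key format, and the stored fields are all hypothetical, chosen so the example runs without a Redis server.

```javascript
// In-memory stand-in for a Redis client, for illustration only.
class MemoryStore {
  constructor() {
    this.data = new Map();
  }
  async get(key) {
    return this.data.has(key) ? this.data.get(key) : null;
  }
  async set(key, value) {
    this.data.set(key, value);
  }
  async del(key) {
    this.data.delete(key);
  }
}

const lockKey = (app) => `deploy-lock:${app}`; // hypothetical key format

async function lockApp(store, app, user, reason) {
  await store.set(lockKey(app), JSON.stringify({ user, reason }));
}

async function unlockApp(store, app) {
  await store.del(lockKey(app));
}

// Returns the lock details, or null when the app is safe to deploy.
async function getLock(store, app) {
  const raw = await store.get(lockKey(app));
  return raw ? JSON.parse(raw) : null;
}
```

A deploy handler could call `getLock` first and refuse to proceed, showing who locked the app and why, whenever it returns a value.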

Restart a container

Deployed app is stuck in a bad state and needs to be bounced? Issue a restart. This triggers a special step we added via GitLab pipelines to redeploy the container in question.

Restarting a pod

Package info

Want info on a specific NPM package? Using Bundlephobia’s APIs, pass the package and get a summary of the size damage you’re about to incur.

Package info using Bundlephobia's APIs

Deal bucks

Deal bucks (named after Curated’s original name, deal.com) were born as a gimmicky reward system to make broader use of our shiny new Redis store. The feature is short and to the point, but it allows people to give visible kudos to other members of the team.

Future Plans

As Slack Bot continues to be the most talkative member of our engineering channel, we continue to think of ways we can expand on the bot’s functionality. Things like requiring sign off on deployments and interfacing further with GitLab would let us ensure a successful deployment was completed, not just initiated. Stay tuned for what comes next from Curated Labs.