BuildKite is an excellent tool for a growing SaaS product to set up continuous integration and delivery. I’ve recently configured it to run inside Kubernetes on Google Cloud, pushing images to AWS ECR and rolling them out into a completely separate Kubernetes cluster; here’s how.

Kubernetes Build Cluster

First, you need a place where BuildKite will actually execute the builds; Kubernetes is an excellent choice, and there’s no easier place to host a cluster than Google Cloud.

Head over to the Google Cloud Console, click on Kubernetes Engine, and create a cluster.

You can configure the cluster however you’d like, but I’ll make one recommendation here: because build clusters generally have lower availability guarantees than your main product, it’s likely acceptable to enable some of the more “unstable” features that help with the cost and maintenance of the cluster. I’m thinking:

  • Automatic Node Upgrades? Turn it on.
  • Automatic Node Repair? Go for it.
  • Preemptible Nodes? Why not? Unless you have really long-running builds, the probability of a build failing due to a node going down is low, and if it does happen, a new node will be created and that node (or another node in the cluster) will grab the job when it re-runs. To save 70% or more, it’s a no-brainer.
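
If you prefer the command line, here’s a sketch of an equivalent gcloud invocation; the cluster name, zone, and machine type are just example values:

gcloud container clusters create build-cluster --zone us-central1-a --num-nodes 3 --machine-type n1-standard-1 --preemptible --enable-autoupgrade --enable-autorepair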

Configure BuildKite

Let’s get some configuration into the cluster. Sign up for a BuildKite account, create an organization, then head over to the Agents tab. On the right, you’ll see an Agent Token section; reveal it and copy the token.

Agent Token

Head back to Google Cloud and connect to the cluster. I’ll just use the Cloud Shell for this demo, but you can also connect locally if you’ve got it already set up. Let’s make sure we can connect to the cluster.

Cluster Connection
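
For reference, fetching credentials and verifying connectivity looks something like this; substitute your own cluster name and zone:

gcloud container clusters get-credentials build-cluster --zone us-central1-a
kubectl cluster-info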

Then let’s create a secret for the BuildKite token.

kubectl create secret generic buildkite --from-literal token=YOUR_TOKEN_HERE
> secret "buildkite" created

Configure GitHub (Private Repos Only)

If you’re building private repositories, you’ll need to authenticate the agents to be able to clone the repositories during the builds.

From the Cloud Console, generate a new SSH keypair.

SSH KeyGen
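
For reference, the keypair can be generated like so; the github-key filename is just what the later commands in this guide assume:

ssh-keygen -t rsa -b 4096 -f ./github-key -N "" -C "buildkite-agent"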

cat out the content of the .pub file that was created, copy it, then head over to GitHub and add it to your account.

Github SSH

Finally, let’s create a new Kubernetes secret with this information.

kubectl create secret generic ssh --from-file id_rsa=./github-key --from-file id_rsa.pub=./github-key.pub
> secret "ssh" created

Configure AWS

In this guide, we are going to push our images up to AWS ECR. If you use Google Container Registry instead (and why wouldn’t you?), you can skip this part, but we aren’t going to cover that piece of authentication.

We’ll need authentication credentials inside the build cluster. Provision those on AWS IAM; if you’re doing that from the CLI, a minimal sketch might look like this (the user name is arbitrary, and the PowerUser policy grants push and pull access to ECR):
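
aws iam create-user --user-name buildkite-agent
aws iam attach-user-policy --user-name buildkite-agent --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPowerUser
aws iam create-access-key --user-name buildkite-agent

Then create a new secret with the resulting keys.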

kubectl create secret generic aws --from-literal region=us-east-1 --from-literal access-key-id=YOUR_ACCESS_KEY_ID --from-literal secret-access-key=YOUR_SECRET_ACCESS_KEY

Configure the Target Clusters

If your goal is to deploy your application to another Kubernetes cluster, we’ll need to authenticate the BuildKite agents so they can roll out the new images they build. We’ll use a service account for this.

I’ll assume you already have kubectl configured to contact the target cluster, and you already have all the deployments and such set up for the application you’re building.

Inside your application clusters (not the build cluster, unless you’re modifying this guide to host both your builds and application in the same cluster), create a new service account.

kubectl create serviceaccount buildkite
> serviceaccount "buildkite" created
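
If the target cluster has RBAC enabled, the service account also needs permission to update deployments. A minimal sketch, assuming everything lives in the default namespace (the built-in edit role is broader than strictly necessary, so scope it down if you prefer):

kubectl create rolebinding buildkite-edit --clusterrole=edit --serviceaccount=default:buildkite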

Run a kubectl get secret and find the secret that looks like buildkite-token-mpzdr (the suffix is random). We need the token for this account; if you have jq installed, you can just issue this command, otherwise copy-paste the appropriate field from the JSON it outputs and run it through base64 -d (base64 -D on macOS).

kubectl get secret/buildkite-token-mpzdr -o=json | jq '.data.token' -r | base64 -d
> eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2...

Next, let’s create another secret on the build cluster.

kubectl create secret generic clusters --from-literal token="PASTE_TOKEN_HERE" --from-literal url="URL_OF_APPLICATION_KUBE_API_SERVER"
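
If you’re not sure of the API server URL, you can pull it out of your kubeconfig while pointed at the target cluster:

kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'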

Done.

Creating Agents

We’re now ready to create some agents inside the cluster.

BuildKite provides a basic Docker image which runs their agents. However, it doesn’t include a lot of the devops tooling that you might need, especially if you’re deploying to Kubernetes or using AWS.

I’ve published a Docker image based on the BuildKite agent image which includes the AWS CLI and kubectl; you can pull it as mikehock/buildkite-agent-aws-kube. If you need other devops tools, check out the Dockerfile for that image and modify it appropriately.
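
For a sense of what that entails, here’s a sketch of such a Dockerfile; the package names assume an Alpine base image and the kubectl version is only an example, so adjust both for your needs:

FROM buildkite/agent:3

# Install the AWS CLI and kubectl on top of the stock agent image.
RUN apk add --no-cache python3 py3-pip curl \
    && pip3 install awscli \
    && curl -Lo /usr/local/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/v1.10.0/bin/linux/amd64/kubectl \
    && chmod +x /usr/local/bin/kubectl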

Let’s create a deployment. Back on our build cluster, we’ll use the following deployment template. I’ve annotated some of the important parts with comments.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: buildkite-agent
  name: buildkite-agent
spec:
  replicas: 3
  selector:
    matchLabels:
      app: buildkite-agent
  template:
    metadata:
      labels:
        app: buildkite-agent
    spec:
      containers:
      - env:
        # We inject the AWS authentication as standard AWS environment variables here.
        - name: AWS_DEFAULT_REGION
          valueFrom:
            secretKeyRef:
              key: region
              name: aws
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              key: access-key-id
              name: aws
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              key: secret-access-key
              name: aws
        # Here's that buildkite build token.
        - name: BUILDKITE_AGENT_TOKEN
          valueFrom:
            secretKeyRef:
              key: token
              name: buildkite
        # Here's our authentication to the kubernetes cluster. We will reference these environment
        # variables later when we are making kubectl commands.
        - name: KUBE_TOKEN
          valueFrom:
            secretKeyRef:
              key: token
              name: clusters
        - name: KUBE_URL
          valueFrom:
            secretKeyRef:
              key: url
              name: clusters
        # You can replace this image with whatever you'd like if your needs require other devops
        # or testing environment tooling.
        image: mikehock/buildkite-agent-aws-kube:latest
        # Here, we define a postStart lifecycle hook to log us into AWS ECR. This command could be
        # run during the build pipeline, but you'd need to make sure it's run on the _same agent_
        # that is actually doing the push, which isn't always a guarantee with BuildKite unless
        # you include it in the same pipeline step as the push. Instead, we just run it here when
        # the container starts.
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - -c
              - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
        name: buildkite-agent
        # In order to do docker builds, we need to grant a privileged security context to this 
        # container.
        securityContext:
          privileged: true
        volumeMounts:
        # Here, we mount our ssh key into the container
        - mountPath: /root/.ssh/id_rsa
          name: ssh-keys
          subPath: id_rsa
        - mountPath: /root/.ssh/id_rsa.pub
          name: ssh-keys
          subPath: id_rsa.pub
        # We also mount the docker binary and the docker socket, which the container is allowed to
        # access because it is running in privileged mode.
        - mountPath: /usr/bin/docker
          name: docker-binary
        - mountPath: /var/run/docker.sock
          name: docker-socket
      volumes:
      - hostPath:
          path: /usr/bin/docker
          type: ""
        name: docker-binary
      - hostPath:
          path: /var/run/docker.sock
          type: ""
        name: docker-socket
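      # defaultMode 256 is decimal for octal 0400: the key is readable only by its owner,
      # which ssh requires before it will use a private key.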
      - secret:
          defaultMode: 256
          secretName: ssh
        name: ssh-keys

If that all looks good, save it as deploy.yaml, run kubectl create -f deploy.yaml, and you should see it spin up three pods. You can confirm everything is wired up by checking out the BuildKite UI for your agents.

Agents
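
You can also check on the pods from the command line:

kubectl get pods -l app=buildkite-agent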

Setting Up the Pipeline

Almost done. Finally, we just need to configure the build pipeline. You can set this up however you’d like, but I’ll attach an example pipeline here so you can see how we interact with AWS ECR and Kubectl.

steps:
  - command: "docker build -t myapp:$BUILDKITE_COMMIT . && docker tag myapp:$BUILDKITE_COMMIT AWS_ACCOUNT_NUMBER.dkr.ecr.us-east-1.amazonaws.com/myapp:$BUILDKITE_COMMIT && docker push AWS_ACCOUNT_NUMBER.dkr.ecr.us-east-1.amazonaws.com/myapp:$BUILDKITE_COMMIT"
    label: "build"

  - wait

  - block

  - command: "kubectl --server $KUBE_URL --token=$KUBE_TOKEN --insecure-skip-tls-verify=true set image deploy/myapp myapp=AWS_ACCOUNT_NUMBER.dkr.ecr.us-east-1.amazonaws.com/myapp:$BUILDKITE_COMMIT"
    label: "deploy"

  - wait

  - command: "kubectl --server $KUBE_URL --token=$KUBE_TOKEN --insecure-skip-tls-verify=true rollout status deploy/myapp -w"
    label: "watch"

Note one issue with this setup: we aren’t telling kubectl about a valid TLS certificate, so it’s skipping HTTPS validation entirely. I’ll leave configuring that properly as an exercise for the reader, but you should definitely look into it for production deployments.
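
One way to approach it, as a sketch: export the target cluster’s CA certificate, store it in the clusters secret alongside the token (you’ll need to delete and recreate the secret), mount it into the agent containers, and point kubectl at it with --certificate-authority instead of --insecure-skip-tls-verify. Run the first command below while kubectl is pointed at the target cluster, and the secret creation against the build cluster:

kubectl config view --raw --minify -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d > ca.crt
kubectl create secret generic clusters --from-literal token="PASTE_TOKEN_HERE" --from-literal url="URL_OF_APPLICATION_KUBE_API_SERVER" --from-file ca.crt=./ca.crt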

That’s It!

Head over to your BuildKite pipeline and initiate a new build. You should see it build the Docker image and push it to ECR; once you click the button to pass it through the block, it should roll the new image out to the cluster.

And It’s Cheap

BuildKite is a free service for basic projects. Moreover, because we used preemptible nodes and Google charges nothing for the Kubernetes master node, this setup will only cost us around $22/month. For a complete CI pipeline, with 3 vCPUs enabling 3 concurrent builds (at least!) and almost 12GB of memory? It’s a steal.

Additionally, you get complete control over your build infrastructure. If you have projects that need really large CPUs or lots of memory, you can configure that.

You can even target specific pipelines or build steps to only run on nodes that are configured with those attributes; hell, you can run the build agents inside your application cluster, create a separate preemptible instance group, target the agents to run on that group, and keep it all under one roof. The sky’s the limit!
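
BuildKite routes work to specific agents using queue tags. As a sketch: give the specialized agents a queue tag (for example, by setting the BUILDKITE_AGENT_TAGS environment variable to "queue=high-mem" in a deployment pinned to your beefy node pool), then target that queue from a pipeline step; the queue name here is just an example:

steps:
  - command: "make stress-test"
    label: "big build"
    agents:
      queue: "high-mem"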