Giving CircleCI a focus on continuous deployment

By Sam Bryant
Published in Cloudnative.ly, Feb 24, 2020


We have recently started to use CircleCI workflows to create continuous deployment pipelines for some of our projects.

From the heading of this article, you might have guessed that we felt CircleCI was lacking in some areas out of the box. From a purely technical standpoint, that might not be entirely fair. Workflows in CircleCI allow us to define a good continuous deployment workflow. The image below shows a typical example.

So, what is wrong? Why don’t we think there is enough focus on continuous deployment, or at the very least, high performing continuous deployment?

To answer that question, we need to set some context: what do we mean by “high performing continuous deployment”?

What is high performing continuous deployment?

Recovering quickly is important in more than just the software industry. Take the Audi Le Mans domination from 2000–2014 as an example. At their peak, they could replace a gearbox in just 5 minutes, while it took other teams hours.

A good starting point is to look at some work done by Jez Humble and Martin Fowler. They talk about three questions that they use as a “Continuous Integration Certification Test”.

“when the build fails, it’s usually back to green within ten minutes”

To summarise Jez Humble’s test: he asks his audience to put their hands up and to keep them up while he asks a few questions. His final question is: keep your hand up if “when the build fails, it’s usually back to green within ten minutes”.

To hit the goal of fixing builds that quickly, we need more than just technology. It is about a way of working, which Jez Humble discusses in more depth in some other articles. When talking about pipelines that are triggered on every change in version control, he states:

“If this initial commit stage fails, the problem must be fixed immediately — nobody should check in more work on a broken commit stage.”

To add some more weight to the focus on process, some people, like James Shore, talk about CI being an attitude rather than a tool. When you look at things that way, it really does make sense: the tools help us streamline a CI process, but they don’t change what we are trying to achieve and what our priorities are.

This is where things start getting interesting. We know that we should be fixing our builds quickly, and we know that in order to do that the team needs to focus its workflow on prioritising build failures over feature delivery. What we don’t have an answer to yet is: how do we enable this behaviour?

To link this back to our title, CircleCI has some default views to let us see the state of our pipelines. You can also use CircleCI insights (if you are a paid user) to get an even more detailed and helpful view.

These views are not bad. They provide some really good information, but do they help us focus on team priorities?

The core of the issue is human behaviour. It doesn’t take much for us to become complacent: if we see red on a dashboard regularly, we stop reacting to it. The default views show build history, which means that even if the latest build is green, we are still seeing the red from the previous runs.
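To make the difference between a history view and a current-state view concrete, here is a minimal Python sketch. It is only an illustration: the `Run` record, its fields, and the status strings are hypothetical names chosen for this example, not CircleCI's data model or our dashboard's code. The idea is to reduce a run history to the most recent run per workflow, so that old failures stop colouring the display.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One workflow run. Field names are hypothetical, for illustration only."""
    workflow: str
    number: int   # higher number means a more recent run
    status: str   # assumed values: "success" or "failed"


def latest_status(runs):
    """Return {workflow name: status of its most recent run}."""
    latest = {}
    for run in runs:
        current = latest.get(run.workflow)
        if current is None or run.number > current.number:
            latest[run.workflow] = run
    return {name: run.status for name, run in latest.items()}


history = [
    Run("deploy", 1, "failed"),
    Run("deploy", 2, "failed"),
    Run("deploy", 3, "success"),
    Run("docs", 1, "success"),
    Run("docs", 2, "failed"),
]

# A history view shows three red runs here; the current state has only one.
print(latest_status(history))
```

With this reduction, "deploy" shows as green despite its two earlier failures, and only "docs" demands attention.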

What is the solution? We have open-sourced a dashboard.

Why do we think this helps?

  • It has been developed with a focus on the behaviours that we have been discussing in the article.
  • Clear red boxes when a workflow fails.
  • Visible states to show if a pipeline is currently running or on hold (requiring manual approval).
  • Promotes swarming on failing builds.

Why is swarming on failing builds beneficial?

  • The team collectively solve problems and own quality.
  • Build shared knowledge.
  • Remove single points of failure, such as solutions living in the head of a single team member.

A word on running/on hold jobs

For your teams to know exactly when they should drop everything and fix the build, we need to provide some extra information.

What if a workflow is red but currently running? Someone may have already pushed a fix that resolves the issue. Don’t double the effort: let the current run finish before context switching.

CircleCI has out-of-the-box options to put pipelines on hold for a manual approval (continuous delivery rather than deployment). In order to drive value, you need these manual approvals to be as quick as possible. Our tool will highlight any workflows that are currently pending approval so that you can prioritise getting approval and getting your features out to customers.
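The decisions described in this section can be sketched as a small function. This is a rough illustration under stated assumptions, not the dashboard's actual implementation: the function name `tile_action`, its parameters, the status strings, and the returned labels are all hypothetical.

```python
from typing import Optional


def tile_action(last_completed: str, current: Optional[str] = None) -> str:
    """Suggest what the team should do for one workflow.

    last_completed: status of the last finished run ("success" or "failed").
    current: status of an in-flight run ("running" or "on_hold"), or None.
    All status strings and labels are assumptions made for this sketch.
    """
    if current == "on_hold":
        # Waiting on a manual approval: chase the approver.
        return "approve"
    if current == "running":
        # Red but running: a fix may already be on its way, so wait.
        return "wait" if last_completed == "failed" else "running"
    if last_completed == "failed":
        # Red and idle: drop everything and swarm on the build.
        return "swarm"
    return "ok"


print(tile_action("failed"))             # red, nothing running
print(tile_action("failed", "running"))  # red, but a new run is in flight
print(tile_action("success", "on_hold")) # pending manual approval
```

Keeping "wait" separate from "swarm" is the point of the extra states: the team only context switches when the failure is both current and unattended.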

What next?

We built this tool to solve an immediate requirement and we think it could be useful to others as well. What else do you want from your dashboards? We would love to hear any feedback and have discussions on how a simple dashboard can help drive behaviour change to empower high performing continuous deployment.
