Introducing Painless Pipelines

Tom Oram
Published in Cloudnative.ly
Jul 24, 2023

Continuous Delivery pipelines exist to make the software development process easier. They aim to improve flow and simplify the process of getting ideas to production. They remove many of the pains of doing this process manually. However, depending on how they are implemented, they can introduce new pains within the team. I have been considering what can cause these new pains and how to avoid them.

[Image: summary of the pains, modes of interaction, principles, and practices discussed in this article.]

What is a Painless Pipeline?

A painless pipeline does not create any new pains. It delivers the promised benefits of CD pipelines without adding cognitive load for the developers on the team.

Typical Pipeline Pains

Some of the common pains I have seen in pipelines are:

  • Hard to debug: “Why did it fail?”
  • Hard to make changes: you have to keep pushing fixes until it works
  • Slow feedback: wait minutes or more to see if a config change worked

Often, these pains arise when pipelines are built with a focus on only the happy path (a “just get it done” mindset), or when they are built by people other than the team that uses them (e.g. DevOps engineers, contractors, consultants).

Modes of Interaction

There are three different ways a software team interacts with a pipeline. These are:

  • When it is green (all jobs have successfully passed)
  • When it is red (a job has failed)
  • When a pipeline change is required (e.g. adding a new job)

Let’s consider each in turn.

When it is green

When a pipeline is green, there is generally little interaction with it. It is usually a sign that everything is well and the team can continue their work.

The only problem that might occur with a green pipeline is when something is actually broken; a pipeline that tells you everything is okay when a failure has happened behind the scenes is creating pain, so your pipelines should always aim to fail if there is a problem.
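One common way a pipeline ends up green while something is broken is a wrapper script that swallows a failing exit code. The sketch below is a minimal illustration of the opposite behaviour, assuming steps are plain commands; the `run_steps` name is hypothetical and not part of any pipeline tool.

```python
import subprocess
import sys


def run_steps(commands):
    """Run each command in order; stop at the first failure and return its exit code."""
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Report which step failed instead of carrying on as if nothing happened.
            print(
                f"Step failed with exit code {result.returncode}: {command}",
                file=sys.stderr,
            )
            return result.returncode
    return 0
```

Returning the failing step's own exit code means the job that calls this wrapper turns red for the same reason the step did.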

When it is red

When a pipeline is red, then some action needs to be taken. If the pipeline clearly points to the problem and how to fix it, then it is helping us. However, if the failure sends us off on a journey of searching and confusion, it creates more pain.

There are different types of failure that can cause your pipelines to turn red. These include:

  1. Genuine failures
  2. Environment failures
  3. Transient failures

In the case of a genuine failure, it helps if the pipeline clearly tells us what went wrong. The same applies to environment failures; however, this may need to be supported with additional tooling outside of the pipeline (dashboards, alerts, etc.). Finally, we want to stamp out transient failures altogether; if we have any jobs that “just fail sometimes”, they need to be treated as broken and fixed.

When pipeline changes are being made

The final interaction we have with our pipelines is updating them — when we need to add to them, change them, or fix them. If making these changes is unpredictable and frustrating, it is another pain source. An example is when you need to make a change but don’t know if it has worked until you have committed it, pushed it, and waited for the pipeline to run it — only for it to fail with a trivial mistake.

For pipeline changes to be painless, the config should be easy to work with and quick to test.

Making Pipelines Painless

To make your pipelines painless, I have come up with four principles, and four practices to help you realise those principles. How you realise these might vary depending on your tools, but I hope the ideas apply to them all.

The Principles

  • Treat your pipeline as a software product
  • Design your pipelines to be easy to use
  • Failure is a feature
  • The developers are also the users

The Practices

  • Be defensive
  • Be atomic
  • Be expressive
  • Be reusable

Now, let’s look at each of these in turn.

The Principles

The principles are a mindset. They don’t aim to tell you how to do anything, but they guide the decisions you make.

Treat your pipeline as a software product

Too often, pipelines are developed with less care than the thing they are deploying. They are cobbled together until they work, then forgotten about until they break or need updating. There are a few reasons for this; for example, the application developers may not have expertise with these tools, or there may be pressure to deliver user-facing features faster. Often, a “DevOps” engineer or team is brought in to create them, but this creates silos and reduces team ownership.

This principle is about treating your pipeline projects like your software projects. Design them well, refactor them, test them, and ensure that they are implemented to a standard.

Design your pipelines to be easy to use

The people who interact with the pipeline (primarily the development team) are the pipeline users. Make the pipeline easy and informative to use; make it tell you what has failed, what is wrong, and how to fix it.

Failure is a feature

Pipelines are often built with a focus on the happy path. They work well when green but don’t provide a great user experience when they fail. However, failure is an important feature of a pipeline; it is the one we interact with the most, so we need it to help us. Make failures helpful and informative.

The developers are also the users

One unique thing about pipelines is that the developers who build them are likely also the users. That means that changing and developing them is part of the users' experience. Therefore, we should treat the ability to maintain the pipeline as one of its features. Making changes should be easy, predictable, and as safe as possible.

The Practices

The principles above provide an idea of how we should treat our pipelines. The following practices aim to turn them into concrete actions.

Be Defensive

When the pipeline performs a task, it should explicitly check for any problems and report clear errors if anything is incorrect.

This practice helps us when we want to make changes to the pipeline by quickly informing us what went wrong. A pipeline should give a descriptive error when the task fails. Here are some examples of why a pipeline might fail:

  • An error in an artifact
    e.g. the code is broken
  • An environment error
    e.g. the deployment environment is broken
  • A pipeline config/scripting error
    e.g. you forgot to provide a required environment variable

When the pipeline fails for one of these reasons, the error should help you resolve the issue. It should do one of the following:

  • Describe the problem
    e.g. “The required AWS_ACCESS_KEY environment variable is missing.”
  • Redirect you to a place where you can get more information
    e.g. “The production environment is not reachable; please check http://monitoring-dashboard.com”
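As a rough illustration of the first kind of message, a step could check its required configuration up front and name exactly what is missing. This is a minimal sketch; `require_env` and `MissingConfigError` are hypothetical names introduced here, not part of any library.

```python
import os


class MissingConfigError(Exception):
    """Raised when required pipeline configuration is absent."""


def require_env(names, env=None):
    """Return the requested variables, or fail naming every missing one."""
    env = os.environ if env is None else env
    # Treat unset and empty variables the same way: both are unusable.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise MissingConfigError(
            "Required environment variable(s) missing: " + ", ".join(missing)
        )
    return {name: env[name] for name in names}
```

Calling `require_env(["DB_HOST", "DB_USER"])` at the top of a step reports all missing variables in one go, rather than failing on the first one and forcing another push to find the next.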

Example:

In this example, we have a bash script as part of the pipeline job which substitutes some values into a Google AppEngine app.yaml.

#!/usr/bin/env bash

set -euo pipefail
# …
sed "s/#DB_HOST#/${DB_HOST}/g" app.yaml.tpl | \
sed "s/#DB_USER#/${DB_USER}/g" | \
sed "s/#DB_PASSWORD#/${DB_PASSWORD}/g" | \
sed "s/#DB_DATABASE#/${DB_DATABASE}/g" >app.yaml
# …

# gcloud command to deploy the app

When pushed, the pipeline reports the following error:

500: An unknown error occurred

Thanks Google!

Eventually, we discovered that the app.yaml.tpl file had an extra #PROJECT_ID# placeholder that had not been substituted. We could have saved time searching for this issue if the script had also checked for any placeholders that had not been replaced and reported an error like so:

Error:
app.yaml contains an unsubstituted template variable #PROJECT_ID#
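A check like this could be sketched in a few lines. The version below assumes placeholders always take the `#NAME#` form seen in the template (upper-case letters, digits, and underscores); the function names are hypothetical.

```python
import re

# Assumption: placeholders look like #NAME#, e.g. #DB_HOST# or #PROJECT_ID#.
PLACEHOLDER = re.compile(r"#([A-Z][A-Z0-9_]*)#")


def find_unsubstituted(text):
    """Return the names of placeholders still present, in order of first appearance."""
    seen = []
    for name in PLACEHOLDER.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


def check_rendered(text, filename="app.yaml"):
    """Raise ValueError naming each leftover placeholder; return None if clean."""
    leftover = find_unsubstituted(text)
    if leftover:
        names = ", ".join(f"#{name}#" for name in leftover)
        raise ValueError(
            f"{filename} contains an unsubstituted template variable {names}"
        )
```

Running `check_rendered` on the generated file before the deploy command would turn the opaque 500 into an error raised by the pipeline itself, before anything is sent to the deployment target.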

Be Atomic

Each step should be easy to run in isolation when given a set of inputs and expected outputs.

  • Each step should be a script, program, or configuration that can be run locally on a developer's machine.
  • Each step should have stubs, mocks, or whatever is required to allow it to be run and verified locally (a “unit test”).
  • All steps can be run as a part of a test suite.
  • If a step cannot be run locally, it should be run as a stand-alone step in a separate test pipeline.

Example:

$ export DB_HOST=localhost
$ export DB_USER=appuser
$ export DB_PASSWORD=secretsecret
$ export DB_DATABASE=customers
$ ./substitute-templates --template app.yaml.tpl --output app.yaml --var-source environment

$ ./run-pipeline-unit-tests
Tag and release tests
✅ creating git tag succeeds
✅ creating git tag fails when tag already exists
✅ creating git tag fails when tag has incorrect format

AppEngine Deploy tests
✅ substitute template vars in app.yaml
✅ substitute template vars in app.yaml fails when env var is missing
❌ substitute template vars in app.yaml fails when env var is not substituted
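To give a feel for what sits behind a test list like this, here is a hedged sketch of the tag checks. It keeps the validation logic separate from the actual `git` call so it can run locally with no repository. The `v1.2.3` tag format and all names below are assumptions made for illustration; the article does not specify them.

```python
import re

# Assumption: release tags look like v1.2.3.
TAG_PATTERN = re.compile(r"v\d+\.\d+\.\d+")


def validate_tag(tag, existing_tags):
    """Return the tag if it is well-formed and unused; raise ValueError otherwise."""
    if not TAG_PATTERN.fullmatch(tag):
        raise ValueError(
            f"Tag '{tag}' has an incorrect format; expected vMAJOR.MINOR.PATCH"
        )
    if tag in existing_tags:
        raise ValueError(f"Tag '{tag}' already exists")
    return tag


def expect_failure(tag, existing_tags, fragment):
    """Test helper: assert that validation fails with a message containing fragment."""
    try:
        validate_tag(tag, existing_tags)
    except ValueError as exc:
        assert fragment in str(exc), str(exc)
    else:
        raise AssertionError(f"expected {tag!r} to be rejected")


def test_creating_tag_succeeds():
    assert validate_tag("v1.4.0", existing_tags={"v1.3.0"}) == "v1.4.0"


def test_creating_tag_fails_when_tag_already_exists():
    expect_failure("v1.3.0", {"v1.3.0"}, "already exists")


def test_creating_tag_fails_when_tag_has_incorrect_format():
    expect_failure("release-1", set(), "incorrect format")
```

Because the existing tags are passed in as an input, the same function can be fed the real tag list in the pipeline and a stub set in the tests.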

Be Expressive

Use languages and tools that make coding and error reporting easy and clear.

  • Use an appropriate programming language.
    i.e. don’t resort to bash or PowerShell because it’s quick.
  • Create/use tools that can be reused.
    e.g. a templating tool that performs validation.
  • Practice Test-Driven Development (TDD).

Example:

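Rather than embedding a shell script in the step:

step:
  name: prepare app.yaml
  type: shell
  command: |
    #!/usr/bin/env bash
    set -euo pipefail

    # Substitute env vars
    sed "s/#DB_HOST#/${DB_HOST}/g" app.yaml.tpl | \
    sed "s/#DB_USER#/${DB_USER}/g" | \
    sed "s/#DB_PASSWORD#/${DB_PASSWORD}/g" | \
    sed "s/#DB_DATABASE#/${DB_DATABASE}/g" >app.yaml

    # Fail if any placeholder was left unsubstituted
    # (grep exits 0 when it finds a match, so the check must be inverted)
    if grep -nE '#[A-Z_]+#' app.yaml; then
      echo "Error: app.yaml contains an unsubstituted template variable" >&2
      exit 1
    fi

call a dedicated tool written in a more expressive language:

step:
  name: prepare app.yaml
  type: shell
  command: python substitute-templates.py --template app.yaml.tpl --output app.yaml --var-source environment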
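The article does not show the contents of `substitute-templates.py`, so the following is only a guess at what a minimal version might look like. It accepts the three flags used in the step and assumes the `#NAME#` placeholder form; everything else is an assumption.

```python
import argparse
import os
import re
import sys

# Assumption: placeholders look like #NAME#, as in app.yaml.tpl.
PLACEHOLDER = re.compile(r"#([A-Z][A-Z0-9_]*)#")


def render(template, values):
    """Replace each #NAME# with values[NAME]; raise KeyError for unknown names."""
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


def main(argv=None, env=None):
    """Parse the flags, render the template, and return a process exit code."""
    parser = argparse.ArgumentParser(
        description="Substitute #NAME# placeholders in a template."
    )
    parser.add_argument("--template", required=True)
    parser.add_argument("--output", required=True)
    parser.add_argument("--var-source", choices=["environment"], default="environment")
    args = parser.parse_args(argv)

    values = os.environ if env is None else env
    with open(args.template, encoding="utf-8") as handle:
        template = handle.read()
    try:
        rendered = render(template, values)
    except KeyError as exc:
        # Name the placeholder so the fix is obvious from the job log.
        print(
            f"Error: {args.template} uses #{exc.args[0]}# but no such variable is set",
            file=sys.stderr,
        )
        return 1
    with open(args.output, "w", encoding="utf-8") as handle:
        handle.write(rendered)
    return 0

# As a script, the entry point would be: raise SystemExit(main())
```

Because every placeholder found in the template must resolve to a value, a forgotten variable fails the step with a named error and no output file is written. Using a function as the replacement also avoids the escaping problems `sed` has when a value contains `/` or `&`.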

Be Reusable

Invest in building tools you can use again.

  • If you are investing more time in building tools right, build them to be reused.
  • Reuse your tools in different steps, pipelines and projects.
  • Make them available to others (via open source if possible).

Example:

step:
  name: prepare app.yaml
  type: substitute-templates
  options:
    template-file: app.yaml.tpl
    output: app.yaml
    var-source: environment

Conclusion

Your pipelines are a critical part of your development process; therefore, treat them as you would treat your software projects. They should not be a string of minimum-viable scripts, but rather tools that support the people interacting with them.

The ideas I have presented in this article are not the complete picture. Other things matter in pipelines too, such as ensuring that the same artifacts are used throughout, making sure pipelines don’t take too long, and ensuring that jobs are idempotent. The practices and principles I have presented should be added on top of these common practices to enhance the user experience.

Also, pipelines alone do not solve all the problems. Having good observability (including logs, dashboards, alerts, and traces where appropriate) will also contribute to the painless experience.

Passionate about all aspects of software. Engineer at Armakuni.