The book Accelerate details the findings of four years of research on how DevOps affects various outcomes, such as software delivery tempo and stability, as well as the organizations’ profitability and market share. DevOps in this context means things like continuous delivery, automated tests, trunk-based development, and proactive monitoring of system health. It is quite clear that DevOps practices bring lots of benefits to organizations adopting them. The research findings are also in line with my own experience of DevOps.
The findings of the research are presented in the first part of the book (a bit more than half of it).
Software Delivery Performance
Many aspects of software development are hard to measure. I really liked that the authors have thought hard about what to measure in order to get objective yet useful metrics. A central concept in Accelerate is Software Delivery Performance. This is measured by four criteria: lead time, deployment frequency, time to restore service, and change fail rate.
Lead time is defined as the time from code being committed to code running successfully in production. Answer options included less than an hour, less than a day, and all the way up to more than six months. Deployment frequency had similar answer options: from on demand (multiple times a day) down to fewer than once every six months. These two criteria together make up the tempo, that is, how fast software is delivered.
Time to restore measures how long it takes to restore service when there is an outage or a disturbance. The change fail rate is the percentage of changes to the system (such as a deploy or a configuration change) that fail. These last two criteria define the stability. High software delivery performance means high tempo and high stability. One key finding of the research is that there is no trade-off between tempo and stability. Instead, organizations with high tempo also have high stability. The authors also found that organizations with high software delivery performance had better profitability, market share and productivity than low performers.
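To make the four criteria concrete, here is a minimal sketch of how they could be computed from a team's own deployment records. Note that this is not how the book gathers its data (the research uses survey answers with bucketed options); the `Deployment` record, its field names, and the choice of medians are my own assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one change going to production."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # only set for failed changes


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def delivery_metrics(deployments: list[Deployment], period_days: float) -> dict:
    """Compute tempo (lead time, frequency) and stability (fail rate, restore time)."""
    if not deployments:
        raise ValueError("need at least one deployment")
    lead_times = [hours(d.deployed_at - d.committed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        # Tempo
        "median_lead_time_hours": median(lead_times),
        "deploys_per_day": len(deployments) / period_days,
        # Stability
        "change_fail_rate": len(failures) / len(deployments),
        "median_time_to_restore_hours": median(restore_times) if restore_times else None,
    }


# Example: four deployments over two days, one of which failed.
base = datetime(2024, 1, 1, 9, 0)
example = [
    Deployment(base, base + timedelta(hours=1)),
    Deployment(base, base + timedelta(hours=2)),
    Deployment(base, base + timedelta(hours=3), failed=True,
               restored_at=base + timedelta(hours=3, minutes=30)),
    Deployment(base, base + timedelta(hours=6)),
]
metrics = delivery_metrics(example, period_days=2)
```

With this made-up data, the median lead time is 2.5 hours, there are 2 deployments per day, the change fail rate is 25%, and the median time to restore is half an hour.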
The authors identify 24 capabilities that drive improvements in software delivery performance. These are grouped into five categories: Continuous delivery, Architecture, Product and process, Lean management and monitoring, and Cultural. I won’t go through all capabilities here. Instead, I will concentrate on the ones I found most interesting.
Continuous delivery. Using version control for code is probably universally accepted as good practice. However, all forms of configuration and scripts should also be kept in version control. There should be automated tests, and they should be run at every commit. The authors also found that for automated tests to be effective, they should be created by the developers themselves. This is probably because when the developers own the automated tests, the designs will be more testable, and they will invest more effort in keeping them running. Also, the highest performing organizations use trunk-based development: branches have very short lifetimes (less than a day), and there are never any code freezes.
Architecture. The key aspect is that the systems are loosely coupled, meaning that you can easily test and deploy individual components or services without needing permission from, or coordination with, people outside the team. Interestingly, these characteristics could be found in all types of systems, including embedded systems and off-the-shelf software. Another driver of performance listed in this category is the development team's ability to decide which tools to use, rather than being limited to tools on a pre-approved list.
Lean practices. Working in small batches and limiting Work In Progress (WIP) is important. The same goes for having visual displays showing quality and productivity metrics, and proactively monitoring the system health. Finally, it is important to have a lightweight change management process for making changes to the production environment. Teams with no change approval process, or only peer review for changes, have higher software delivery performance. The use of a Change Advisory Board (CAB) is negatively correlated with delivery performance: CABs do not improve the stability of the system (and obviously not the tempo either).
Cultural. Adopting DevOps is positive for employee satisfaction, identity and engagement. One example is the employee Net Promoter Score (eNPS). It is calculated from the answer to the question “Would you recommend your team as a place to work to a friend or colleague?”. The answers are on a scale from 0 to 10, and people answering 9 or 10 are considered promoters. Employees in high-performing teams were about twice as likely to be promoters compared to employees in low-performing teams. Another example is that adoption of DevOps leads to lower rates of burnout.
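As a small illustration of how an eNPS figure is typically derived, here is a sketch using the standard Net Promoter convention, which the review above does not spell out: answers of 9 or 10 count as promoters, 0 to 6 as detractors, and 7 or 8 as passives, and the score is the percentage of promoters minus the percentage of detractors. The function name and the sample answers are made up.

```python
def enps(scores: list[int]) -> float:
    """Return the employee Net Promoter Score, in the range -100 to 100.

    Assumes the standard NPS buckets: 9-10 promoter, 7-8 passive, 0-6 detractor.
    """
    if not scores:
        raise ValueError("need at least one answer")
    if any(s < 0 or s > 10 for s in scores):
        raise ValueError("answers must be between 0 and 10")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Ten hypothetical answers: 5 promoters, 2 passives, 3 detractors.
answers = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
print(enps(answers))  # 20.0
```

Passives count towards the total number of answers but towards neither group, which is why a team can have mostly satisfied members and still end up with a modest score.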
The second part of Accelerate contains details on how the research was performed, and the reasons why it was done that way. All input came from survey responses. Some people think it is better to get data directly from systems, such as from version control systems, instead of relying on people answering questions.
However, the authors make good arguments for using surveys. Getting data out of diverse systems is very difficult. Also, finding out how much is under version control is not possible by only checking the version control system, since it does not know about files that are not there. Some things, such as team culture, cannot be measured from systems: you have to ask people directly whether they think, for example, that new ideas are welcomed.
When it comes to writing good survey questions, there are many pitfalls. The authors give good examples of bad and good questions when it comes to avoiding leading questions (“Was Napoleon short?” vs “How would you describe Napoleon’s height?”), loaded questions (“Where did you take your certification exam?” doesn’t give you a way to answer if you didn’t take the exam), multiple questions in one, and unclear language. Elsewhere in the book they also talk about how it is better to ask questions about concrete practices like “How often do you run integration tests?” instead of “Do you use continuous integration?”, since the term CI may be interpreted differently by different people. These types of questions also avoid the problem of answer options being considered “good” or “bad”.
All the questions (or rather statements) in the surveys use a Likert-type scale. You can answer anywhere from “Strongly disagree” (score 1) to “Strongly agree” (score 7). This allows for more nuance than simple yes/no answers, and the numerical values allow for statistical analysis.
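To show what turning Likert answers into numbers can look like, here is a minimal sketch that maps answer labels to scores 1 to 7 and averages one respondent's answers to several related statements. Only the two end-point labels come from the book as described above; the five labels in between, and the idea of averaging the statements into a single score, are my assumptions for illustration.

```python
from statistics import mean

# Only the end points (1 and 7) are given above; the labels in between are assumed.
LIKERT_SCORES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Somewhat disagree": 3,
    "Neither agree nor disagree": 4,
    "Somewhat agree": 5,
    "Agree": 6,
    "Strongly agree": 7,
}


def likert_score(answers: list[str]) -> float:
    """Average one respondent's answers to a group of related statements."""
    if not answers:
        raise ValueError("need at least one answer")
    return mean(LIKERT_SCORES[a] for a in answers)


# One hypothetical respondent answering three statements about the same topic.
score = likert_score(["Agree", "Strongly agree", "Somewhat agree"])
print(score)  # 6
```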
The final chapter in this section explains how they found survey respondents (through referral sampling), with arguments why this is a good way of finding respondents. Even though this second part of the book is more for background on how they got the data, I found it surprisingly interesting, especially the chapter on writing good survey questions.
Having read this second section, and the appendices, there are still some aspects of the research that are not clear to me. For example, they list six kinds of data analysis that are commonly used. The second level (after Descriptive) is Exploratory analysis, which looks for relationships among the data. The third level is Inferential predictive, which is the one used in this research. If I have understood it correctly, you formulate a theory, and then you test whether the data supports it. However, it was not clear to me whether that establishes causation, or only shows correlation. If you know the answer to this, please let me know in the comments.
I also did not quite understand how some of the data quality checks work. But the checks they describe seem to be well-established among researchers, so I do believe they are relevant and correct, even though I didn’t get all the nuances.
In summary, the methodology section makes a strong and thorough case for believing the results reported in the first section of the book.
There are also some subjects covered that I haven’t mentioned above, for example Westrum organizational culture and transformational leadership. The third part of the book is a case study of how many of the principles from the book are used at ING Netherlands. To me it reads mostly like marketing phrases (“intimate spaces to gather, visit and share ideas”, “put quality first”, “consistent coaching” etc.), and it did not add any value for me. Also, the organization model described seems to be copied from the Spotify model, with tribes, squads and chapters.
For the past several years, I have been using many of the DevOps practices described in Accelerate. My own experience of them is that they are incredibly helpful, especially being able to deploy to production multiple times a day. In Accelerate, there is research confirming what I intuitively already know. However, even though I already have a positive view of DevOps, I was surprised at how the benefits extended to so many areas (for example reducing burnout). If you want to persuade somebody to start using DevOps, give them a copy of Accelerate to read. That should do it.