DORA metrics

DORA metrics have been broadly accepted as the standard for measuring software development performance. But they paint an incomplete picture of team impact and lack the context needed to make them actionable.

There’s more to being a high-performing team than deploying frequently, quickly, responsively, and accurately. Are development teams working on the right things, in the right ways? That’s what’s ultimately missing from the DORA approach.

If teams are focused on the wrong priorities or burning themselves out to hit their delivery goals, they aren’t performing at a high level, whatever their DORA metrics might suggest.

This article will answer the main questions about measuring software development performance, including why it is so hard to do, what’s wrong with a DORA-only approach, and whether there is a better way.

1. Why is it so hard to measure developer performance?

Engineering leaders are constantly evolving their thinking around how to measure performance. Software development is an art form, and there’s no one way of measuring art. 

Teams such as sales and marketing are often very data-driven, with clear, established ways to measure impact. But measuring an engineering organisation’s value is a much more complex exercise.

In 2014, the Google DevOps Research and Assessment (DORA) team introduced their four key metrics for measuring dev performance:

  1. Deployment frequency
  2. Lead time for changes
  3. Change failure rate
  4. Time to restore service
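
For concreteness, here’s a minimal sketch of how these four metrics could be computed from exported deployment and incident records. The record shapes and field names below are illustrative assumptions, not anything DORA prescribes; in practice, the data would come from your CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median

# Illustrative record shapes -- the field names are assumptions, not a
# standard; real data would come from CI/CD and incident tooling.
@dataclass
class Deployment:
    commit_at: datetime      # first commit in the change
    deployed_at: datetime
    caused_failure: bool     # did this deploy degrade the service?

@dataclass
class Incident:
    started_at: datetime
    restored_at: datetime

def dora_metrics(deploys: list[Deployment],
                 incidents: list[Incident],
                 days: int = 30) -> dict:
    """Compute the four key metrics over a trailing window of `days`."""
    cutoff = datetime.now() - timedelta(days=days)
    recent = [d for d in deploys if d.deployed_at >= cutoff]
    recent_incidents = [i for i in incidents if i.started_at >= cutoff]
    return {
        "deployment_frequency_per_day": len(recent) / days,
        "lead_time_for_changes": (
            median(d.deployed_at - d.commit_at for d in recent)
            if recent else None),
        "change_failure_rate": (
            sum(d.caused_failure for d in recent) / len(recent)
            if recent else None),
        "time_to_restore_service": (
            median(i.restored_at - i.started_at for i in recent_incidents)
            if recent_incidents else None),
    }
```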

A few years after publishing these metrics in their book, Accelerate, the DORA team added a fifth metric to the list: reliability. That shows how quickly even the DORA team’s own approach continues to evolve.

The point is that this is all very new, and we’re still figuring it out as an industry. The age of modern software development began only 20 years ago, and measuring dev performance is an even newer concept. We haven’t been doing this for very long, and we’re all still learning and iterating on how to do it right.

While that creates an exciting opportunity to rethink how we view performance, it can be challenging to keep up with the recent explosion of new tools and competing methodologies in the market (e.g., the SPACE framework introduced in 2021). 

If leaders aren’t thoughtful about what they measure — or what they don’t measure — they could actually hurt team performance rather than improve it.

2. What’s wrong with a DORA-only approach?

As they stand now, DORA metrics measure team efficiency, not overall performance. As a result, they deliver insight into the throughput, stability, and reliability of an organisation’s systems, improving visibility into operational health.

But efficiency is only one aspect of understanding performance. That’s why so many developers and managers who fall into the Elite DORA category don’t feel Elite.

In fact, in their 2022 Accelerate State of DevOps report, the DORA team dropped their ‘Elite’ performer classification altogether, because the data no longer showed enough differentiation between the Elite and High clusters.

As DORA’s benchmarks were set based on survey data rather than computed data, it’s difficult to pinpoint exactly what’s behind the drop. But one possible explanation is survey fatigue, with fewer Elite teams taking time to report on their performance.

DORA metrics have diminishing returns for the organisation. Once teams have achieved a High (previously Elite) level of performance, the metrics no longer help them improve. Pushing beyond that level can even be counterproductive, so many teams settle at a point that’s ‘good enough’ but doesn’t really make them better.

Overall, DORA metrics can provide a decent baseline for teams that aren’t measuring anything yet, but they aren’t enough on their own to make continued long-term improvements.

3. Is there a better way to measure performance?

DORA metrics are missing three key insights that can have a significant impact on performance as teams scale and improve:

  • Alignment insights help teams focus on the highest-impact projects by measuring performance against wider organisational goals. With greater insight into how teams spend their time, leaders can better allocate people, effort, and investment to the activities with the greatest organisational impact.
  • Capacity is how much time teams have to get work done. With DORA metrics, leaders might see that their teams are deploying quickly and frequently and infer they have more capacity, but they would have no way to quantify it. 

So instead, leaders should measure how much time their teams have for Deep Work — uninterrupted blocks of two or more hours for focused work. The more capacity leaders can open up for Deep Work, the more effective their teams will be.
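
As a rough illustration, here’s one way a Deep Work measure could be derived from a calendar export. The two-hour threshold comes from the definition above; the event shape is an assumption, and a real measure would also need to account for interruptions from chat tools.

```python
from datetime import datetime, timedelta

DEEP_WORK_MIN = timedelta(hours=2)

def deep_work_blocks(day_start: datetime, day_end: datetime,
                     meetings: list[tuple[datetime, datetime]]) -> list[timedelta]:
    """Return the uninterrupted gaps of two hours or more in one person's
    working day, given (start, end) meeting pairs from a calendar export."""
    blocks, cursor = [], day_start
    for start, end in sorted(meetings):
        if start - cursor >= DEEP_WORK_MIN:
            blocks.append(start - cursor)   # gap before this meeting
        cursor = max(cursor, end)           # handles overlapping meetings
    if day_end - cursor >= DEEP_WORK_MIN:
        blocks.append(day_end - cursor)     # gap after the last meeting
    return blocks
```

Summing these blocks across a team and a week turns Deep Work capacity into a figure leaders can actually track over time.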

Together, alignment and capacity insights measure organisational focus: what teams should work on and how much time they have to work on it.

  • Burnout isn’t just an outcome of low performance; it can be a cause of it. If teams are struggling to keep up and burning out in the process, they aren’t performing at a high level, and the pace they’re working at is neither safe nor sustainable.

To quantify burnout, teams should look for ‘Always On’ indicators: signs of out-of-hours work hidden in their Slack/Teams, Jira, and calendar activity. This is perhaps the most important context leaders can add to their DORA metrics, as it shows the human impact of their efforts.
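
As a sketch, assuming activity timestamps can be exported from those tools, an ‘Always On’ indicator could be as simple as the share of activity landing outside normal hours. The 09:00–18:00 window here is an arbitrary assumption to tune per team and timezone.

```python
from datetime import datetime

WORK_START, WORK_END = 9, 18  # assumed working hours; tune per team/timezone

def always_on_ratio(events: list[datetime]) -> float:
    """Share of activity (Slack/Teams messages, Jira updates, commits)
    that falls on weekends or outside normal working hours."""
    if not events:
        return 0.0
    off_hours = sum(
        1 for t in events
        if t.weekday() >= 5 or not (WORK_START <= t.hour < WORK_END)
    )
    return off_hours / len(events)
```

The trend matters more than the absolute number: a ratio that creeps up week over week is the early warning sign.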

These insights focus on team effectiveness rather than efficiency: are teams aligned around the right priorities? Do they have enough time to work on them? And are they doing so in a way that maintains team health?

The DORA team doesn’t ignore the importance of these variables in their research, but they don’t attempt to quantify them as part of their key metrics. Instead, DORA treats alignment, capacity, and burnout as outcomes of performance, when in reality they’re part of the same calculation.

While DORA metrics (efficiency) are a vital part of measuring performance, engineering organisations need alignment, capacity, and burnout insights (effectiveness) to provide context, which is what makes it possible to identify areas for improvement. Together, efficiency and effectiveness insights can provide a more comprehensive measure of software development performance.
