Technical Debt: How much is a good thing?

Technical Debt is a topic that is much discussed and usually in negative terms. I’m not going to make the case that you should have lots of it, but I am going to make the case that discussion about it is sometimes not as nuanced as perhaps it should be. 

A little while ago, On Exactitude in Technical Debt from one of the O’Reilly Programming Newsletters put this well. The piece makes sensible points about Technical Debt being an analogy, not a precise measure, and not going over the top in trying to quantify or indeed trying to quantify at all. 

I agree with the highly relatable “Not to get obsessive about things”, and I agree that we need a currency in which to exchange ideas with non-technology people, but there are a few things I don’t especially agree with, which I will explore here.

Software doesn’t rust?

I have frequently used an analogy with building maintenance – apt, possibly, when your CFO is also your facilities manager. The point is that when you take a lease on a building, you expect to pay for cleaning, for replacement of furniture and for redecorating at various intervals, and you make provision for this in your accounts. The cost of ownership, so to speak, is tangible.

Software isn’t exactly like that, though, is it? The software remains in more or less the state it was in when you installed it, modulo bug fixes along the way.

So when the article talks about “Software Rot” or “Entropy”, that’s not really apt, is it? The software itself is not changing; it’s the environment it finds itself in that is changing. 

Software doesn’t rust, as Joel Spolsky said more than 20 years ago in Things You Should Never Do, Part 1, where he says you should not go for a ground-up rewrite, ever.

Spolsky doesn’t mean that you should not do anything at all about your code. He’s just saying that much of its hairiness is likely down to bug fixes made along the way, whose purpose later generations of engineers will have no idea about, and that writing from scratch would simply re-introduce bugs which were fixed long ago.

He also makes the point that an engineer’s judgement about a piece of code they are working on is always “It’s an unholy mess” and suggests that it is down to three main concerns: structural, efficiency and aesthetic. He suggests incremental remedies for these that fall short of a rewrite.

It ain’t broke, don’t fix it

I think he’s right to observe that any engineer presented with almost any piece of code that they are asked to maintain or modify will say, “That’s an unholy mess”. 

This is, in fact, my first law of software engineering: if you ask a plumber to visit your property to fix something or to install something, the first thing they will do is suck through their teeth and say: “Who installed this for you then?” Rabin’s first law says that all engineers are plumbers. A theme I have expanded on many times before and once in writing: Star Trek’s Scotty, Plumbers and CTOs.

I’m a big fan of Joel Spolsky, and once had the pleasure of him visiting my workplace, when I had the honour of hosting a question-and-answer session with him. It was great to see how many fanboys (of both sexes) came to hear him. So I feel a little impudent in suggesting that his analysis doesn’t quite go far enough.

Measuring how broke it is

The fact that a piece of software is structurally poor will lead an engineer to say it needs a rewrite, as an instinctive aesthetic reaction. Yet ugliness is not a reason, on its own, to refactor (and especially not to rewrite).

Much more important are reasons such as enhancements taking an increasingly long time (the main reason cited in the O’Reilly article), which means there is a real cost to the software being the way it is. There should also be a foreseeable economic argument that a refactor is actually worth the investment on the grounds of reduced future development costs, reduced future operating costs, increased reliability and reduced risk.

So, in the end, to contradict the O’Reilly piece, I think that there is a quite sensible and pragmatic quantitative measure that can be put on technical debt that is completely outside any aesthetic judgement, other than, perhaps, the credibility of the engineering team in asserting lower future costs once they get rid of the previous plumber’s incompetent work. 

At least it provides a basis for management choice that is grounded in business assumptions: if we do it, we get these business benefits; if we don’t, we get these disbenefits.
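
To make that kind of argument concrete, here is a minimal sketch of how the sums might be laid out. It is only an illustration: the class name, the field names and every figure are hypothetical, and the inputs are the engineering team's own estimates, so the result is only as credible as they are.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefactorCase:
    """Hypothetical inputs; every figure is an estimate supplied by the team."""
    upfront_cost: float         # one-off cost of doing the refactor
    saving_per_period: float    # reduced development and operating cost per period
    periods: int                # remaining periods the product is expected to live
    discount_rate: float = 0.0  # per-period discount rate; 0 ignores time value


def discounted_saving(case: RefactorCase, period: int) -> float:
    """Value today of the saving realised in the given period."""
    return case.saving_per_period / (1 + case.discount_rate) ** period


def net_benefit(case: RefactorCase) -> float:
    """Total discounted savings over the remaining life, minus the upfront cost."""
    total = sum(discounted_saving(case, t) for t in range(1, case.periods + 1))
    return total - case.upfront_cost


def payback_period(case: RefactorCase) -> Optional[int]:
    """First period in which cumulative savings cover the cost, or None if never."""
    cumulative = 0.0
    for t in range(1, case.periods + 1):
        cumulative += discounted_saving(case, t)
        if cumulative >= case.upfront_cost:
            return t
    return None


# Made-up example: a refactor costing 100 that saves 30 per quarter.
long_lived = RefactorCase(upfront_cost=100, saving_per_period=30, periods=5)
short_lived = RefactorCase(upfront_cost=100, saving_per_period=30, periods=3)
```

With these invented numbers, the same refactor pays for itself on a product with five quarters left to run and does not on one with only three, which is exactly the kind of statement a non-technical colleague can challenge or accept.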

It is broke, but still don’t fix it

Closely related to this topic is knowing what phase your particular piece of technical debt is in. And this is where I am saying that Spolsky does not go far enough. As Alice is told in Alice’s Adventures in Wonderland:

“No wise fish would go anywhere without a porpoise.”

“Wouldn’t it, really?” said Alice, in a tone of great surprise.

“Of course not,” said the Mock Turtle. “Why, if a fish came to me, and told me he was going a journey, I should say, ‘With what porpoise?’ ”

The porpoise in this case is to ask why you are committing resources to this, now. Given that you are resource constrained, what are the top projects that you could commit to that would maximise economic value for the company?

If, for example, the product you are looking at has limited future revenue growth potential, the chances are that any investment in its technical debt will have a very poor comparative return.

The exception is where you are going to improve its operational efficiency or reliability so much that it warrants diverting resources from your newer, larger, better flagship effort, which requires (relatively speaking) a huge array of enhancements to reach its market potential – and in which you could easily be in some kind of feature race with your competition.
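
One rough way to picture the “with what porpoise?” question is to line up every candidate piece of work, debt reduction included, and rank it by estimated value per unit of effort. The sketch below assumes you can put a number on both, which is a big assumption; the names, figures and the greedy selection rule are all hypothetical, and a greedy pick is a simple heuristic rather than a guaranteed optimum.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    expected_value: float  # estimated economic value if delivered (assumed figure)
    effort: float          # engineer-months required (assumed figure)


def pick_projects(candidates: List[Candidate], capacity: float) -> List[str]:
    """Greedily pick work by value per engineer-month until capacity runs out."""
    ranked = sorted(
        candidates,
        key=lambda c: c.expected_value / c.effort,
        reverse=True,
    )
    chosen = []
    remaining = capacity
    for candidate in ranked:
        if candidate.effort <= remaining:
            chosen.append(candidate.name)
            remaining -= candidate.effort
    return chosen


# Made-up portfolio: the legacy refactor competes with flagship work.
portfolio = [
    Candidate("legacy product refactor", expected_value=60, effort=4),
    Candidate("flagship feature", expected_value=500, effort=5),
    Candidate("flagship reliability work", expected_value=120, effort=3),
]
selected = pick_projects(portfolio, capacity=8)
```

In this invented portfolio the legacy refactor loses out, not because it has no value, but because the same eight engineer-months buy more elsewhere.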

You really should have some Technical Debt

It’s really worth making the point that having debt is not, per se, a bad thing. In fact, trying to avoid debt as a pre-eminent concern is a bad thing, especially if you are in the feature chase phase of a product’s lifetime. Consolidate your code after showing that the feature is to be retained and that it’s not going through rapid iteration and feedback cycles.

As a long-term proposition, though, there are additional considerations.

Nations run on debt, businesses run on debt, and servicing the debt is a perfectly normal part of everyday operations. In this kind of situation, debt usually gets cheaper the longer you have it, because of inflation. 

So, by analogy, the choice that faces an organisation with technical debt is not “We need to get rid of our debt” but rather “Is our debt sustainable?”. That is, “How much does it cost us to have it, given where that product is and where other products may need to go?” and therefore “Should we reduce the debt, or invest in higher value things?”. And crucially, is the debt actually getting cheaper, or not?

Unsustainable or Dangerous

You do hear horror stories about things that have been left to their own devices for much too long. I have seen this in action myself, where there is 50-year-old code running on 60-year-old machines, with hardware that has to be bought on eBay because it’s not available anywhere else.

No one knows the code, no one could possibly fix it if it went wrong, and no one can reasonably write replacement code since we don’t know in detail what it does.

This is the other side of the Spolsky “software doesn’t rust” coin. The software may not have rusted, but the environment has moved on. The level of risk associated with it and the cost of operation all point towards ignoring Spolsky and going for the rewrite.

Given that you probably don’t know in as much detail as you’d like what it does, you do have a problem. Add to this any reasonable questions you may have about the capability of the engineering team to deliver. Martin Fowler’s “strangler pattern” comes up as a means of moving forward in this context.
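
For readers who have not met it, the idea can be sketched in a few lines. This is a simplified illustration rather than Fowler's own formulation: a facade sits in front of the legacy system and routes each operation to a new implementation once one exists, falling back to the old code otherwise. The class, method and operation names are hypothetical.

```python
from typing import Callable, Dict

LegacyHandler = Callable[[str, dict], dict]  # the old system handles every operation
NewHandler = Callable[[dict], dict]          # each new handler covers one operation


class StranglerFacade:
    """Routes an operation to its replacement if one is registered,
    and otherwise passes it through to the legacy system."""

    def __init__(self, legacy: LegacyHandler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, NewHandler] = {}

    def migrate(self, operation: str, handler: NewHandler) -> None:
        """Register a replacement for a single operation."""
        self._migrated[operation] = handler

    def handle(self, operation: str, request: dict) -> dict:
        handler = self._migrated.get(operation)
        if handler is not None:
            return handler(request)
        return self._legacy(operation, request)


def legacy_system(operation: str, request: dict) -> dict:
    # Stand-in for the code nobody fully understands.
    return {"handled_by": "legacy", "operation": operation}


def new_invoicing(request: dict) -> dict:
    # Stand-in for a rewritten slice of functionality.
    return {"handled_by": "new", "operation": "invoice"}


facade = StranglerFacade(legacy_system)
before = facade.handle("invoice", {})
facade.migrate("invoice", new_invoicing)
after = facade.handle("invoice", {})
untouched = facade.handle("payroll", {})
```

The attraction in this context is that you replace one operation at a time, so you only need to understand in detail the slice you are currently moving, and the legacy system keeps doing everything else.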

Technical Debt and MVP

At the other end of the spectrum, early-stage companies don’t have 50-year-old software and quite likely have no hardware at all. Here are some thoughts which may be relevant:

  • Do hack out code in an attempt to test an MVP idea.
  • Don’t provide extensions for enhancements unless they are clearly close to an established roadmap item.
  • Don’t work on scalability beyond what is required to test the idea.
  • Don’t focus on reliability etc., until you have shown the idea has legs, then only in proportion to a commercial view as to loss of revenue, reputational damage etc.
  • Don’t even focus on maintenance and sustainability till you know the idea is sound and is worth investing in.
  • Do abandon ideas ASAP.
  • Do fix security problems.
  • Do have an established commercial model in which you can discuss the business pros and cons of exploring new features vs improving established features (especially in a non-functional sense).

All of this means that if you don’t come out of an MVP process with some abandoned code that supports some ideas that didn’t fly and a variety of sub-optimal code that supports ideas that did fly, then you’ve possibly done it wrong. The thesis is that all MVPs leave you with Technical Debt, so it’s hardly surprising that you have it. And if you don’t, you’re either delusional or you’ve wasted a lot of time …
