Two-thirds of AI projects don’t reach production. That’s not an AI problem.

Our latest CTO Craft Bytes session, Mind the Gap: Why Two Thirds of AI Projects Never Reach Production, explored a problem many technology leaders recognise instinctively but rarely see quantified. The session drew on findings from a survey we conducted alongside Ten10 and The Scale Factory. It was hosted by Mike Mead, CPTO at The Scale Factory, and brought together Michelle McDaid, founder of The Leading Place, and Ash Gawthorpe, CTO and co-founder at Ten10.

The survey put numbers on a pattern many leaders already felt in their day-to-day work. Two-thirds of organisations reported a success rate below 50% for moving AI proofs of concept into production, with only a small minority doing this consistently well. What stood out wasn’t just the scale of the problem, but how familiar it felt to people who’ve spent years trying to ship complex systems inside organisations that rarely behave as neatly as plans suggest.

As the conversation developed, it became clear that very little of what was being discussed was exclusive to AI. Most of the barriers would sound familiar to anyone who’s been in the role for any length of time. Production readiness still tends to be pushed back. Data quality continues to cap what teams can realistically build. Integration exposes hidden complexity once systems have to interact with the wider organisation. Security and compliance still arrive later than anyone would like or need. None of this is new, but AI has a way of accelerating the moment when these weaknesses surface and become impossible to ignore.

“They’re seeing the 10% above the water and not the 90% beneath it.”
Ash Gawthorpe, CTO and co-founder at Ten10

The POC trap and the danger of convincing demos

A lot of the discussion in the Bytes session circled around the power of a good demo. A prototype that behaves plausibly on screen can answer questions before they’re asked and create momentum without forcing anyone to confront the less visible work that sits underneath.

Once something looks credible, the conversation often shifts almost automatically from whether it can work to why it isn’t live yet. At that point, the remaining work, which is usually operational rather than visible, starts to feel like friction rather than a necessary part of delivery. That’s where many teams find themselves stuck: not because the technology has failed, but because expectations have outrun reality without anyone stating them explicitly.

“The end result might look the same, but it’s a very different set of skills and an entirely different business.”
Mike Mead, CPTO at The Scale Factory

Being able to make something work in a controlled or limited setting doesn’t mean it can be operated reliably at scale, integrated safely into existing workflows, or supported when it inevitably breaks. The output might look similar on the surface, but the effort required to keep it running predictably over time is fundamentally different.

Rather than arguing for fewer experiments, the discussion kept returning to clarity. Experiments need space to exist without being treated as half-finished products, while leadership needs a shared understanding of what “done” really means once something is expected to run as part of the organisation. Without that alignment, even a successful pilot can become a source of frustration rather than progress.

Production readiness, people, and the work we keep deferring

A related issue that came up was how often production readiness is framed as something to deal with later, particularly when teams are under pressure to move quickly. Over a third of organisations surveyed admitted they build AI prototypes without seriously considering how those systems would ever reach production.

It’s not hard to see why this happens. Tools change quickly, approaches evolve, and teams are often asked to demonstrate value faster and faster. In that environment, decisions around data governance, operating models, or long-term ownership can feel heavy and easy to postpone. The problem is that those decisions don’t go away. Instead, they tend to resurface later, usually at exactly the moment when tolerance for delay is lowest or the need is most urgent.

“Change programmes often fail because of people rather than technology.”
Michelle McDaid, founder of The Leading Place

AI initiatives don’t land in a vacuum. They reshape workflows, alter how responsibility is distributed, and change how work actually gets done day to day. When people aren’t brought into that process early, they fill in the gaps themselves, often in ways that undermine trust or adoption long before any technical failure becomes visible.

How success is measured also came under scrutiny. Technical accuracy matters, but it rarely tells the whole story. Reliability, trust, and sustained use turned out to be far better indicators of value than marginal gains in model performance.

“A 75% accurate model delivered reliably is infinitely more valuable than a 95% accurate model stuck in a sandbox.”
Ash Gawthorpe, CTO and co-founder at Ten10

By the end of the discussion, it was hard to escape the conclusion that the gap between AI experimentation and AI in production isn’t mysterious or unsolvable. It looks far more like a familiar delivery challenge playing out in a new context, one where existing weaknesses surface faster and with greater consequence.

The organisations making progress aren’t chasing novelty or promising transformation for its own sake. They’re applying lessons they already understand, confronting operational reality earlier, and treating AI systems as part of the organisation rather than something that exists alongside it.

If you missed the session, you can watch the full recording here.

If you’d like to explore the findings in more detail, you can download the full report here.
