Why do AI pilots fail?

Most AI pilots fail because leaders target tasks where AI may create organisational value, then assume the person doing the work will experience enough user surplus to use it again. That assumption often fails. The tool can be useful to the organisation and still be costly, risky, or pointless to the individual.

Why isn't AI delivering ROI?

AI often shows no return when activity concentrates on visible pilots, demos, and experiments rather than tasks where measurable task value and user surplus line up. Adoption is not transformation. ROI appears only when repeated use changes cycle time, quality, cost, risk, decision speed, revenue, or another real operating measure.

How do you get AI adopted across a team?

You get AI adopted by starting with tasks that score high on both axes: AI materially improves the task, and the person doing it has a clear reason to use the tool again. Then you instrument repeated use and real outcomes rather than counting licences, logins, or enthusiasm.

Is there a framework for AI adoption?

Yes. The Willingness Gap is a 2x2 diagnostic. The vertical axis is task value from AI: does AI measurably improve this specific task? The horizontal axis is user surplus: does the person doing the work experience enough net benefit to use it again? The four quadrants are Compounding Adoption, the Willingness Gap, AI Theatre, and Correctly Left Alone.

How do you measure AI adoption?

Measure both axes separately. Task value can be measured through cycle time, quality, error reduction, risk reduction, decision speed, cost, revenue, or conversion impact. User surplus can be measured through voluntary repeat use, return without prompting, time to first successful use, prompt or template reuse, workaround rate, qualitative friction, and whether users would miss the tool.

Why don't employees use the AI tools they're given?

Employees ignore AI tools when the tool helps the organisation but gives them little or negative user surplus. It may cost them time, add review burden, expose them to blame, reduce autonomy, or make it harder to claim the result. A tool can be technically useful and still be a bad exchange for the person asked to use it.

Where does AI actually add value in a company?

AI adds value in specific tasks where it measurably improves the work and the person doing the work has enough user surplus to repeat the behaviour. That overlap is the Compounding Adoption quadrant. Value outside that overlap is theoretical, theatrical, or not worth pursuing yet.

How do you make AI stick in an organisation?

You make AI stick by closing the Willingness Gap: select tasks where AI creates measurable task value, design the workflow so the user captures real surplus, and measure repeated behaviour. Start where both axes are high before trying to force adoption into harder terrain.

The Willingness Gap

The Willingness Gap is the distance between where AI genuinely helps a task and where the person doing it experiences enough user surplus to use it again. Most AI adoption fails not because the tool has no value, but because value to the organisation and value to the user are different things.

I coined the term to name a failure I kept watching repeat. A company buys the tool, maps where it could help, mandates the rollout, and then waits for a return that never arrives. The tool works. The pilot runs. Nothing changes. What broke was never only the technology. It was the quiet assumption that value on a slide would become repeated behaviour at the desk.

What the Willingness Gap is

The unit of analysis is the task. Not the company. Not the department. Not the tool. A model can be powerful in general and still fail in the specific workflow where it lands.

Adoption depends on two things being true at once. First, AI has to create task value: a measurable improvement in the work after operational cost, risk, and governance friction are counted. Second, the person doing that task has to capture user surplus: enough felt net benefit to choose the tool again when nobody is watching.

The gap opens when those two come apart. High task value, low user surplus: a genuinely useful tool sitting untouched because it makes the individual slower, more exposed, less trusted, or less able to claim the result. You cannot mandate value into existence on the far side of that gap.

The two axes

The framework runs on two plain questions. Both have to be answered for one task at a time before you can predict whether an AI tool will actually get used.

Axis one: task value from AI. Does AI measurably improve this specific task? Good signals include cycle time, quality, error reduction, risk reduction, decision speed, cost, revenue, conversion, or the quality of judgement.

Axis two: user surplus. Does the person doing the work experience enough net benefit to use the tool again? Good signals include voluntary repeat use, return without prompting, time to first successful use, prompt or template reuse, workaround rate, qualitative friction, and whether users would miss the tool if it disappeared.

Most rollouts measure the first axis and silently assume the second. The framework forces the second axis into view.

The four quadrants

Cross task value with user surplus and you get four positions. The point is not to produce a prettier chart. It is to stop treating every AI use case as if it has the same adoption physics.

Compounding Adoption, the Sweet Spot (high task value, high user surplus). AI improves the task and the person captures enough benefit to repeat the behaviour. Adoption is self-sustaining here. It sticks without a mandate because the work gets better and the worker feels the gain.

The Willingness Gap (high task value, low user surplus). The organisation can see value, but the user experiences cost. The tool may reduce company expense, improve reporting, or create management visibility, while the individual gets more review burden, less autonomy, more risk, or no credit. This is where "we bought it and nothing changed" comes from.

AI Theatre (low task value, high user surplus). People enjoy the tool, signal modernity, or comply with a visible mandate, but the task does not materially improve. There is energy, content, workshops, demos, and usage. There is not much operating leverage.

Correctly Left Alone (low task value, low user surplus). AI adds little and no one has a reason to reach for it. This is not failure. It is judgement. Naming this quadrant stops teams forcing AI into corners where neither axis justifies the effort.

	Low user surplus	High user surplus
High task value	The Willingness Gap (useful, unused)	Compounding Adoption (adoption sticks)
Low task value	Correctly Left Alone (leave it)	AI Theatre (activity, little return)
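One way to make the grid concrete is a small Python sketch that places a task in a quadrant from its two scores. It assumes each axis is scored from 1 to 5, as the measurement section suggests, and it assumes a cut-off of 4 for "high". The cut-off, the `TaskScore` type, and the `quadrant` function are illustrative choices, not part of the framework itself.

```python
from dataclasses import dataclass

# Assumption for illustration: a score of 4 or 5 counts as "high" on either axis.
HIGH_THRESHOLD = 4


@dataclass(frozen=True)
class TaskScore:
    """Scores for one task. The two axes are kept separate on purpose."""
    task: str
    task_value: int    # 1-5: does AI measurably improve this task?
    user_surplus: int  # 1-5: does the person doing it gain enough to use it again?


def quadrant(score: TaskScore, threshold: int = HIGH_THRESHOLD) -> str:
    """Return the quadrant name for a single scored task."""
    for axis in (score.task_value, score.user_surplus):
        if not 1 <= axis <= 5:
            raise ValueError("scores must be between 1 and 5")

    high_value = score.task_value >= threshold
    high_surplus = score.user_surplus >= threshold

    if high_value and high_surplus:
        return "Compounding Adoption"
    if high_value:
        return "The Willingness Gap"
    if high_surplus:
        return "AI Theatre"
    return "Correctly Left Alone"


# Hypothetical task: valuable to the organisation, costly to the individual.
example = TaskScore("contract review summary", task_value=5, user_surplus=2)
print(quadrant(example))  # The Willingness Gap
```

The function takes one task at a time because the unit of analysis is the task, not the tool or the department. Averaging scores across tasks before classifying would hide exactly the mismatch the grid is meant to expose.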

How to measure it

Score each task on both axes from evidence, not enthusiasm. A simple 1 to 5 score is enough at first, provided the two scores stay separate.

Task value: measure cycle time, quality, error rate, risk reduction, decision speed, cost, revenue, conversion, or quality of judgement.
User surplus: measure voluntary repeat use, return without prompting, time to first successful use, reuse of prompts or templates, workaround rate, qualitative friction, and whether users would miss the tool.
Adoption: measure repeated behaviour on a specific task, plus the outcome that behaviour changes. Licences, logins, and training attendance are weak signals unless they connect to changed work.
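As a rough sketch of one user-surplus signal, the Python below estimates voluntary repeat use for a single task from a usage log. Everything about the log is an assumption: the `UsageEvent` fields, the `prompted` flag (set when a use followed a mandate, reminder, or training session), and the rule that "repeat" means unprompted use on at least two distinct days. Treat it as a starting point to adapt, not a standard metric.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UsageEvent:
    """One recorded use of the tool. The field names are hypothetical."""
    user: str
    task: str
    day: date
    prompted: bool  # True if the use followed a mandate, reminder, or training


def voluntary_repeat_rate(events: list[UsageEvent], task: str) -> float:
    """Share of the task's users who used the tool unprompted on 2+ distinct days."""
    all_users: set[str] = set()
    unprompted_days: dict[str, set[date]] = defaultdict(set)

    for event in events:
        if event.task != task:
            continue
        all_users.add(event.user)
        if not event.prompted:
            unprompted_days[event.user].add(event.day)

    if not all_users:
        return 0.0  # no recorded use of the tool on this task

    repeaters = sum(1 for days in unprompted_days.values() if len(days) >= 2)
    return repeaters / len(all_users)


# Hypothetical log: ana returns on her own; ben only shows up when prompted.
log = [
    UsageEvent("ana", "draft reply", date(2025, 3, 3), prompted=False),
    UsageEvent("ana", "draft reply", date(2025, 3, 5), prompted=False),
    UsageEvent("ben", "draft reply", date(2025, 3, 3), prompted=True),
    UsageEvent("ben", "draft reply", date(2025, 3, 4), prompted=True),
]
print(voluntary_repeat_rate(log, "draft reply"))  # 0.5
```

A login count would score ana and ben the same. Separating prompted from unprompted use is what turns the log into a signal about user surplus rather than compliance.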

User surplus can be positive, neutral, or negative. Positive surplus means the person would choose the tool again because it saves effort, improves judgement, reduces risk, gives them more control, or helps them produce a result they can stand behind. Negative surplus means the tool may help the organisation while making the individual slower, more exposed, or less trusted.

Why it matters

The external evidence rhymes with this. MIT NANDA's 2025 GenAI Divide report found that only a small share of enterprise AI efforts were translating into measurable business impact, despite widespread experimentation. Goldman Sachs chief economist Jan Hatzius made a parallel macro point in February 2026: AI investment had contributed far less to 2025 US GDP growth than the market story implied.

The diagnosis is not that AI does not work. It is that adoption is not transformation. A tool can be technically capable, visibly deployed, and widely discussed without changing the repeated behaviour that moves an operating metric.

The Willingness Gap explains why. Spending has gone into the task-value axis. Rollout decks have assumed the user-surplus axis. The return only shows up when both are true in the same task.

The key insight

Most failed AI adoption ignores the user-surplus axis. Leaders map where AI adds value, then assume value alone drives use. It does not.

Adoption only sticks where the tool's value to the task and the individual's reason to use it both run high. The commonest failure is the mismatch: high task value paired with low user surplus, a tool that is useful to the organisation and a bad exchange for the person, sitting idle while everyone wonders where the return went.

You close the Willingness Gap by measuring what people can actually do with AI, not what management declares they should be doing with it. That is the move: demonstration over declaration. Watch the task, not the memo. The organisations that get a return will stop counting licences and start counting the moments where the tool changed how the work got done, then design the workflow so those moments have a reason to repeat.

The question worth sitting with is not "where could AI help us?" You already know the answer to that one. It is "where does the person doing the work capture enough surplus to pick it up again?" Answer that honestly, and you will find that most of your adoption problem was never a technology problem at all.
