The Willingness Gap is the distance between where AI genuinely helps a task and where the person doing it experiences enough user surplus to use it again. Most AI adoption fails not because the tool has no value, but because value to the organisation and value to the user are different things.
I coined the term to name a failure I kept watching repeat. A company buys the tool, maps where it could help, mandates the rollout, and then waits for a return that never arrives. The tool works. The pilot runs. Nothing changes. What broke was never only the technology. It was the quiet assumption that value on a slide would become repeated behaviour at the desk.
What the Willingness Gap is
The unit of analysis is the task. Not the company. Not the department. Not the tool. A model can be powerful in general and still fail in the specific workflow where it lands.
Adoption depends on two things being true at once. First, AI has to create task value: a measurable improvement in the work after operational cost, risk, and governance friction are counted. Second, the person doing that task has to capture user surplus: enough felt net benefit to choose the tool again when nobody is watching.
The gap opens when those two come apart. High task value, low user surplus: a genuinely useful tool sitting untouched because it makes the individual slower, more exposed, less trusted, or less able to claim the result. You cannot mandate value into existence on the far side of that gap.
The two axes
The framework runs on two plain questions. Both have to be answered for one task at a time before you can predict whether an AI tool will actually get used.
Axis one: task value from AI. Does AI measurably improve this specific task? Good signals include cycle time, quality, error reduction, risk reduction, decision speed, cost, revenue, conversion, or the quality of judgement.
Axis two: user surplus. Does the person doing the work experience enough net benefit to use the tool again? Good signals include voluntary repeat use, return without prompting, time to first successful use, prompt or template reuse, workaround rate, qualitative friction, and whether users would miss the tool if it disappeared.
Most rollouts measure the first axis and silently assume the second. The framework forces the second axis into view.
The four quadrants
Cross task value with user surplus and you get four positions. The point is not to produce a prettier chart. It is to stop treating every AI use case as if it has the same adoption physics.
Compounding Adoption, the Sweet Spot (high task value, high user surplus). AI improves the task and the person captures enough benefit to repeat the behaviour. Adoption is self-sustaining here. It sticks without a mandate because the work gets better and the worker feels the gain.
The Willingness Gap (high task value, low user surplus). The organisation can see value, but the user experiences cost. The tool may reduce company expense, improve reporting, or create management visibility, while the individual gets more review burden, less autonomy, more risk, or no credit. This is where "we bought it and nothing changed" comes from.
AI Theatre (low task value, high user surplus). People enjoy the tool, signal modernity, or comply with a visible mandate, but the task does not materially improve. There is energy, content, workshops, demos, and usage. There is not much operating leverage.
Correctly Left Alone (low task value, low user surplus). AI adds little and no one has a reason to reach for it. This is not failure. It is judgement. Naming this quadrant stops teams forcing AI into corners where neither axis justifies the effort.
| | Low user surplus | High user surplus |
|---|---|---|
| High task value | The Willingness Gap (useful, unused) | Compounding Adoption (adoption sticks) |
| Low task value | Correctly Left Alone (leave it) | AI Theatre (activity, little return) |
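
For readers who think in code, the matrix reduces to a two-key lookup. What follows is a minimal Python sketch, not part of the framework itself. The function name `quadrant` and the boolean inputs are illustrative assumptions, and deciding what counts as "high" on either axis is left to whoever scores the task.

```python
# Illustrative sketch: the 2x2 as a lookup table.
# Keys are (high_task_value, high_user_surplus); all names are hypothetical.
QUADRANTS = {
    (True, True): "Compounding Adoption",
    (True, False): "The Willingness Gap",
    (False, True): "AI Theatre",
    (False, False): "Correctly Left Alone",
}


def quadrant(high_task_value: bool, high_user_surplus: bool) -> str:
    """Return the quadrant name for one task, given a high/low call on each axis."""
    return QUADRANTS[(high_task_value, high_user_surplus)]


if __name__ == "__main__":
    # A tool the organisation values but the person doing the task avoids.
    print(quadrant(high_task_value=True, high_user_surplus=False))
```

The lookup takes one task at a time on purpose: the same tool can land in different quadrants for different tasks.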
How to measure it
Score each task on both axes from evidence, not enthusiasm. A simple 1 to 5 score is enough at first, provided the two scores stay separate.
- Task value: measure cycle time, quality, error rate, risk reduction, decision speed, cost, revenue, conversion, or quality of judgement.
- User surplus: measure voluntary repeat use, return without prompting, time to first successful use, reuse of prompts or templates, workaround rate, qualitative friction, and whether users would miss the tool.
- Adoption: measure repeated behaviour on a specific task, plus the outcome that behaviour changes. Licences, logins, and training attendance are weak signals unless they connect to changed work.
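
One of the user-surplus signals above, voluntary repeat use, can be approximated from a usage log. The sketch below assumes a hypothetical log in which every use of the tool on a single task records the user and whether the use was prompted by a mandate, manager, or reminder. The `Use` record, the function name, and the cut-off of two unprompted uses are all assumptions made for illustration, not definitions from the framework.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Use(NamedTuple):
    """One recorded use of the tool on a single task (hypothetical log schema)."""
    user: str
    prompted: bool  # True if a mandate, manager, or reminder triggered the use


def voluntary_repeat_rate(uses: Iterable[Use], min_unprompted: int = 2) -> float:
    """Share of users who tried the tool and then came back unprompted.

    Assumption: a user counts as a voluntary repeat user once they have at
    least `min_unprompted` unprompted uses. The cut-off is illustrative.
    """
    uses = list(uses)
    everyone = {u.user for u in uses}
    if not everyone:
        return 0.0
    unprompted = Counter(u.user for u in uses if not u.prompted)
    repeaters = sum(1 for user in everyone if unprompted[user] >= min_unprompted)
    return repeaters / len(everyone)


# Invented example log for one task.
log = [
    Use("ana", True), Use("ana", False), Use("ana", False),
    Use("ben", True), Use("ben", True),
    Use("cy", False),
]
rate = voluntary_repeat_rate(log)
```

In this invented log only one of the three users returns unprompted twice, so logins alone would overstate how much the tool is wanted.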
User surplus can be positive, neutral, or negative. Positive surplus means the person would choose the tool again because it saves effort, improves judgement, reduces risk, gives them more control, or helps them produce a result they can stand behind. Negative surplus means the tool may help the organisation while making the individual slower, more exposed, or less trusted.
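
To make the "keep the two scores separate" rule concrete, here is a small Python sketch of a per-task score record. It assumes a cut-off of 4 on the 1 to 5 scale for "high", which is an arbitrary choice the framework does not prescribe. The task names and scores are invented, and `TaskScore` and `in_willingness_gap` are hypothetical names.

```python
from dataclasses import dataclass

HIGH = 4  # assumed cut-off on the 1-to-5 scale; the framework does not fix one


@dataclass(frozen=True)
class TaskScore:
    task: str
    task_value: int    # 1-5, scored from operating evidence
    user_surplus: int  # 1-5, scored from the behaviour of people doing the task

    def __post_init__(self) -> None:
        # Reject scores outside the 1-to-5 range on either axis.
        for name in ("task_value", "user_surplus"):
            score = getattr(self, name)
            if not 1 <= score <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {score}")


def in_willingness_gap(score: TaskScore, high: int = HIGH) -> bool:
    """True when task value is high and user surplus is not."""
    return score.task_value >= high and score.user_surplus < high


# Invented scores for three tasks.
scores = [
    TaskScore("invoice coding", task_value=5, user_surplus=2),
    TaskScore("meeting summaries", task_value=2, user_surplus=5),
    TaskScore("contract first drafts", task_value=4, user_surplus=4),
]
gap_tasks = [s.task for s in scores if in_willingness_gap(s)]
print(gap_tasks)
```

Averaging the two numbers into a single "AI readiness" score would hide exactly the mismatch this record is meant to expose.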
Why it matters
The external evidence rhymes with this. MIT NANDA's 2025 GenAI Divide report found that only a small share of enterprise AI efforts were translating into measurable business impact, despite widespread experimentation. Goldman Sachs chief economist Jan Hatzius made a parallel macro point in February 2026: AI investment had contributed far less to 2025 US GDP growth than the market story implied.
The diagnosis is not that AI does not work. It is that deployment is not adoption. A tool can be technically capable, visibly deployed, and widely discussed without changing the repeated behaviour that moves an operating metric.
The Willingness Gap explains why. Spending has gone into the task-value axis. Rollout decks have assumed the user-surplus axis. The return only shows up when both are true in the same task.
The key insight
Most failed AI adoption ignores the user-surplus axis. Leaders map where AI adds value, then assume value alone drives use. It does not.
Adoption only sticks where the tool's value to the task and the individual's reason to use it both run high. The commonest failure is the mismatch: high task value paired with low user surplus, a tool that is useful to the organisation and a bad exchange for the person, sitting idle while everyone wonders where the return went.
You close the Willingness Gap by measuring what people can actually do with AI, not what management declares they should be doing with it. That is the move: demonstration over declaration. Watch the task, not the memo. The organisations that get a return will stop counting licences and start counting the moments where the tool changed how the work got done, then design the workflow so those moments have a reason to repeat.
The question worth sitting with is not "where could AI help us?" You already know the answer to that one. It is "where does the person doing the work capture enough surplus to pick it up again?" Answer that honestly, and you will find that most of your adoption problem was never a technology problem at all.