Issue #51 — Fast-Moving Technology

Dear Reader,

In 2024, enterprise AI business cases were routinely built on an assumption that looked reasonable at the time: inference costs would keep falling. Each generation cheaper per token, each model more capable for the same price. By mid-2026 the assumption has reversed. Gemini 3.5 Flash costs roughly five times more per task than its predecessor when token consumption is counted. GPT-5.5 and Claude Opus 4.7 both came in above their predecessors on cost-per-task. A project whose unit economics were modelled on 2024 inference pricing may be off by a factor of three or more on its cost assumptions alone — and that is one layer. Organisations that built their AI knowledge architecture around dedicated vector databases are holding a category the market has since absorbed into multimodel platforms. LangChain, the dominant orchestration framework two years ago, reorganised around LangGraph with breaking API changes that required architectural rewrites across projects that had standardised on it. None of these shifts came with a deprecation notice.

A roadmap that planned for none of this was not necessarily poorly designed. It specified the wrong things with precision — which vendor hosts the integration, which model runs each step — and left the things that actually matter vague: which business outcome the programme will move, and how the architecture will hold when the outer layers change beneath it.

The six-month cliff

In 2025, Viktor Malyi published a formulation that has since made the rounds in CDO circles: a planning horizon longer than six months means you are committing resources based on a world that will not exist when you try to execute. He is right about one layer. Nasscom’s analysis of why 2025 AI roadmaps stalled is more precise about which layer: the programmes that faltered had typically fixed two things that should not have been fixed — the model they would run on and the vendor stack that would host it. When both shifted mid-year — deprecation notices on the models, pricing reversals on the platforms — the roadmaps required rebuilding from the second layer down. An 18-month horizon is fine for governance architecture and business outcomes. Applied to model and vendor commitments, it is the wrong cadence entirely.

The RAG pattern has shifted from a retrieval technique into what practitioners now call a context engine, with knowledge graphs replacing flat vector search in governed production deployments — another architectural decision that looked stable and turned out not to be.

What an AI transformation plan actually contains

A single planning horizon applied across all layers is the structural error. Each layer ages at a different rate — and treating them as if they share one clock is where plans break.

A data centre buildout or a manufacturing CapEx plan can fix costs and specifications 18 months out because the relevant inputs (hardware prices, headcount, licensing) change slowly. An AI programme contains layers that move on completely different cycles. Some must be fixed before any build begins; one is designed explicitly to flex.

A transformation plan has four components.

The economic commitment is fixed and precise: which P&L line the programme will move, by how much, measured through which operational metrics — and the cost envelope within which that outcome holds. Inference costs have proven volatile in both directions, so the business case is not a point estimate but a range: base case plus a stress test at two to three times current inference cost, with sensitivity modelled on the automation rate that most directly drives the economics. The target and the conditions under which it is viable are one object. If the outcome cannot be stated in one sentence, it will not survive the first implementation change.
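As a rough illustration of a business case expressed as a range, the sketch below evaluates net annual saving across a grid of automation rates and inference-cost multipliers (1x, 2x, 3x). All figures, field names, and the simple cost model are hypothetical assumptions chosen for the example, not data from any real programme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessCase:
    """Hypothetical inputs for a single AI programme (illustrative only)."""
    tasks_per_year: int              # volume of tasks in scope
    manual_cost_per_task: float      # fully loaded cost when a human handles the task
    inference_cost_per_task: float   # current AI cost per automated task
    fixed_annual_cost: float         # platform, governance, and support costs

    def net_annual_saving(self, automation_rate: float, cost_multiplier: float = 1.0) -> float:
        """Saving from automated tasks minus inference and fixed costs."""
        automated = self.tasks_per_year * automation_rate
        saving = automated * self.manual_cost_per_task
        inference = automated * self.inference_cost_per_task * cost_multiplier
        return saving - inference - self.fixed_annual_cost


def stress_grid(case, automation_rates, multipliers):
    """Evaluate the case for every (automation rate, cost multiplier) pair."""
    return {
        (rate, mult): case.net_annual_saving(rate, mult)
        for rate in automation_rates
        for mult in multipliers
    }


# Assumed example figures: base case plus stress tests at 2x and 3x inference cost.
case = BusinessCase(
    tasks_per_year=100_000,
    manual_cost_per_task=4.0,
    inference_cost_per_task=0.5,
    fixed_annual_cost=150_000.0,
)
grid = stress_grid(case, automation_rates=(0.5, 0.6, 0.7), multipliers=(1.0, 2.0, 3.0))

for (rate, mult), saving in sorted(grid.items()):
    print(f"automation {rate:.0%}, inference cost x{mult:.0f}: net saving {saving:,.0f}")
```

With these assumed numbers, a 50% automation rate breaks even at twice the current inference cost and turns negative at three times. That is the kind of boundary the range is meant to expose before the commitment is signed.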

The operating design is fixed before any deployment goes live: an AI-ready process specification that classifies each step by decision type, error cost, and the human oversight model required at that step. Not a map of the current state — the specification of the target. Where AI runs autonomously, where a human must review, where a human must decide. Operating design decisions made after the technical build has started cost significantly more to change.
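One way to picture such a specification is as a small data structure in which each step carries its decision type and error cost, and the oversight model is derived from those two attributes alone. The sketch below assumes hypothetical step names, decision-type labels, and one particular mapping rule; a real specification would define its own.

```python
from dataclasses import dataclass
from enum import Enum


class Oversight(Enum):
    AUTONOMOUS = "AI runs autonomously"
    HUMAN_REVIEW = "a human must review"
    HUMAN_DECIDES = "a human must decide"


class ErrorCost(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class ProcessStep:
    name: str             # hypothetical step name
    decision_type: str    # assumed labels: "extraction", "classification", "judgement"
    error_cost: ErrorCost


def required_oversight(step: ProcessStep) -> Oversight:
    """Assumed rule: oversight depends only on decision type and error cost.

    Nothing here refers to a model or vendor, so the rule is unaffected
    when the underlying model is replaced.
    """
    if step.error_cost is ErrorCost.HIGH or step.decision_type == "judgement":
        return Oversight.HUMAN_DECIDES
    if step.error_cost is ErrorCost.MEDIUM:
        return Oversight.HUMAN_REVIEW
    return Oversight.AUTONOMOUS


# Illustrative target-state specification for an invented invoice process.
spec = [
    ProcessStep("extract invoice fields", "extraction", ErrorCost.LOW),
    ProcessStep("match invoice to purchase order", "classification", ErrorCost.MEDIUM),
    ProcessStep("approve payment exception", "judgement", ErrorCost.HIGH),
]

for step in spec:
    print(f"{step.name}: {required_oversight(step).value}")
```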

The regulatory and audit obligations are fixed: personal data protection requirements, audit trail specifications, kill-switch conditions, and the timeline of applicable regulatory requirements. Governance infrastructure retrofitted after deployment is consistently weaker and more expensive than governance designed in from the start.

The substitutable runtime is designed once, explicitly for flexibility. A gateway or routing layer handles model selection at runtime, optimising continuously against current cost and quality parameters. The programme plan does not name a model — it specifies the integration architecture and the gateway’s optimisation parameters. Which models sit behind it is an operational decision the gateway runs, not a strategic commitment the plan makes.
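A minimal sketch of the routing idea follows, assuming the plan specifies only a quality floor and a cost ceiling while the catalogue of models is operational data. The model names, scores, and prices are invented for illustration, and cheapest-eligible selection is just one possible optimisation rule.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelOption:
    name: str              # hypothetical identifier, not a real product
    cost_per_task: float   # current observed cost per task
    quality_score: float   # 0..1, from the organisation's own evaluation set


@dataclass(frozen=True)
class RoutingPolicy:
    """The parameters the plan commits to; no model is named here."""
    min_quality: float
    max_cost_per_task: float


def select_model(options: Sequence[ModelOption], policy: RoutingPolicy) -> Optional[ModelOption]:
    """Return the cheapest option that meets the policy, or None if none qualifies."""
    eligible = [
        o for o in options
        if o.quality_score >= policy.min_quality
        and o.cost_per_task <= policy.max_cost_per_task
    ]
    if not eligible:
        return None
    # Cheapest first; ties are broken in favour of higher quality.
    return min(eligible, key=lambda o: (o.cost_per_task, -o.quality_score))


policy = RoutingPolicy(min_quality=0.85, max_cost_per_task=0.50)

catalogue = [
    ModelOption("model-a", cost_per_task=0.40, quality_score=0.93),
    ModelOption("model-b", cost_per_task=0.15, quality_score=0.88),
    ModelOption("model-c", cost_per_task=0.05, quality_score=0.71),
]
chosen = select_model(catalogue, policy)

# A price change alters the selection without any change to the policy.
repriced = [ModelOption("model-a", 0.10, 0.93)] + catalogue[1:]
chosen_after_repricing = select_model(repriced, policy)
```

Here the policy is the stable object and the catalogue is the part that changes from quarter to quarter. Repricing one entry switches the selected model while the plan itself stays untouched.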

Gartner predicts that more than 40% of agentic AI projects will be scrapped by the end of 2027, with unclear business value among the stated reasons. The common pattern behind such cancellations is a business case built around implementation choices that later shifted.

The first AI programme an organisation runs establishes the planning template every subsequent one will follow. Organisations that start by focusing on the expected business outcome, and by treating the implementation approach as flexible, carry that discipline forward. Those that start by committing to a specific vendor and model version carry that habit forward too.

Why the structure works

The four-component structure is resilient because each component is chosen precisely because it is independent of whatever moves fastest. A business outcome does not become wrong because a model was deprecated. A governance architecture does not need rebuilding because a new vendor entered the market. Larridin’s 2026 analysis of enterprise AI transformation programmes describes the same logic as core-and-orbit: a stable core, around which the outer layer of tools and integrations can be revised without the core itself being touched.

An organisation that has designed its oversight rules around decision type and error cost can swap the underlying model without rebuilding the oversight layer. One whose rules are expressed in model-specific terms — confidence thresholds, output format checks — has to revalidate and recalibrate every time the model changes.

The 90-day loop

What replaces the 18-month roadmap is a planning cycle that runs continuously on a 90-day cadence, with each iteration updating the outer orbit while the stable commitments hold. Ninety days aligns with quarterly business and board reporting cycles, so the plan revision and the P&L review happen on the same clock. It is short enough, relative to the actual pace of change in model pricing and vendor options, to catch a bad assumption before it compounds, and long enough to execute something and observe results.

Each cycle does four things. It assesses what is currently deployed — quality level, automation rate, and what has changed in the underlying cost or capability parameters since the previous cycle. That assessment drives an outer orbit review: does the current vendor contract still reflect the best available cost-quality trade-off; do the gateway’s routing parameters need updating; has a capability shift opened a use case that wasn’t economically viable last quarter? From that review, it re-prioritises the next quarter’s implementation work. It then executes, and measures against the P&L line the programme was built to move — not agent counts or deployment numbers, but the operational metric the business outcome is actually anchored to.
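The assessment step of the cycle can be sketched as a comparison between two quarterly snapshots that yields a list of review triggers. The field names, tolerance values, and trigger wording below are assumptions made for illustration; each organisation would set its own thresholds.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CycleSnapshot:
    """Hypothetical end-of-quarter measurements for one deployment."""
    quality_score: float             # 0..1, from the organisation's own evaluation
    automation_rate: float           # share of tasks completed without human handling
    inference_cost_per_task: float


def review_triggers(
    previous: CycleSnapshot,
    current: CycleSnapshot,
    cost_tolerance: float = 0.20,     # assumed: review if cost moved more than 20%
    quality_tolerance: float = 0.02,  # assumed: review if quality fell by more than 0.02
) -> List[str]:
    """Compare two cycles and list what the outer orbit review should examine."""
    triggers = []
    cost_change = (
        current.inference_cost_per_task - previous.inference_cost_per_task
    ) / previous.inference_cost_per_task
    if abs(cost_change) > cost_tolerance:
        triggers.append("revisit vendor contract and gateway routing parameters")
    if previous.quality_score - current.quality_score > quality_tolerance:
        triggers.append("investigate quality regression")
    if current.automation_rate < previous.automation_rate:
        triggers.append("re-check business case sensitivity to automation rate")
    return triggers


last_quarter = CycleSnapshot(quality_score=0.90, automation_rate=0.60, inference_cost_per_task=0.10)
this_quarter = CycleSnapshot(quality_score=0.91, automation_rate=0.62, inference_cost_per_task=0.15)

triggers = review_triggers(last_quarter, this_quarter)
```

In this invented example, quality and automation both improved, but the cost per task rose by about half, so the only trigger raised is the contract and routing review. Costs falling by a similar margin would raise the same trigger, since a cheaper option may have become viable.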

The board artefact this produces shows governance architecture and business outcomes as fixed commitments, and specific implementations as current choices with their known revision triggers. The CDO presenting it is in a different position from the one defending a stale roadmap: the plan was designed to absorb change, and here is what this quarter’s version looks like.

This is also how the work compounds. The first cycle establishes the process baseline and the measurement chain. The second refines both based on what production reveals. By the fourth or fifth cycle, adjustments are informed by actual data, not forced by a roadmap invalidation the organisation did not plan for.

The Briefing

At its Build developer conference on 2–3 June 2026, Microsoft announced the MAI model family: its own generative AI models, developed to reduce enterprise dependence on OpenAI and lower inference costs for developers. MAI-Thinking-1, currently in private preview, carries a specific attribute Microsoft has pushed hard: commercial licensing provenance, meaning the model was trained on licensed data and without distillation. The aim is to address the legal risk concerns that have been slowing enterprise procurement in regulated sectors. Azure Foundry now hosts over 12,000 models. For enterprise AI planning, the relevant signal is less which model to pick than what the announcement confirms: the vendor landscape that organisations were planning around 12 months ago has changed substantially, and locking in any single platform as a strategic commitment is a bet that requires revisiting.

IBM’s Think 2026 conference, held this week, positioned the company around an operating layer for enterprise AI — linking agent orchestration, real-time data, operations and sovereignty tools explicitly in response to AI pilots that are struggling to scale. IBM Sovereign Core reached general availability: software for building AI-ready sovereign environments that lets enterprises verify their control end-to-end. Arvind Krishna’s framing was direct: hybrid as a durable architecture, AI-first as an operating model change. The “sovereign” framing resonates differently in markets where data localisation and government oversight of AI infrastructure are live concerns — which includes most of the central European enterprise market IBM is speaking to.

Questions for your leadership team

When did we last review our gateway configuration and vendor contracts as a deliberate quarterly decision, rather than in response to a deprecation notice or a pricing surprise? If the answer is “never” or “when forced to,” what does that say about whether the routing layer actually exists as described — or whether individual model selections are still being made by hand?
If someone on the leadership team asked tomorrow what specific P&L line our AI programme is responsible for moving — and by how much, by when — could we answer in one sentence? Or is the programme currently defined by what it is building rather than what it is changing?
For each AI deployment currently live or in development: would the human oversight model we have designed survive a tool or model replacement? How long, in practice, would a model swap take from decision to production today?
Looking at AI initiatives we have cancelled or stalled in the last 12 months — were they cancelled because the technology did not work, or because the original case was built on assumptions that shifted underneath it? What would a 90-day planning loop have changed?

Summary

Detailed 18-month roadmaps break when the technology moves. The outcome commitment is fixed and precise; the implementation is provisional and revisable. Planning in a fast-moving environment is less about shortening the horizon than about knowing which commitments the plan is actually making — and keeping the rest open to quarterly revision.

Stay balanced,
Krzysztof Goworek

Krzysztof Goworek is founder of Quintant — AI advisory that gets enterprises from experiment to production value.