TL;DR

A recent whitepaper from Google argues that the most significant change in software development is not the AI model itself but how developers harness and verify it. The model accounts for only 10% of behavior; the rest is in configuration and context engineering. This shift impacts how organizations should invest in AI tools and practices.

Google’s latest whitepaper, “The New SDLC With Vibe Coding,” emphasizes that the most crucial aspect of AI-driven software development is not the size of the model but the harness, configuration, and context engineering surrounding it. This insight challenges the common focus on model advancements and suggests a fundamental shift in how organizations should approach AI integration.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that 85% of professional developers now use AI coding agents regularly, with 51% using them daily. It highlights that roughly 41% of all new code is AI-generated. Crucially, the authors argue that the model itself accounts for only about 10% of the system’s behavior, while the harness — including prompts, tools, rules, and context — makes up the remaining 90%.

Concrete examples include Terminal Bench 2.0, a public benchmark where changing only the harness moved an agent from outside the top 30 to the top 5 with the same model, and another case where tweaking prompts and middleware improved performance by 13.7 points. The paper argues that most AI agent failures are configuration failures (a missing tool, a vague rule, a noisy context) rather than model limitations, which makes the surrounding infrastructure the strategically important part of the system.

At a glance
Report
When: published early 2026
The development: Google’s new whitepaper highlights that the core of AI development is shifting from model size to harness and context engineering, redefining best practices in SDLC.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary — the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.
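One way to picture the "tests plus evals" rule is a small release gate. The sketch below is illustrative only and is not taken from the whitepaper: the names `run_gate`, `GateResult`, and `slugify`, the 0.8 threshold, and the sample eval scores are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class GateResult:
    tests_passed: bool
    eval_score: float
    shipped: bool


def run_gate(
    tests: Sequence[Callable[[], bool]],
    eval_scores: Sequence[float],
    min_eval_score: float = 0.8,  # hypothetical threshold, tune per project
) -> GateResult:
    """Ship only if every deterministic test passes AND the mean eval score clears the bar."""
    tests_passed = all(test() for test in tests)
    # No evals at all counts as a failing score: unverified output is not shipped.
    eval_score = sum(eval_scores) / len(eval_scores) if eval_scores else 0.0
    shipped = tests_passed and eval_score >= min_eval_score
    return GateResult(tests_passed, eval_score, shipped)


def slugify(title: str) -> str:
    """Stand-in for AI-generated code under test (hypothetical)."""
    return "-".join(title.lower().split())


# Deterministic behavior is checked by a test; the fuzzy part is represented
# by made-up eval scores (for example, rubric grades between 0 and 1).
result = run_gate(
    tests=[lambda: slugify("Hello World") == "hello-world"],
    eval_scores=[0.9, 0.85, 0.7],
)
print(result)
```

The design choice worth noting is that an empty eval list fails the gate, which mirrors the claim that tests alone are not enough.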
The idea worth building your strategy around
Agent = Model + Harness
MODEL: ~10%
HARNESS: prompts · tools · context · hooks · sandboxes · observability
~90% IS YOUR SURFACE AREA, NOT THE PROVIDER’S
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.
“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
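If most failures are configuration failures, part of the harness can be checked mechanically before an agent ever runs. The following is a minimal sketch under stated assumptions: the `Harness` fields, the `lint_harness` function, and the word-count and document-count limits are hypothetical and only approximate the three failure types named above (a missing tool, a vague rule, a noisy context).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Harness:
    """Hypothetical description of everything around the model."""
    system_prompt: str
    tools: set[str] = field(default_factory=set)
    rules: list[str] = field(default_factory=list)
    context_docs: list[str] = field(default_factory=list)


def lint_harness(
    harness: Harness,
    required_tools: set[str],
    min_rule_words: int = 4,    # assumption: very short rules tend to be vague
    max_context_docs: int = 20,  # assumption: more documents means more noise
) -> list[str]:
    """Return a list of human-readable configuration problems."""
    problems: list[str] = []
    for tool in sorted(required_tools - harness.tools):
        problems.append(f"missing tool: {tool}")
    for rule in harness.rules:
        if len(rule.split()) < min_rule_words:
            problems.append(f"vague rule: {rule!r}")
    if len(harness.context_docs) > max_context_docs:
        problems.append(f"noisy context: {len(harness.context_docs)} documents loaded")
    return problems


harness = Harness(
    system_prompt="You are a coding agent working in this repository.",
    tools={"read_file"},
    rules=["Be careful", "Run the test suite before committing"],
)
for problem in lint_harness(harness, required_tools={"read_file", "run_tests"}):
    print(problem)
```

The heuristics are deliberately crude; the point is only that harness problems are inspectable and fixable by the team that owns the harness, independent of the model provider.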
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: Low CapEx · High OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: High CapEx · Low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing, with cheap models for the easy work.
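The model-routing lever can be sketched in a few lines. Everything here is an assumption for illustration: the tier names, the per-token prices, the keyword heuristic in `estimate_difficulty`, and the token count per task are invented, and a real router would likely use a classifier or past success rates instead of keywords.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical price, not a real vendor rate


CHEAP = ModelTier("small-model", 0.10)
STRONG = ModelTier("frontier-model", 1.00)

# Assumption: these words hint at multi-step, higher-risk work.
HARD_MARKERS = ("refactor", "architecture", "migrate", "concurrency", "security")


def estimate_difficulty(task: str) -> int:
    """Crude heuristic: count how many hard-work markers appear in the task."""
    lowered = task.lower()
    return sum(marker in lowered for marker in HARD_MARKERS)


def route(task: str, threshold: int = 1) -> ModelTier:
    """Send easy work to the cheap tier and everything else to the strong tier."""
    return STRONG if estimate_difficulty(task) >= threshold else CHEAP


def estimated_cost(tasks: Iterable[str], tokens_per_task: int = 2000) -> float:
    """Total cost if each task consumes the same (assumed) number of tokens."""
    return sum(route(t).cost_per_1k_tokens * tokens_per_task / 1000 for t in tasks)


tasks = [
    "Rename a variable in utils.py",
    "Refactor the auth module for security",
]
for task in tasks:
    print(task, "->", route(task).name)
print("estimated cost:", estimated_cost(tasks))
```

Comparing `estimated_cost` with the cost of sending every task to the strong tier gives a rough sense of what routing saves, under these made-up prices.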
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Implications for AI Development and Investment Strategies

This shift means organizations should prioritize harness design, context management, and verification over merely adopting the latest models. The whitepaper suggests that cost efficiency and system robustness depend more on configuration and scaffolding than on model size. As a result, the traditional focus on model performance must give way to skills in context engineering, tooling, and verification. This could reshape vendor choices, internal development priorities, and long-term AI strategy, emphasizing durability and control over raw model capabilities.

Evolution of AI Development Practices and Industry Trends

Until recently, the AI community largely concentrated on improving model architectures and increasing model sizes, often equating bigger models with better performance. However, the widespread adoption of AI coding agents has shown that configuration, scaffolding, and context management are critical for effective AI deployment. The whitepaper builds on earlier insights from Andrej Karpathy and others, extending the idea that prompt engineering alone is insufficient without robust scaffolding.

By early 2026, the industry has seen a surge in AI usage, with more organizations recognizing that costs and failures are driven by configuration, not just the model’s raw power. This aligns with broader trends toward modular, maintainable AI systems that can be tailored and verified at scale.

“The behavior you experience in AI tools is dominated by scaffolding you can build, own, and improve, not the frontier model itself.”

— Addy Osmani

Unclear Aspects of Implementation and Industry Adoption

While the whitepaper provides compelling evidence that harness and context are dominant, it remains to be seen how quickly organizations will shift their practices and investments accordingly. Specific guidelines for best practices, tooling standards, and training are still emerging, and the long-term impact on model development cycles is uncertain. Additionally, the extent to which this approach can be standardized across different industries and use cases is not yet clear.

Next Steps for Organizations and Developers in AI Strategy

Organizations should evaluate their current AI workflows, emphasizing harness design, context management, and verification processes. Investment in tooling, training, and best practices for configuration will be critical. Industry groups and vendors are likely to develop standards and frameworks to support this shift. Monitoring early adopters’ experiences and participating in collaborative efforts will help shape effective strategies moving forward.

Key Questions

Why is the model size less important than the harness?

The whitepaper shows that behavior and performance are mostly determined by how the AI is configured, scaffolded, and verified. The model itself is only about 10% of the equation; the rest comes from the surrounding infrastructure.

How does this change current AI development practices?

It shifts focus from solely improving models to building robust scaffolding, context management, and verification systems. Teams will need to develop skills in configuration, tooling, and system design.

What are the risks of focusing less on models?

The whitepaper frames the larger risk as the opposite one: overemphasizing models could lead to neglecting system robustness, security, and cost efficiency. Proper harness and context management are essential to prevent failures and control costs.

Will this approach reduce AI development costs?

Potentially, in the long run. The whitepaper suggests that disciplined engineering with a proper harness can reduce operational costs by minimizing token waste and failure rates, but only after an upfront investment in specs, evals, and context.

What should companies do now to adapt?

Companies should invest in developing and standardizing harnessing strategies, tooling, and verification processes and train teams in context engineering to stay competitive.

Source: ThorstenMeyerAI.com