AI

Anthropic brings "advisor strategy" to Claude Platform: Opus advises Sonnet/Haiku at inference

Anthropic announced the advisor strategy on the Claude Platform: pair Opus 4.6 as a planning/critique advisor with Sonnet 4.6 or Haiku 4.5 as the executing model. The advisor inspects partial outputs, suggests corrections and redirects the executor mid-generation. On SWE-bench Multilingual, Sonnet+Opus-advisor scores 2.7 percentage points higher than Sonnet alone, at roughly 1.3x the cost vs 7x the cost of running Opus end-to-end. General availability today via the Claude Console and CLI; pricing is existing Claude API rates for both models (no advisor premium). Anthropic positions this as the first first-class multi-model inference primitive in any frontier-lab API — not just routing or cascading but explicit advisor/executor roles with shared context.
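Anthropic has not published the request shape for advisor mode, but the control flow the announcement describes — an advisor inspecting partial output and redirecting the executor mid-generation — can be sketched client-side with stubbed models. Everything below (function names, the redirect logic, the model roles) is illustrative, not Anthropic's API; in practice each stub would be a call to the Messages API with the appropriate model ID.

```python
# Sketch of an advisor/executor loop: a stronger "advisor" model reviews
# the executor's partial output each round and either approves it or
# returns a correction that is fed back into the executor's context.
# Both models are stubbed as plain functions so the flow runs offline.

def executor(task, feedback):
    """Stub for the executing model (e.g. a Sonnet-class model)."""
    draft = f"draft for: {task}"
    if feedback:
        draft += f" [revised per: {feedback[-1]}]"
    return draft

def advisor(task, partial_output):
    """Stub for the advising model (e.g. an Opus-class model).

    Returns a correction string, or None to approve the draft.
    """
    if "[revised" not in partial_output:
        return "tighten the plan before executing"
    return None

def advised_generation(task, max_rounds=3):
    feedback = []
    for _ in range(max_rounds):
        draft = executor(task, feedback)
        note = advisor(task, draft)
        if note is None:
            return draft          # advisor approved
        feedback.append(note)     # redirect the executor and retry
    return draft                  # give up after max_rounds

result = advised_generation("fix failing test in parser.py")
print(result)
```

The key design point the announcement emphasizes is shared context: the advisor sees the executor's partial output directly rather than routing the whole task to a different model.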

Anthropic · Claude · Opus · Sonnet · SWE-bench · Multi-Model

Why it matters

Advisor-mode is the first API-level primitive for multi-model inference at a frontier lab — and it's interesting because the economics finally make sense. 2.7pp on SWE-bench Multilingual for 1.3x cost (vs 7x for pure Opus) is exactly the kind of unit economics that lets enterprise buyers say yes. Expect OpenAI and DeepMind to fast-follow with analogous APIs within 90 days; expect evals to shift toward reporting advised-vs-unadvised numbers separately. Longer term, this normalizes a pattern where models are graded per-dollar rather than per-token, which is what the enterprise market actually wants.
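The unit economics above reduce to simple arithmetic. A quick sketch, with cost normalized to Sonnet-alone = 1.0 per the announcement's multipliers (the baseline score is unknown, so only the marginal cost of the advised gain is computed):

```python
# Reported figures: +2.7pp on SWE-bench Multilingual at 1.3x cost,
# versus 7x cost for running Opus end-to-end.
cost = {"sonnet": 1.0, "sonnet+opus_advisor": 1.3, "opus": 7.0}

# Marginal cost of the advised improvement: 0.3 extra cost-units buy 2.7pp.
cost_per_pp = (cost["sonnet+opus_advisor"] - cost["sonnet"]) / 2.7
print(f"marginal cost per percentage point: {cost_per_pp:.3f} cost-units")
```

Whether pure Opus is a better deal depends on its (unreported here) score delta over Sonnet; it would need to clear roughly +54pp at the same marginal rate, which is why the per-dollar framing favors the advisor setup.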

Impact scorecard

Overall: 7.06/10
Stakes: 7.0
Novelty: 7.5
Authority: 9.0
Coverage: 5.5
Concreteness: 8.5
Social: 8.0
FUD risk: 2.0
Coverage: 10 outlets · 3 tier-1
@AnthropicAI, The Verge, TechCrunch, The Information, SemiAnalysis, Latent Space podcast
X / Twitter: 4,600 mentions
@claudeai · 5,200 likes
@bcherny · 2,100 likes
Reddit: 2,700 upvotes
r/ClaudeAI, r/LocalLLaMA, r/MachineLearning

Trust check

high

Anthropic is a primary vendor source announcing its own product, so the facts (availability, pricing model, advisor/executor architecture) are high-confidence. The 2.7pp SWE-bench delta is vendor-reported — credible but not yet independently replicated, though the methodology is published on Anthropic's blog. FUD risk is low, but watch for independent eval teams (Latent Space, Artificial Analysis) confirming or contradicting the numbers over the next two weeks.

Primary source ↗