Anthropic released Claude Opus 4.7 on April 16, 2026, and the official headline is simple: prices are unchanged from Opus 4.6. $5 per million input tokens, $25 per million output tokens, up to 90% savings with prompt caching, and 50% off with batch processing. If you stop reading the release notes there, you will miss the part that matters most to anyone running Opus at scale.
Opus 4.7 ships with a new tokenizer that can produce up to 35% more tokens for the same input text. Your real bill per request can go up even though the rate card did not. This post breaks down the pricing mechanics, runs the math on three realistic workloads, and compares Opus 4.7 to Sonnet 4.6, Haiku 4.5, and the older Opus models, so you can decide whether to migrate, stay, or split traffic across models.
Opus 4.7 keeps the same sticker price as Opus 4.6, 4.5, and 4.1. What changed is how text turns into tokens.
If you already run Opus 4.6 workloads, your most likely outcome is a cost increase between 0% and 35% per request on the same prompts, driven entirely by the tokenizer change. Anthropic did not raise prices. Your bill may still grow.
Tokenization converts raw text and images into the numerical units the model charges for. Opus 4.7 uses a new tokenizer that Anthropic says contributes to the model’s accuracy and instruction-following gains. The tradeoff is density. The same paragraph of English prose, the same Python function, the same JSON payload, can break into more tokens in 4.7 than it did in 4.6. The public estimate is a 1.0x to 1.35x multiplier, with the upper end showing up most often on code, structured data, and non-English text.
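To make the multiplier concrete, here is a minimal sketch of per-request cost under the density change, using the public rate card and the 1.0x–1.35x range above. The function name and sample token counts are invented for illustration, and it assumes the multiplier applies to output tokens as well as input, which will vary with what the model generates.

```python
# Illustrative sketch: effective per-request cost under the tokenizer
# density change. Rates are from the public rate card; the multiplier
# range is Anthropic's public 1.0x-1.35x estimate.

OPUS_INPUT_RATE = 5.00 / 1_000_000    # $ per input token
OPUS_OUTPUT_RATE = 25.00 / 1_000_000  # $ per output token

def request_cost(input_tokens_46: int, output_tokens_46: int,
                 density_multiplier: float) -> float:
    """Cost of one request on Opus 4.7, given the token counts the same
    prompt produced on Opus 4.6 and an assumed density multiplier.
    Assumes the multiplier applies to both input and output text."""
    input_47 = input_tokens_46 * density_multiplier
    output_47 = output_tokens_46 * density_multiplier
    return input_47 * OPUS_INPUT_RATE + output_47 * OPUS_OUTPUT_RATE

baseline = request_cost(10_000, 1_000, 1.0)     # same density as 4.6
worst_case = request_cost(10_000, 1_000, 1.35)  # upper end of the estimate
print(f"baseline: ${baseline:.4f}, worst case: ${worst_case:.4f}")
```

The same rate card produces two different bills depending only on the multiplier, which is the whole story of this release.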
Three practical implications:

- The effective cost of an unchanged prompt can rise by up to 35%, even though the rate card did not move.
- The impact is uneven: code, structured data, and non-English text sit at the high end of the multiplier, while plain English prose often lands near 1.0x.
- Caching and batch discounts can claw back most or all of the difference.
Before migrating a production workload, replay real traffic side by side and measure the effective cost delta. Do not trust the 35% ceiling as a flat estimate, and do not trust 0% either.
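A replay comparison can be as simple as recording per-request token counts from both models (e.g. from the API's usage metadata) and summarizing the distribution of ratios rather than a single average. This sketch uses invented sample numbers; the function name is mine.

```python
# Hypothetical replay summary: per-request token-count ratios, 4.7 over
# 4.6, from the same prompts run against both models. Sample counts are
# invented for illustration.

def density_multipliers(counts_46: list[int], counts_47: list[int]) -> list[float]:
    """Per-request token-count ratio, 4.7 over 4.6."""
    return [c47 / c46 for c46, c47 in zip(counts_46, counts_47)]

tokens_46 = [8_200, 11_500, 6_900, 14_000]  # invented 4.6 counts
tokens_47 = [9_100, 15_300, 7_000, 18_600]  # invented 4.7 counts

mults = sorted(density_multipliers(tokens_46, tokens_47))
# Report the spread, not just a mean: the tails drive budget surprises.
print(f"min {mults[0]:.2f}x, upper-median {mults[len(mults) // 2]:.2f}x, "
      f"max {mults[-1]:.2f}x")
```

Even in this toy sample the spread runs from roughly 1.01x to 1.33x, which is why a flat estimate in either direction misleads.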
The full Claude model family, priced for comparison:
| Model | Input ($/1M) | Output ($/1M) | Context | Best for |
|---|---|---|---|---|
| Claude Opus 4.7 | $5 | $25 | 1M tokens | Frontier coding, agents, high-res vision |
| Claude Opus 4.6 | $5 | $25 | 1M tokens | Still-capable coding, lower effective cost/request |
| Claude Sonnet 4.6 | $3 | $15 | 1M tokens | Default for most production inference |
| Claude Haiku 4.5 | $1 | $5 | 200K tokens | High-volume, low-latency, simpler tasks |
Two things jump out. Sonnet 4.6 is 40% cheaper per input token and 40% cheaper per output token than Opus, and for most production inference (classification, RAG responses, content generation, basic tool use) it remains the right default. Opus 4.7 is a premium model priced for a specific use case: autonomous coding agents, long-horizon tasks, and work where quality differentiates revenue. Haiku 4.5 is 5x cheaper than Opus on both input and output, and remains the right call for extraction, routing, or moderation at volume.
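The gap is easiest to see priced out for one concrete request. This sketch applies the table's rates to an invented workload shape (20K input, 2K output tokens); the dictionary and function names are mine.

```python
# Per-request cost across the family for one example workload shape.
# Rates are from the pricing table; the request shape is invented.

RATES = {  # model: (input $/1M tokens, output $/1M tokens)
    "opus-4.7": (5.0, 25.0),
    "sonnet-4.6": (3.0, 15.0),
    "haiku-4.5": (1.0, 5.0),
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Standard-rate cost in dollars for one request."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

for model in RATES:
    print(f"{model}: ${cost(model, 20_000, 2_000):.4f}")
```

At this shape Sonnet runs 40% below Opus and Haiku 80% below, before any tokenizer effect, which is the multiplier to keep in mind when deciding where traffic belongs.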
These numbers are illustrative, not quotes. They exist to show the shape of the cost surface, which now has three axes: sticker price (flat), tokenizer density (up to +35%), and discount tier (caching or batch).
For RAG, stay on Sonnet unless quality evaluation shows a clear Opus-justified lift. Most teams overpay by defaulting to Opus here.
Batch is the single biggest discount available for teams that can tolerate minutes-to-hours latency.
The 35% tokenizer penalty is mostly recoverable. Cache reads are priced at roughly 10% of the standard input rate, which means a workload with long, stable system prompts or reused document context can absorb the tokenizer change and still come out ahead versus naive usage. Two patterns pay off consistently:

- Prompt caching. Structure requests so the long, stable prefix (system prompt, tool definitions, reference documents) is cacheable, and pay the ~10% cache-read rate on every request after the first.
- Batch processing. The 50% discount stacks on top of the standard rate and removes rate-limit pressure from real-time traffic. If you run nightly summarization, backfills, evaluation sweeps, red-team runs, or anything where a minutes-to-hours SLA is acceptable, route it through the Batch API.
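To see how caching absorbs the density change, this sketch blends full-rate and cache-read-rate input tokens, assuming cache reads at 10% of the standard rate per the ~90% savings figure. The 80% hit fraction and token counts are invented for illustration.

```python
# Sketch: blended input cost with prompt caching, assuming cache reads
# at 10% of the standard Opus input rate. Numbers are illustrative.

INPUT_RATE = 5.0 / 1_000_000  # Opus input, $ per token

def effective_input_cost(tokens: float, cache_hit_fraction: float) -> float:
    """Blend of full-rate and cache-read-rate input tokens."""
    cached = tokens * cache_hit_fraction
    fresh = tokens - cached
    return fresh * INPUT_RATE + cached * INPUT_RATE * 0.10

naive_46 = effective_input_cost(10_000, 0.0)        # 4.6 tokens, no caching
dense_47_cached = effective_input_cost(13_500, 0.8)  # 1.35x tokens, 80% cached
print(f"4.6 uncached: ${naive_46:.4f}; "
      f"4.7 dense with 80% cache hits: ${dense_47_cached:.4f}")
```

Even with the full 1.35x density penalty applied, the cached 4.7 request in this toy example costs well under half the uncached 4.6 baseline, which is the sense in which the penalty is "mostly recoverable."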
Opus 4.7 is available on the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry, plus the consumer Claude apps and GitHub Copilot (replacing Opus 4.5 and 4.6 in the model picker over the coming weeks). Your Opus rate limit is pooled across Opus 4.7, 4.6, 4.5, 4.1, and 4, so adding 4.7 traffic will not bypass your existing quota. Plan a gradual cutover.
Migration checklist:

- Replay a sample of real production traffic through 4.7 and measure the per-request token-count and cost delta.
- Re-run your quality evals on your own tasks; do not assume benchmark gains transfer.
- Verify prompt-cache hit rates hold up, since cached prefixes are where most of the tokenizer offset comes from.
- Plan a gradual cutover: Opus rate limits are pooled across versions, so 4.7 traffic draws from the same quota.
Upgrade if you run autonomous coding agents, vision-heavy workflows (4.7 triples the pixel budget to 3.75 MP, meaningful for UI and document understanding), or tasks where 4.6 currently leaves quality on the table. The SWE-bench Pro jump from 53.4% to 64.3% is not cosmetic. CursorBench went from 58% to 70%. These gains show up on real production tasks, not just benchmarks.
Stay on 4.6, or move to Sonnet 4.6, if your workload is cost-sensitive, you have already tuned prompts for 4.6 density, or your evals do not show a clear quality delta that justifies the effective cost increase. The honest test: if your product earns on the order of $0.50 in revenue per call and Opus 4.7 lifts quality by 3 percentage points, the extra cost is almost certainly worth it. If per-call revenue is much lower and the lift is marginal, it probably is not.
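That test reduces to one comparison: expected revenue gain per call versus extra model cost per call. A back-of-envelope sketch, with all numbers invented and the simplifying assumption that a quality-point lift translates linearly into revenue:

```python
# Back-of-envelope upgrade test. Assumes (simplistically) that each
# quality percentage point converts linearly into revenue per call.
# All inputs are invented for illustration.

def upgrade_worth_it(revenue_per_call: float, quality_lift_pp: float,
                     cost_delta_per_call: float) -> bool:
    """True if the expected revenue gain per call exceeds the extra cost."""
    expected_gain = revenue_per_call * (quality_lift_pp / 100)
    return expected_gain > cost_delta_per_call

# $0.50/call revenue, 3pp lift, ~$0.01 extra model cost per call:
print(upgrade_worth_it(0.50, 3.0, 0.010))  # True: upgrade pays
# $0.05/call revenue, 0.5pp lift, same cost delta:
print(upgrade_worth_it(0.05, 0.5, 0.010))  # False: stay put
```

The linear-conversion assumption is generous to the upgrade; if quality lifts only matter past a threshold, run the test against your own conversion curve instead.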
**Did Opus 4.7 prices go up from Opus 4.6?** Per-token prices are identical. Because Opus 4.7 uses a new tokenizer, the same text can produce up to 35% more tokens, which means the effective cost per request can rise even though the rate card did not change.
**What does Claude Opus 4.7 cost?** $5 per million input tokens and $25 per million output tokens. Prompt caching offers up to 90% savings on cache reads, and batch processing offers a 50% discount on async workloads.
**Is Opus 4.7 the right default for production inference?** No. Sonnet 4.6 is $3 input and $15 output, 40% cheaper per token than Opus. For most production inference, Sonnet remains the cost-effective default.
**Does prompt caching still work with Opus 4.7?** Yes. Cache reads are still discounted by up to 90%, and caching is the most reliable way to offset the tokenizer change.
**When will Opus 4.6 be retired?** Anthropic has not published a sunset date for Opus 4.6 as of the 4.7 launch. Opus rate limits are pooled across versions, so you can mix traffic during migration.
**What actually improved in Opus 4.7?** SWE-bench Pro jumped from 53.4% to 64.3%, CursorBench from 58% to 70%, vision resolution tripled to 3.75 MP, and long-context retrieval improved. The context window stays at 1M tokens.
Anthropic kept the rate card stable on purpose, and that is a gift to anyone building a budget. But “pricing unchanged” is not the same as “cost unchanged.” The 35% tokenizer ceiling is the real pricing story for Opus 4.7. Measure it on your own traffic before you migrate, lean hard on caching and batch to claw back the difference, and be honest about which workloads actually need a frontier model at all. For many teams, the right answer after reading this is not “upgrade to 4.7,” it is “move half the traffic to Sonnet.”
• Introducing Claude Opus 4.7 – Anthropic
• What’s new in Claude Opus 4.7 – Claude API Docs