Is Claude Opus 4.7 Smarter or Dumber? My Honest Take After 24 Hours

April 17, 2026

Anthropic released Claude Opus 4.7 yesterday, and within hours, the internet couldn’t agree on whether it got smarter or dumber. An AMD senior director called it unusable. GitHub’s changelog praised its multi-step task performance. Reddit is split between “serious regression” and “best model I’ve ever used.”

I’ve been using Opus 4.6 daily in my marketing workflow for months. I just spent 24 hours on Opus 4.7. Here’s my honest take — and more importantly, what I think marketers should actually do about it.

The Benchmarks Say It’s Better

Let’s start with what the numbers show, because they’re not ambiguous:

  • SWE-bench Verified: 87.6% (up from 80.8% on Opus 4.6) — a 7-point jump that puts it ahead of Gemini 3.1 Pro
  • SWE-bench Pro: 64.3% (up from 53.4%) — an 11-point improvement, now ahead of both GPT-5.4 and Gemini on multi-language coding
  • MCP-Atlas (tool use): 77.3% — best-in-class for complex multi-turn tool-calling workflows
  • OSWorld (computer use): 78.0% — up from 72.7%, with 3x vision resolution improvement
  • GPQA Diamond (reasoning): 94.2% — competitive with GPT-5.4 Pro and Gemini 3.1 Pro

The one area that slipped: BrowseComp (agentic search) dropped from 83.7% to 79.3%. If your agents do a lot of web research, that matters. For everything else, the benchmarks say it improved.

Source: Vellum AI’s benchmark breakdown and Anthropic’s official announcement.

The Users Say It’s Complicated

Here’s where it gets interesting. The benchmarks don’t match the vibe.

A senior director at AMD posted on GitHub that “Claude has regressed to the point it cannot be trusted to perform complex engineering.” Axios reported on this as a sign that Anthropic may be deliberately scaling back compute to manage costs.

On Reddit’s r/ClaudeAI, the top post is titled “Claude Opus 4.7 is a serious regression, not an upgrade.” The consensus in that thread: Anthropic is forcing the model into a low-effort, low-compute default state that produces worse output despite better underlying capabilities.

But then on r/ClaudeCode, someone posted “Opus 4.7 just dropped. It’s smarter. It’s also going to destroy your quota even faster.” They noted SWE-bench jumping from 80.8% to 87.6% and acknowledged the model is genuinely better — but warned that a smarter model doing blind exploration burns tokens faster.

And GitHub’s changelog says Opus 4.7 “delivers stronger multi-step task performance and more reliable agentic execution.”

So which is it? Smarter or dumber?

My Take: It’s Smarter, But It’s Also More Expensive and Less Patient

After 24 hours with Opus 4.7, here’s what I think is actually happening:

The model is better at complex tasks. When I give it a well-defined problem — write this function, refactor this module, debug this error — it nails it faster and more accurately than 4.6. The coding benchmarks aren’t lying.

But it’s worse at simple tasks. It overthinks things. Ask it to write a meta description and it’ll give you three paragraphs of reasoning about why meta descriptions matter before writing one. The “low-compute default state” complaint from Reddit matches my experience — it’s like the model is always in “think hard” mode even when you don’t need it to think hard.

And it’s more expensive. Pricing stayed the same ($5/$25 per million tokens), but the model uses more tokens per interaction because it’s “thinking” more. One Reddit user reported a 35% spike in usage in a single prompt session. That’s real money if you’re running automated workflows.
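To make the budget impact concrete, here's a minimal sketch of the math. The rates are the published $5/$25 per-million-token pricing; the 35% usage increase is the Reddit-reported figure, and the baseline token volumes are hypothetical numbers I picked for illustration:

```python
# Rough monthly cost estimate for an automated workflow.
# Rates: published Opus pricing ($5 input / $25 output per million tokens).
# The +35% usage figure is the Reddit-reported spike, used as an assumption.

INPUT_RATE = 5.00 / 1_000_000    # dollars per input token
OUTPUT_RATE = 25.00 / 1_000_000  # dollars per output token

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost for a month's worth of tokens at the rates above."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Baseline: a workflow that burns 40M input / 10M output tokens a month.
baseline = monthly_cost(40_000_000, 10_000_000)

# Same workflow if the model uses 35% more tokens at unchanged prices.
heavier = monthly_cost(int(40_000_000 * 1.35), int(10_000_000 * 1.35))

print(f"baseline:       ${baseline:,.2f}")  # baseline:       $450.00
print(f"with +35% usage: ${heavier:,.2f}")  # with +35% usage: $607.50
```

Same prices, same workflow, $157 more per month — which is why "pricing stayed the same" doesn't mean your bill did.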

The context window issue is real. Multiple users report Opus 4.7 losing context more frequently than 4.6. If you’re running long conversations or multi-turn agent workflows, this is a problem that benchmarks don’t capture.

What This Means for Marketers

Here’s where I get practical. If you’re a marketer using AI in your workflow — content writing, SEO analysis, ad copy, data crunching — should you upgrade to Opus 4.7 right now?

My Recommendation: Wait 2 Weeks

I know that’s not the exciting answer. Everyone wants to use the newest model immediately. But here’s why I’m waiting:

1. The “new model smell” bias is real. Every time a new model drops, the first 48 hours of reviews are either “this is amazing” or “this is broken.” Neither is reliable. The actual quality signal takes a week or two to stabilize as Anthropic tunes compute allocation and fixes early bugs.

2. Your existing prompts are optimized for 4.6. If you’ve spent months tuning your prompts, system instructions, and workflows for Opus 4.6, switching to 4.7 will break things. Not because 4.7 is worse — because your prompts were calibrated for a different model’s behavior patterns. You’ll need to re-tune.

3. The cost increase is real. If your automated workflows (blog writing, SEO analysis, report generation) are token-budgeted for 4.6 pricing behavior, 4.7’s tendency to “think harder” will blow your budget. I’d rather run 4.6 at known cost for two more weeks than discover my monthly bill doubled.

4. Anthropic will patch it. They always do. Opus 4.5 had a similar “is it better or worse?” launch. Within two weeks, they adjusted the compute allocation and most complaints disappeared. I expect the same pattern here.

What I’m Actually Doing

I’m keeping my production workflows on Opus 4.6 for now. I’ve spun up a separate test environment running 4.7 where I’m running my key workflows side-by-side and comparing output quality, token usage, and reliability.

The test that matters most to me: Can 4.7 write a 2,000-word blog post with the same quality as 4.6 in fewer tokens? If yes, I switch. If it’s better quality but 40% more expensive, I stay on 4.6 for content and use 4.7 only for complex technical tasks.
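That test boils down to a simple decision rule, sketched below. Everything here is my own framing with illustrative numbers — the function name, thresholds, and token counts are hypothetical, not anything from Anthropic's tooling:

```python
# Decision rule for the side-by-side test: switch only if quality holds
# AND token usage (a direct cost proxy, since per-token pricing is
# unchanged) didn't grow past your tolerance.
# All names, thresholds, and numbers are illustrative.

def should_switch(old_tokens: int, new_tokens: int,
                  quality_ok: bool, max_cost_increase: float = 0.0) -> bool:
    """True if the new model matched quality and its token usage grew
    by no more than max_cost_increase (e.g. 0.10 for a 10% tolerance)."""
    if not quality_ok:
        return False
    increase = (new_tokens - old_tokens) / old_tokens
    return increase <= max_cost_increase

# Same-quality post, but the new model burned 40% more tokens: stay put.
print(should_switch(old_tokens=12_000, new_tokens=16_800,
                    quality_ok=True))   # False

# Same quality in fewer tokens: the easy switch case.
print(should_switch(old_tokens=12_000, new_tokens=10_500,
                    quality_ok=True))   # True
```

The `quality_ok` flag is the hard part, of course — that judgment stays human. The code only settles the cost side.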

For Non-Technical Marketers

If you’re using Claude through the web interface or a third-party tool (not the API), you probably won’t notice much difference. The changes are most visible in agentic workflows, coding, and long-context tasks. For writing emails, brainstorming campaigns, and analyzing data — 4.7 will feel roughly the same as 4.6, maybe slightly more verbose.

Don’t panic-switch. Don’t rebuild your workflows. Test it in a sandbox first.

The Bigger Question: Is Anthropic Releasing Too Fast?

Opus 4.5 shipped in November 2025. Opus 4.6 in February 2026. Opus 4.7 in April 2026. That’s three major Opus releases in five months.

Meanwhile, users are reporting that 4.6 was “nerfed” in the weeks before 4.7 dropped — performance degraded as Anthropic presumably shifted compute to prepare for the new model. A Reddit post titled “CLAUDE OPUS 4.6 IS NERFED!!” got 1,700 upvotes and 267 comments.

This is the part that concerns me. Not that 4.7 exists — competition is good, and better models help everyone. But the pattern of degrading the current model to make the new one look better is a trust problem. If I can’t rely on Opus 4.6 to perform consistently because Anthropic might quietly dial it back before a product launch, that affects my business.

I wrote about this dynamic in my earlier piece on Anthropic’s design tool and IPO strategy. The pressure to ship fast — before an expected October IPO at $400-500 billion — creates incentives that don’t always align with user experience.

What I’m Watching Next

  • Does Anthropic acknowledge the regression reports? So far, radio silence on the AMD post and Reddit complaints.
  • How fast do they patch? If 4.7 gets a “4.7.1” within two weeks with compute fixes, that’s a good sign. If it stays as-is for a month, that’s a pattern.
  • Does 4.6 get restored? If users report 4.6 performance returning now that 4.7 is launched, the “nerfing” theory gets confirmed.
  • The Capybara tier. If Anthropic really builds a tier above Opus at $10-15 per million input tokens (from the earlier leaks), the entire pricing structure of the AI industry shifts.

I’ll update this post as the dust settles. In the meantime, I’m staying on 4.6 for production and testing 4.7 in parallel. That’s the responsible approach for anyone running AI-dependent workflows.

New to Claude or trying to figure out which model to use? Check out my Sonnet 4.6 vs Opus 4.6 comparison for the full breakdown on which model fits which task.

Frequently Asked Questions

When was Claude Opus 4.7 released?

Claude Opus 4.7 was released on April 16, 2026, following Anthropic’s announcement on April 15. It’s available on the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.

Is Opus 4.7 better than Opus 4.6?

According to benchmarks, yes — SWE-bench jumped from 80.8% to 87.6%, and tool use improved to best-in-class at 77.3%. But user reports are mixed, with some calling it a regression due to higher token usage and context loss issues. The underlying model is better; the default compute behavior may not be.

How much does Opus 4.7 cost?

Pricing is the same as Opus 4.6: $5 per million input tokens and $25 per million output tokens. However, the model tends to use more tokens per interaction because it “thinks harder” by default, so effective costs may be higher.

Should I switch from Opus 4.6 to 4.7?

If you’re running production workflows, I recommend waiting 2 weeks. Test 4.7 in a sandbox environment first, compare token usage and output quality against your 4.6 baseline, and switch only if the quality improvement justifies any cost increase.

What is Claude Mythos?

Claude Mythos is Anthropic’s most powerful model, described as a “step change” in capability. It’s currently restricted to Project Glasswing partners for cybersecurity research and is not available to the public. Opus 4.7 is not Mythos — it’s the next step in the Opus line.

Is Anthropic releasing models too fast?

Anthropic has shipped three Opus releases in five months (4.5 in November, 4.6 in February, 4.7 in April). Users report that 4.6 was “nerfed” in the weeks before 4.7 launched, suggesting compute was shifted to the new model. Whether this pace is sustainable is an open question — especially with an IPO expected in October 2026.

Digital Marketing Strategist

Jonathan Alonso is a digital marketing strategist with 20+ years of experience in SEO, paid media, and AI-powered marketing. Follow him on X @jongeek.