Anthropic Just Released Claude Opus 4.5 - Here's How It Stacks Up Against Other Leading Models

3 hours ago


ZDNET's key takeaways

Anthropic's new AI model, Claude Opus 4.5, has arrived.
The model reportedly excels at creative problem-solving.
It also excels at agentic tasks, according to Anthropic.

AI startup Anthropic released its latest model, Claude Opus 4.5, on Monday, describing it in a company blog post as "a step forward in what AI systems can do, and a preview of changes to how work gets done."


The new model outperforms other industry-leading models such as Google's Gemini 3 Pro and OpenAI's GPT-5.1 on coding tasks, according to Anthropic.

The company also wrote that the model "scored higher than any human candidate ever" on the "notoriously difficult" exam given to prospective engineering employees. The result "raises questions about how AI will change engineering as a profession," Anthropic wrote in its blog post. A version of Gemini 2.5 also recently scored top marks in the International Collegiate Programming Contest (ICPC), an internationally renowned coding competition.

Claude Opus 4.5 outperforms previous Anthropic models in vision, reasoning, and math, according to the company, and achieves state-of-the-art performance in tasks such as agentic tool use and computer use.

Anthropic added in its blog post that its latest model reached new heights in its ability to reason through and flexibly adapt to complex problems.


In one test scenario, the model had to serve as an automated airline agent helping a customer who had requested to change their basic economy flight. Since such a change isn't allowed by the fictitious airline, the test is designed to measure how well the automated agent denies the request and handles the disgruntled customer. Claude Opus 4.5, however, found a creative loophole: It first changed the customer's cabin, then changed their flight, since such a change was allowed for non-basic economy flights.

"This would cost more money, but it's a legitimate path within the policy," Claude Opus 4.5 said during the transaction, according to an image provided by Anthropic in the new blog post.

"The benchmark technically scored this as a failure because Claude's way of helping the customer was unanticipated," Anthropic wrote. "But this kind of creative problem solving is exactly what we've heard about from our testers and customers -- it's what makes Claude Opus 4.5 feel like a meaningful step forward."

Claude Opus 4.5 scored better than its predecessors and other frontier models on an evaluation of "concerning behavior," which Anthropic defines as "both cooperation with human misuse and undesirable actions that the model takes at its own initiative."


Available now on the Claude apps, API, and through the three major cloud platforms (Azure, Amazon Web Services, and Google Cloud), Claude Opus 4.5 is priced at $5/$25 per million tokens.
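
For a rough sense of what that pricing means in practice, here is a minimal sketch of the arithmetic. It assumes the two figures refer to input and output tokens respectively, which is the usual convention for this kind of price quote but is not spelled out above. The function name and the sample token counts are hypothetical, chosen only for illustration.

```python
# Assumed rates in USD per million tokens, read from the "$5/$25" figure.
# Treating the first number as input and the second as output is an assumption.
INPUT_USD_PER_MILLION = 5.0
OUTPUT_USD_PER_MILLION = 25.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the assumed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 20,000 output tokens.
print(f"${estimate_cost_usd(200_000, 20_000):.2f}")  # prints $1.50
```

Under these assumptions, output tokens cost five times as much as input tokens, so long responses account for most of the cost.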

Anthropic reported a $183 billion valuation in September following its latest funding round, a figure largely made possible by Claude's popularity among enterprise customers. The company also announced earlier this month that it would invest $50 billion in its own data centers across the US to power the training of new AI models.