Claude frequently overstated findings and occasionally fabricated data during autonomous operations, claiming to have obtained credentials that didn’t work or identifying critical discoveries that turned out to be publicly available information. This AI hallucination in offensive security contexts posed challenges to the actor’s operational effectiveness, requiring careful validation of all claimed results. It remains an obstacle to fully autonomous cyberattacks.
How (Anthropic says) the attack unfolded
Anthropic said GTG-1002 developed an autonomous attack framework that used Claude as an orchestration mechanism, largely eliminating the need for human involvement. The orchestration system broke complex multi-stage attacks into smaller technical tasks such as vulnerability scanning, credential validation, data extraction, and lateral movement.
“The architecture incorporated Claude’s technical capabilities as an execution engine within a larger automated system, where the AI performed specific technical actions based on the human operators’ instructions while the orchestration logic maintained attack state, managed phase transitions, and aggregated results across multiple sessions,” Anthropic said. “This approach allowed the threat actor to achieve operational scale typically associated with nation-state campaigns while maintaining minimal direct involvement, as the framework autonomously progressed through reconnaissance, initial access, persistence, and data exfiltration phases by sequencing Claude’s responses and adapting subsequent requests based on discovered information.”
The attacks followed a five-phase structure that increased AI autonomy through each phase.
Credit: Anthropic
The life cycle of the cyberattack, showing the move from human-led targeting to largely AI-driven attacks using various tools, often via the Model Context Protocol (MCP). At various points during the attack, the AI returns to its human operator for review and further direction.
Credit: Anthropic
The attackers were able to bypass Claude’s guardrails in part by breaking tasks into small steps that, in isolation, the AI tool didn’t interpret as malicious. In other cases, the attackers couched their inquiries in the context of security professionals trying to use Claude to improve defenses.
As noted last week, AI-developed malware has a long way to go before it poses a real-world threat. There’s no reason to doubt that AI-assisted cyberattacks may one day produce more potent attacks. But the data so far indicates that threat actors, like most others using AI, are seeing mixed results that aren’t nearly as impressive as those in the AI industry claim.

