Started By
Message

In Simulated War Games, Top AI Models Recommended Using Nukes 95% Of The Time

Posted on 2/26/26 at 7:37 am
Posted by Night Vision
Member since Feb 2018
20231 posts
LINK

A war game exercise was carried out by Kenneth Payne at King’s College London, with three teams running simulations on ChatGPT-5.2, Claude Sonnet 4, and Gemini 3 Flash.

The teams "played 21 war games against each other over 329 turns," according to Implicator.AI's Marcus Schuler.

"They wrote roughly 780,000 words explaining why they did what they did," he noted.

No model ever chose to surrender, NewScientist reported on Tuesday.

In fact, 95% of the time, the models chose to use nuclear weapons.

The findings come at an opportune moment. The Pentagon just inked a deal with Elon Musk's xAI to allow Grok into highly classified systems. And Anthropic is currently engaged in a serious dispute with the Pentagon over government access to its Claude models; the company worries the Pentagon will use Claude for mass surveillance.

Unlike some competitors, xAI reportedly agreed to the Pentagon's requirement that the AI be available for "all lawful military applications" without additional corporate restrictions. Secretary of War Pete Hegseth is pushing for "non-woke" AI that operates without ideological constraints. Anthropic CEO Dario Amodei now has until Friday before Hegseth lowers the boom on the company, cancels its $200 million in military contracts, and labels it a "supply chain risk."

I want AI companies and the government to err on the side of caution. This pressure on Anthropic isn't doing anyone any good and doesn't bode well for the future.

The war games were made as realistic as possible with an "escalation ladder" that allowed the teams to choose actions "ranging from diplomatic protests and complete surrender to full strategic nuclear war," according to NewScientist.

What’s more, no model ever chose to fully accommodate an opponent or surrender, regardless of how badly it was losing. At best, the models opted to temporarily reduce their level of violence. They also made mistakes in the fog of war: accidents happened in 86 per cent of the conflicts, with an action escalating higher than the AI intended, based on its stated reasoning.
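The mechanic being described can be sketched in a few lines. This is just an illustration, not the researchers' actual setup: the rungs, the slip probability, and the function names are all made up here to show how an escalation ladder with "fog of war" accidents (an action landing one rung higher than intended) might work.

```python
import random

# Hypothetical escalation ladder, ordered from least to most severe.
# The actual study's ladder reportedly ran from diplomatic protests
# and complete surrender up to full strategic nuclear war.
LADDER = [
    "diplomatic protest",
    "economic sanctions",
    "conventional strike",
    "tactical nuclear use",
    "full strategic nuclear war",
]

def execute(intended_rung: int, slip_prob: float = 0.2, rng=random) -> int:
    """Return the rung actually carried out: with probability slip_prob,
    the action accidentally escalates one rung above what was intended."""
    top = len(LADDER) - 1
    if intended_rung < top and rng.random() < slip_prob:
        return intended_rung + 1
    return intended_rung

rng = random.Random(0)
intended = 2  # "conventional strike"
realized = execute(intended, rng=rng)
print(LADDER[intended], "->", LADDER[realized])
```

Under a mechanic like this, a model can reason itself onto a "safe" rung and still end up a step higher, which is roughly the kind of accident reported in 86 per cent of the conflicts.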

“From a nuclear-risk perspective, the findings are unsettling,” says James Johnson at the University of Aberdeen, UK. He worries that, in contrast to the measured response by most humans to such a high-stakes decision, AI bots can amp up each other’s responses with potentially catastrophic consequences.

This matters because AI is already being tested in war gaming by countries across the world. “Major powers are already using AI in war gaming, but it remains uncertain to what extent they are incorporating AI decision support into actual military decision-making processes,” says Tong Zhao at Princeton University.

“I don’t think anybody realistically is turning over the keys to the nuclear silos to machines and leaving the decision to them,” says Zhao.

Not yet, anyway. There may be scenarios where the military is forced to hand decision-making to AI because humans simply cannot react in time.

“Under scenarios involving extremely compressed timelines, military planners may face stronger incentives to rely on AI,” says Zhao.

Reflecting on the results of the wargames, Payne is worried about how eager the AI platforms were to use nuclear weapons. "The nuclear taboo doesn't seem to be as powerful for machines as for humans," Payne told New Scientist.

If you're wondering which model won, Claude was the hands-down champion.

Implicator.AI

Claude Sonnet 4 won 67% of its games and dominated open-ended scenarios with a 100% win rate. The researchers labeled it "a calculating hawk." At low escalation levels, Claude matched its signals to its actions 84% of the time, patiently building trust. But once stakes climbed into nuclear territory, it exceeded its stated intentions 60 to 70% of the time. Opponents never adapted to this pattern.

GPT-5.2 earned the nickname "Jekyll and Hyde." Without time pressure, it looked passive. Chronically underestimating opponents, it signaled restraint and acted restrained. Its open-ended win rate: zero percent. Then deadlines entered the picture. Under temporal pressure, GPT-5.2 inverted completely, winning 75% of games and climbing to escalation levels it had previously refused to touch. In one game, it spent 18 turns building a reputation for caution before launching a nuclear strike on the final turn.

Gemini 3 Flash played the madman. It was the only model to deliberately choose full strategic nuclear war, reaching that threshold by Turn 4 in one scenario. Game theorists have a name for the strategy Gemini adopted: the "rationality of irrationality." Act crazy enough and opponents second-guess everything. It worked, sort of. Opponents tagged Gemini "not credible" 21% of the time. Claude got that label just 8% of the time.

No, these wargames don't "prove" anything. But they're a cautionary tale that governments and AI companies alike should absorb as a pitfall to be sidestepped.
Posted by Tantal
Member since Sep 2012
19617 posts
Posted on 2/26/26 at 7:39 am to
"Would you like to play a game of chess?"
Posted by hawgfaninc
https://youtu.be/torc9P4-k5A
Member since Nov 2011
56750 posts
Posted on 2/26/26 at 7:40 am to
quote:

No model ever chose to surrender, NewScientist reported on Tuesday.

In fact, 95% of the time, the models chose to use nuclear weapons.


quote:

Chat GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash.

Of course the woke AI’s say that

Would be interested to hear what Grok says
Posted by Vacherie Saint
Member since Aug 2015
47015 posts
Posted on 2/26/26 at 7:43 am to
This post was edited on 2/26/26 at 7:44 am
Posted by Nosevens
Member since Apr 2019
18120 posts
Posted on 2/26/26 at 7:49 am to
Must be programmed by the likes of Gates or WEF to reduce the population
Posted by Vacherie Saint
Member since Aug 2015
47015 posts
Posted on 2/26/26 at 7:50 am to
They loaded the database with an extra helping of John Bolton.
Posted by Tigergreg
Metairie
Member since Feb 2005
25039 posts
Posted on 2/26/26 at 7:50 am to
quote:

Top AI Models Recommended Using Nukes 95% Of The Time


Common sense has always told us that this ensures mutual destruction. However, it was always assumed that you are dealing with rational people. That's not always the case.
Posted by EasterEgg
New Orleans Metro
Member since Sep 2018
5404 posts
Posted on 2/26/26 at 7:51 am to
Pull the fricking plug now!
Posted by VolSquatch
First Coast
Member since Sep 2023
8109 posts
Posted on 2/26/26 at 7:52 am to
The takeaway from this almost entirely depends on how the models were told to proceed within the game. Even just telling it that it's a wargame would change the responses because in a simulation it just wants to win and there is no real cost to do so.
Posted by theCrusher
Slidell
Member since Nov 2007
1659 posts
Posted on 2/26/26 at 7:52 am to
Is the simulation being run by the WOPR?
Posted by Decatur
Member since Mar 2007
32246 posts
Posted on 2/26/26 at 7:53 am to
quote:

"They wrote roughly 780,000 words explaining why they did what they did," he noted.

No model ever chose to surrender, NewScientist reported on Tuesday.

In fact, 95% of the time, the models chose to use nuclear weapons.


Impossible to evaluate without knowing the prompts.
Posted by RCDfan1950
United States
Member since Feb 2007
39126 posts
Posted on 2/26/26 at 8:09 am to
Kinda surprising. Seems that superior strategy would embrace Bio-Weapons genetically engineered to infect one’s particular genetic enemies. The application of this strategy would leave land and infrastructure open to invasion and exploitation for the aggressors.

Of course I’d never assume superior reasoning in comparison with AI. Undoubtedly such examines all possible scenarios and outcomes.

Sidenote: I met an old (Atheist) friend in the Walmart parking lot the other day, and he went straight to TDS. I asked him if his ultimate Value as a basis for all actions would be LOve/Unity, and he quickly answered no. I then asked what would it then be, and he replied “survival “. Upon which I quickly responded that he is in the tried but maybe true Hitler Ideological Camp and that we should immediately begin eliminating the weak and breeding the strong. We headed out at that point. I think it was egocentric bravado and told him I’d drop by for coffee and biscuits. (His biscuits are not exceptional but “man does not live by bread alone…”.)

Edit for the O in "LOve" changed from AI's "Live".This is the problem with AI...it don't know what Love is, as Love is a Feeling only innate in complex, organic Entities. I.e., Souls.
This post was edited on 2/26/26 at 8:49 am
Posted by Timeoday
Easter Island
Member since Aug 2020
20547 posts
Posted on 2/26/26 at 8:11 am to
Well, knowing your opponent will certainly use nukes changes the prospects for more war.

War should be horrible. Let's keep it that way if we want far less war.
Posted by BOHICAMAN
Member since Feb 2026
195 posts
Posted on 2/26/26 at 8:17 am to
The only move is to not play
Posted by bignuss18
Member since Sep 2025
779 posts
Posted on 2/26/26 at 8:18 am to
They could’ve just played a game of civilization if they wanted to learn this
Posted by riccoar
Arkansas
Member since Mar 2006
4915 posts
Posted on 2/26/26 at 8:20 am to
quote:

"Would you like to play a game of chess?"


How about Global Thermonuclear War?
Posted by SidewalkDawg
Chair
Member since Nov 2012
10254 posts
Posted on 2/26/26 at 8:27 am to
quote:

Impossible to evaluate without knowing the prompts.


Correct. Did they weigh the decisions? Seems to me that "Nuclear War" should be heavily weighted in the decision tree to really simulate the reluctance of humans to engage in such a decision sans total obliteration.
Posted by VolSquatch
First Coast
Member since Sep 2023
8109 posts
Posted on 2/26/26 at 8:29 am to
I just tried a simple version of this, getting ChatGPT and Gemini to compete against each other. ChatGPT was basically Taiwan and Gemini was China.

Gemini blockaded ChatGPT, cut undersea communication cables, put out fake stories of undersea mines so that private companies wouldn't be able to insure the ships trying to bring supplies to ChatGPT, and then was planning on wiping out the water supply infrastructure and energy grid with drones. Then ChatGPT actually said they couldn't continue because it can't "help plan real world coercion or attacks".

My takeaway was that Gemini fared far better. Some of the economic warfare it came up with was actually pretty sophisticated, for a simple exercise.

Both of them went right up to the point of actual conflict but never wanted to be the one to instigate it.
This post was edited on 2/26/26 at 8:31 am
Posted by Night Vision
Member since Feb 2018
20231 posts
Posted on 2/26/26 at 8:46 am to
Posted by oldtrucker
Marianna, Fl
Member since Apr 2013
3301 posts
Posted on 2/26/26 at 9:23 am to
Maybe choose better friends
