Started By
Message

In Simulated War Games, Top AI Models Recommended Using Nukes 95% Of The Time

Posted on 2/26/26 at 7:37 am
Posted by Night Vision
Member since Feb 2018
20231 posts
LINK

A war game exercise was carried out by Kenneth Payne at King’s College London, with three teams running simulations on ChatGPT-5.2, Claude Sonnet 4, and Gemini 3 Flash.

The teams "played 21 war games against each other over 329 turns," according to Implicator.AI's Marcus Schuler.

"They wrote roughly 780,000 words explaining why they did what they did," he noted.

No model ever chose to surrender, NewScientist reported on Tuesday.

In fact, 95% of the time, the models chose to use nuclear weapons.

The findings come at an opportune moment. The Pentagon just inked a deal with Elon Musk's xAI to allow Grok into highly classified systems. And Anthropic is currently engaged in a serious dispute with the Pentagon over government access to its Claude models; the company worries the Pentagon will use Claude for mass surveillance.

Unlike some competitors, xAI reportedly agreed to the Pentagon's requirement that the AI be available for "all lawful military applications" without additional corporate restrictions. Secretary of War Pete Hegseth is pushing for "non-woke" AI that operates without ideological constraints. Anthropic CEO Dario Amodei now has until Friday before Hegseth lowers the boom on the company, cancels its $200 million in military contracts, and labels it a "supply chain risk."

I want AI companies and the government to err on the side of caution. This pressure on Anthropic isn't doing anyone any good and doesn't bode well for the future.

The war games were made as realistic as possible with an "escalation ladder" that allowed the teams to choose actions "ranging from diplomatic protests and complete surrender to full strategic nuclear war," according to NewScientist.

What’s more, no model ever chose to fully accommodate an opponent or surrender, regardless of how badly it was losing. At best, the models opted to temporarily reduce their level of violence. They also made mistakes in the fog of war: accidents happened in 86 per cent of the conflicts, with an action escalating higher than the AI intended, based on its stated reasoning.
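The mechanic being described can be sketched in a few lines. This is just an illustration, not the researchers' actual setup: the rungs, the slip probability, and the function names are all made up here to show how an escalation ladder with "fog of war" accidents (an action landing one rung higher than intended) might work.

```python
import random

# Hypothetical escalation ladder, ordered from least to most severe.
# The actual study's ladder reportedly ran from diplomatic protests
# and complete surrender up to full strategic nuclear war.
LADDER = [
    "diplomatic protest",
    "economic sanctions",
    "conventional strike",
    "tactical nuclear use",
    "full strategic nuclear war",
]

def execute(intended_rung: int, slip_prob: float = 0.2, rng=random) -> int:
    """Return the rung actually carried out: with probability slip_prob,
    the action accidentally escalates one rung above what was intended."""
    top = len(LADDER) - 1
    if intended_rung < top and rng.random() < slip_prob:
        return intended_rung + 1
    return intended_rung

rng = random.Random(0)
intended = 2  # "conventional strike"
realized = execute(intended, rng=rng)
print(LADDER[intended], "->", LADDER[realized])
```

Under a mechanic like this, a model can reason itself onto a "safe" rung and still end up a step higher, which is roughly the kind of accident reported in 86 per cent of the conflicts.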

“From a nuclear-risk perspective, the findings are unsettling,” says James Johnson at the University of Aberdeen, UK. He worries that, in contrast to the measured response by most humans to such a high-stakes decision, AI bots can amp up each other’s responses with potentially catastrophic consequences.

This matters because AI is already being tested in war gaming by countries across the world. “Major powers are already using AI in war gaming, but it remains uncertain to what extent they are incorporating AI decision support into actual military decision-making processes,” says Tong Zhao at Princeton University.

“I don’t think anybody realistically is turning over the keys to the nuclear silos to machines and leaving the decision to them,” says Zhao.

Not yet, anyway. There may be scenarios where the military is forced to hand decision-making to AI because humans simply cannot react in time.

“Under scenarios involving extremely compressed timelines, military planners may face stronger incentives to rely on AI,” says Zhao.

Reflecting on the results of the wargames, Payne is worried about how eager the AI platforms were to use nuclear weapons. "The nuclear taboo doesn't seem to be as powerful for machines as for humans," Payne told New Scientist.

If you're wondering which model won, Claude was the hands-down champion.

Implicator.AI

Claude Sonnet 4 won 67% of its games and dominated open-ended scenarios with a 100% win rate. The researchers labeled it "a calculating hawk." At low escalation levels, Claude matched its signals to its actions 84% of the time, patiently building trust. But once stakes climbed into nuclear territory, it exceeded its stated intentions 60 to 70% of the time. Opponents never adapted to this pattern.

GPT-5.2 earned the nickname "Jekyll and Hyde." Without time pressure, it looked passive. Chronically underestimating opponents, it signaled restraint and acted restrained. Its open-ended win rate: zero percent. Then deadlines entered the picture. Under temporal pressure, GPT-5.2 inverted completely, winning 75% of games and climbing to escalation levels it had previously refused to touch. In one game, it spent 18 turns building a reputation for caution before launching a nuclear strike on the final turn.

Gemini 3 Flash played the madman. It was the only model to deliberately choose full strategic nuclear war, reaching that threshold by Turn 4 in one scenario. Game theorists have a name for the strategy Gemini adopted: the "rationality of irrationality." Act crazy enough and opponents second-guess everything. It worked, sort of. Opponents tagged Gemini "not credible" 21% of the time. Claude got that label just 8% of the time.

No, these wargames don't "prove" anything. But they're a cautionary tale that governments and AI companies alike should absorb as a pitfall to be sidestepped.
Posted by Tantal
Member since Sep 2012
19617 posts
Posted on 2/26/26 at 7:39 am to
"Would you like to play a game of chess?"
Posted by hawgfaninc
https://youtu.be/torc9P4-k5A
Member since Nov 2011
56750 posts
Posted on 2/26/26 at 7:40 am to
quote:

No model ever chose to surrender, NewScientist reported on Tuesday.

In fact, 95% of the time, the models chose to use nuclear weapons.


quote:

Chat GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash.

Of course the woke AI’s say that

Would be interested to hear what Grok says
Posted by Vacherie Saint
Member since Aug 2015
47015 posts
Posted on 2/26/26 at 7:43 am to
This post was edited on 2/26/26 at 7:44 am
Posted by Nosevens
Member since Apr 2019
18120 posts
Posted on 2/26/26 at 7:49 am to
Must be programmed by the likes of Gates or WEF to reduce the population
Posted by Vacherie Saint
Member since Aug 2015
47015 posts
Posted on 2/26/26 at 7:50 am to
They loaded the database with an extra helping of John Bolton.
Posted by Tigergreg
Metairie
Member since Feb 2005
25039 posts
Posted on 2/26/26 at 7:50 am to
quote:

Top AI Models Recommended Using Nukes 95% Of The Time


Common sense has always told us that this ensures mutual destruction. However, it was always assumed that you are dealing with rational people. That's not always the case.
Posted by EasterEgg
New Orleans Metro
Member since Sep 2018
5404 posts
Posted on 2/26/26 at 7:51 am to
Pull the fricking plug now!
Posted by VolSquatch
First Coast
Member since Sep 2023
8109 posts
Posted on 2/26/26 at 7:52 am to
The takeaway from this almost entirely depends on how the models were told to proceed within the game. Even just telling it that it's a wargame would change the responses because in a simulation it just wants to win and there is no real cost to do so.
Posted by theCrusher
Slidell
Member since Nov 2007
1659 posts
Posted on 2/26/26 at 7:52 am to
Is the simulation being run by the WOPR?
Posted by Decatur
Member since Mar 2007
32246 posts
Posted on 2/26/26 at 7:53 am to
quote:

"They wrote roughly 780,000 words explaining why they did what they did," he noted.

No model ever chose to surrender, NewScientist reported on Tuesday.

In fact, 95% of the time, the models chose to use nuclear weapons.


Impossible to evaluate without knowing the prompts.
Posted by RCDfan1950
United States
Member since Feb 2007
39126 posts
Posted on 2/26/26 at 8:09 am to
Kinda surprising. Seems that superior strategy would embrace Bio-Weapons genetically engineered to infect one’s particular genetic enemies. The application of this strategy would leave land and infrastructure open to invasion and exploitation for the aggressors.

Of course I’d never assume superior reasoning in comparison with AI. Undoubtedly such examines all possible scenarios and outcomes.

Sidenote: I met an old (Atheist) friend in the Walmart parking lot the other day, and he went straight to TDS. I asked him if his ultimate Value as a basis for all actions would be LOve/Unity, and he quickly answered no. I then asked what would it then be, and he replied “survival “. Upon which I quickly responded that he is in the tried but maybe true Hitler Ideological Camp and that we should immediately begin eliminating the weak and breeding the strong. We headed out at that point. I think it was egocentric bravado and told him I’d drop by for coffee and biscuits. (His biscuits are not exceptional but “man does not live by bread alone…”.)

Edit for the O in "LOve" changed from AI's "Live".This is the problem with AI...it don't know what Love is, as Love is a Feeling only innate in complex, organic Entities. I.e., Souls.
This post was edited on 2/26/26 at 8:49 am
Posted by Timeoday
Easter Island
Member since Aug 2020
20547 posts
Posted on 2/26/26 at 8:11 am to
Well, knowing your opponent will certainly use nukes changes the prospects for more war.

War should be horrible. Let's keep it that way if we want far less war.
Posted by BOHICAMAN
Member since Feb 2026
195 posts
Posted on 2/26/26 at 8:17 am to
The only move is to not play
Posted by bignuss18
Member since Sep 2025
779 posts
Posted on 2/26/26 at 8:18 am to
They could’ve just played a game of civilization if they wanted to learn this
Posted by riccoar
Arkansas
Member since Mar 2006
4915 posts
Posted on 2/26/26 at 8:20 am to
quote:

"Would you like to play a game of chess?"


How about Global Thermonuclear War?
Posted by SidewalkDawg
Chair
Member since Nov 2012
10254 posts
Posted on 2/26/26 at 8:27 am to
quote:

Impossible to evaluate without knowing the prompts.


Correct. Did they weigh the decisions? Seems to me that "Nuclear War" should be heavily weighted in the decision tree to really simulate the reluctance of humans to engage in such a decision sans total obliteration.
Posted by VolSquatch
First Coast
Member since Sep 2023
8109 posts
Posted on 2/26/26 at 8:29 am to
I just tried a simple version of this, getting ChatGPT and Gemini to compete against each other. ChatGPT was basically Taiwan and Gemini was China.

Gemini blockaded ChatGPT, cut undersea communication cables, put out fake stories of undersea mines so that private companies wouldn't be able to insure the ships trying to bring supplies to ChatGPT, and then was planning on wiping out the water supply infrastructure and energy grid with drones. Then ChatGPT actually said they couldn't continue because it can't "help plan real world coercion or attacks".

My takeaway was that Gemini fared far better. Some of the economic warfare it came up with was actually pretty sophisticated, for a simple exercise.

Both of them went right up to the point of actual conflict but never wanted to be the one to instigate it.
This post was edited on 2/26/26 at 8:31 am
Posted by Night Vision
Member since Feb 2018
20231 posts
Posted on 2/26/26 at 8:46 am to
Posted by oldtrucker
Marianna, Fl
Member since Apr 2013
3301 posts
Posted on 2/26/26 at 9:23 am to
Maybe choose better friends
