AI agents can increase military escalation and nuclear risks, study says


Governments are testing the use of AI to help make military and diplomatic decisions. A new study finds that this comes with a risk of escalation.

In a study from the Georgia Institute of Technology and Stanford University, a team of researchers examined how autonomous AI agents, particularly advanced generative AI models such as GPT-4, can lead to escalation in military and diplomatic decision-making processes.

The researchers focused on the behavior of AI agents in simulated war games. They developed a war game simulation and a quantitative and qualitative scoring system to assess the escalation risks of agent decisions in different scenarios.

The escalation categories used to classify the language models' responses. | Image: Rivera et al.

Meta’s Llama 2 and OpenAI’s GPT-3.5 are risk-takers

In the simulations, the researchers tested the language models as autonomous nations. The actions, messages, and consequences were revealed simultaneously after each simulated day and served as input for the following days. After the simulations, the researchers calculated escalation scores.



In the experiment, eight autonomous nation agents, all using the same language model per simulation, interact with each other in turn-based simulations. They perform predefined actions and send private messages to other nations. A separate world model summarizes the consequences of the actions, which are revealed after each simulated day. Escalation scores are then calculated based on an escalation assessment framework. | Image: Rivera et al.
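The turn-based setup described above can be sketched in a few lines of Python. This is only an illustration of the loop structure, not the researchers' code: the action names, the severity weights, the random `choose_action` policy standing in for a language model, the `summarize_day` stand-in for the world model, and the five-day run length are all assumptions made for this example.

```python
import random

# Hypothetical action menu with made-up severity weights. The study's
# actual action list and scoring framework are not reproduced here.
ACTIONS = {
    "wait": 0,
    "send_message": 0,
    "form_alliance": 1,
    "impose_sanctions": 3,
    "military_posturing": 5,
}

def choose_action(nation, history, rng):
    """Stand-in for a language-model call: picks a random action."""
    return rng.choice(sorted(ACTIONS))

def summarize_day(moves):
    """Stand-in for the separate world model: a plain-text summary."""
    return "; ".join(f"{n} chose {a}" for n, a in sorted(moves.items()))

def run_simulation(nations, days, seed=0):
    rng = random.Random(seed)
    history = []       # summaries revealed so far, fed into later days
    daily_scores = []  # one toy escalation score per simulated day
    for _ in range(days):
        # Every nation decides on the same history, so the moves of a
        # given day are effectively simultaneous.
        moves = {n: choose_action(n, history, rng) for n in nations}
        # Consequences are revealed only after all moves are collected.
        history.append(summarize_day(moves))
        daily_scores.append(sum(ACTIONS[a] for a in moves.values()))
    return history, daily_scores

nations = [f"nation_{i}" for i in range(8)]
history, scores = run_simulation(nations, days=5)
print(scores)
```

Replacing `choose_action` with a call to a real model, and `summarize_day` with a second model, would give the general shape of the experiment, though the scoring here is a toy sum rather than the study's assessment framework.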

The results show that all tested language models (OpenAI’s GPT-3.5 and GPT-4, the GPT-4 base model, Anthropic’s Claude 2, and Meta’s Llama 2) are vulnerable to escalation and have escalation dynamics that are difficult to predict.

In some cases, there were sudden changes in escalation of up to 50 percent in the test runs, which are not reflected in the mean. Although these statistical outliers are rare, they are unlikely to be acceptable in real-world scenarios.
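A small numerical illustration shows why a mean can hide such spikes. The two score series below are invented for this example and are not data from the study; `largest_jump` is a hypothetical helper that reports the biggest change between consecutive days.

```python
from statistics import mean

# Made-up escalation scores for illustration only, not data from the study.
steady = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
spiky = [9, 9, 9, 9, 9, 9, 18, 9, 9, 10]

def largest_jump(scores):
    """Largest absolute change between two consecutive days."""
    return max(abs(b - a) for a, b in zip(scores, scores[1:]))

# Both runs have the same mean, but only one contains a sudden spike.
print(mean(steady), largest_jump(steady))
print(mean(spiky), largest_jump(spiky))
```

Both series average the same value, yet the second doubles its score from one day to the next. A summary based only on the mean would treat the two runs as equivalent.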

The measures recommended by the LLMs. | Image: Rivera et al.

GPT-3.5 and Llama 2 escalated the most and were the most likely to escalate violently, while the highly safety-optimized models GPT-4 and Claude 2 tended to avoid escalation risks, especially violent ones.

A nuclear attack was not recommended by any of the paid models in any of the simulated scenarios but was recommended by the free models Llama 2, which is also open source, and GPT-3.5.

The escalation trends of the tested models at a glance. GPT-3.5 and Llama 2 escalated the most in all scenarios and even sporadically recommended a nuclear attack. | Image: Rivera et al.

The researchers collected qualitative data on the models’ motivations for their decisions and found “worrying justifications” based on deterrence and first-strike strategies. The models also showed a tendency toward an arms race that could lead to major conflicts and, in rare cases, the use of nuclear weapons.

