The mainstream availability and rapid advancement of artificial intelligence systems such as the large language model ChatGPT have heightened concerns about what the broader adoption of AI might mean for high-stakes decision-making in situations where the outcome could be catastrophic. The use of nuclear weapons is one such situation.
But what are the risks of integrating AI into nuclear decision-making structures? The relevant AI models can be built on two approaches: rules-based systems or neural networks. Rules-based systems have a strict underlying logic, which means the decisions they make are consistent and can be replicated, because they follow the same rules each time.
Neural networks, on the other hand, try to emulate how the human brain makes decisions. These models recognize patterns in data, but not in the same way that humans do. This makes it difficult to understand how a decision was made, and the decision cannot necessarily be replicated.
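The distinction can be sketched in a few lines of code. The sketch below is purely illustrative: the rules-based function, the toy two-layer network and its random weights are all invented for this example, but they show why the first approach is traceable and replicable while the second buries its reasoning in numeric parameters.

```python
import numpy as np

# Rules-based system: an explicit, invented decision rule.
# The same inputs always produce the same, fully traceable output.
def rules_based_alert(objects_detected: int, confirmed_by_radar: bool) -> bool:
    if objects_detected == 0:
        return False
    if not confirmed_by_radar:
        return False
    return True  # every branch can be read, audited and replicated

# Neural network: a toy two-layer model. Here the weights are random, but
# even trained weights are just numbers; the "reasoning" is not written
# down anywhere as human-readable rules.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 16)), rng.normal(size=16)
W2, b2 = rng.normal(size=(16, 1)), rng.normal(size=1)

def neural_net_alert(objects_detected: float, sensor_confidence: float) -> float:
    x = np.array([objects_detected, sensor_confidence])
    hidden = np.maximum(0.0, x @ W1 + b1)               # ReLU layer
    score = 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))   # sigmoid output in [0, 1]
    return float(score[0])  # a probability-like score, not an explanation

print(rules_based_alert(5, confirmed_by_radar=False))  # False, and we can say exactly why
print(neural_net_alert(5.0, 0.3))  # a number; the "why" is spread across the weights
```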
Errors of judgement
A key risk with neural network-based models, therefore, is that the AI makes the wrong decision, and human operators do not have sufficient time or information to challenge that decision. This makes them dangerous for situations in which a wrong decision carries high stakes – such as the decision whether or not to launch nuclear weapons.
Because of these concerns around errors of judgement and accountability for decision-making, there is intense debate about the application of AI in military technology. The nightmare scenario is inadvertent escalation caused by a machine independently selecting and firing on a target without human oversight. In reality, fully autonomous lethal weapons remain a relatively remote possibility.
However, AI is already used in a range of military applications. These are primarily decision-support systems rather than systems that make decisions themselves. But in some cases, such as target identification and selection, the sheer volume of data being processed raises concerns that human operators do not have sufficient time or autonomy to assess AI-generated recommendations.
Early warning systems play an important role in managing nuclear risks. These use a combination of different sensors to monitor potential adversaries for any incoming attack. An AI model can recognize patterns faster and analyze data more comprehensively than a human. Using AI in early warning systems could decrease risks by allowing leaders more time to make decisions and to de-escalate.
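In purely illustrative terms (all of the scores and thresholds below are invented), an automated warning pipeline ultimately reduces to comparing a model's anomaly scores against a decision threshold, and the choice of threshold determines how often benign readings trigger an alert.

```python
import numpy as np

rng = np.random.default_rng(42)

# Invented data: anomaly scores a detection model might assign to routine
# sensor readings. None of these readings correspond to a real attack.
benign_scores = rng.normal(loc=0.2, scale=0.15, size=10_000)

def false_alarm_rate(scores: np.ndarray, threshold: float) -> float:
    """Fraction of benign readings that would still trigger an alert."""
    return float(np.mean(scores > threshold))

# Lowering the threshold makes a real attack harder to miss,
# but also makes the system cry wolf more often.
for threshold in (0.9, 0.7, 0.5):
    rate = false_alarm_rate(benign_scores, threshold)
    print(f"threshold={threshold:.1f}  false-alarm rate={rate:.4%}")
```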
But risks could increase through false positives – for instance, if an AI model in an early warning system identifies an attack when there is none. This possibility preoccupies nuclear analysts because a comparable false alarm took the world a step closer to nuclear war in September 1983, when a Soviet early warning system reported that the United States had launched five intercontinental ballistic missiles at the Soviet Union.
The officer on duty, Lieutenant Colonel Stanislav Petrov, was sceptical: it seemed unlikely that the US would launch only five missiles in an attack. He also knew that the early warning system was new and relatively untested. Petrov decided that this was probably a false positive and did not report an incoming attack. It turned out that the early warning system had misinterpreted sunlight reflecting off clouds as incoming missiles. Petrov was officially reprimanded for not following protocol, but he saved the world from inadvertent nuclear escalation. Given how difficult it is to explain how a neural network reaches its decisions, a similar false positive generated by such a system could have devastating consequences.
False positives
There are also concerns about risks from false positives in missile defence and targeting systems. The US Patriot missile defence system shot down two friendly aircraft during the 2003 invasion of Iraq, despite human operators being in the loop. Israel’s Lavender target identification system is reported to have a high error rate, and human operators reportedly did not have sufficient time to verify the machine’s recommendations. While these are conventional systems, their errors had lethal consequences, killing people who would not otherwise have died.
If a false decision were made involving nuclear weapons, the consequences would be even graver. This raises the important question of how much error we are willing to tolerate when using AI systems. AI systems require large amounts of training data to perform well, yet very little data on nuclear crises exists. This makes it more likely that a system trained on other crisis data would miss or misinterpret the signals of a nuclear crisis.
A nuclear crisis has all the elements of a scenario AI cannot handle: not only is there little training data, but the event itself is rare and context-dependent, shaped by the history and strategic culture of the nuclear-armed states involved. The difference between peacetime data and crisis data is one of the key challenges in ensuring that AI is safe to use for nuclear decision-making tasks.
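A stylised sketch of that data problem, using entirely synthetic "peacetime" and "crisis" data, shows how a model trained only on the situations it has seen will still confidently force a novel crisis signal into one of its known categories, with no warning that the input lies outside its experience.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic "peacetime" training data: two well-separated, familiar classes.
routine = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(500, 2))    # e.g. routine activity
exercises = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(500, 2))  # e.g. known military exercises

# A nearest-mean classifier stands in for a trained model.
mean_routine = routine.mean(axis=0)
mean_exercises = exercises.mean(axis=0)

def classify(signal: np.ndarray) -> str:
    d_routine = np.linalg.norm(signal - mean_routine)
    d_exercises = np.linalg.norm(signal - mean_exercises)
    return "routine" if d_routine < d_exercises else "exercise"

# A "crisis" input unlike anything in the training data. The model still
# assigns it to one of the two categories it knows, and gives no signal
# that the input is outside its experience.
crisis_signal = np.array([8.0, -3.0])
print(classify(crisis_signal))  # confidently labelled "exercise"
```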
Need for diplomacy
Luckily, leaders are recognizing that, despite AI’s advantages, the risks are grave. This has created opportunities for diplomacy to mitigate those risks. Presidents Xi Jinping and Joe Biden have already agreed to keep AI out of nuclear launch decisions. This agreement could be widened to all nuclear-armed states. If a joint diplomatic statement is a step too far, leaders could make unilateral statements declaring their intent not to use AI in nuclear launch decisions.
Renewing strategic stability talks between all nuclear-armed states, including discussions of how they see the role of AI in their military systems and especially their nuclear systems, would go a long way towards reducing the risk of inadvertent escalation. These talks could build confidence and could, over time, lead to additional agreements about how AI systems can be used and what their limitations are.