Artificial intelligence is not actually intelligent. Policy that treats AI as though it is intelligent could have far-reaching ramifications and could expose individual citizens and companies alike to material risks.
Assumption: Artificial intelligence has unlimited potential to execute any task that ordinarily requires human intelligence, input, oversight and judgment.
Counterpoint: The technologies currently referred to as ‘artificial intelligence’ are inherently limited in their capacity to replicate human intelligence. So far, they have demonstrated only an ability to imitate narrow facets of human intelligence in certain tasks, and they could remain ineffectual for certain applications for many years to come.
In the policy sphere, AI is widely characterized as computerized technology capable of executing tasks that ordinarily require human intelligence. This is an ideological, rather than technical, notion, and it extends the scope of AI policy beyond the technology that exists today to encompass an unlimited range of technologies that may not exist for years to come, if they ever do. In some governments it is even a matter of policy to assume that the intelligence of computerized systems will continue to grow indefinitely, to the point of achieving ‘artificial general intelligence’ that matches or exceeds human mental capacity. It is in light of this definition that AI is commonly likened to electricity: a general-purpose technology of boundless potential, which will have, to quote one strategy, ‘a transformational impact on the whole economy’. This common policy-level view of AI is technologically problematic, and it risks derailing AI policy in several key ways.
Natural limits
It is indisputable that algorithmic technologies have proven to be effective – and in some cases transformative – in certain applications such as digital marketing, social media, web search, some domains of finance and computer vision. Nevertheless, recent years have also provided ample evidence that AI still falls far short of being an artificial version of human intelligence. AI has exhibited persistent failures across a wide set of domains. In one way or another, these failures stem from the fact that these systems do not yet replicate the fundamental capacity of real human intelligence to account for ambiguity and anomalies, to understand concepts and their relation to each other, to adapt to new information, to grasp the difference between truth and untruth, and to take non-numerical factors into account in problem-solving.
AI’s most commonly cited breakthroughs in digital environments – computers that beat humans at games like chess, Go and StarCraft, for example, or that can design new proteins and medicines, or generate digital content – cannot be taken as proof that AI will succeed in the real physical world, let alone in other domains or in safety-critical settings. Elsewhere, the research community has documented endemic failures of machine-learning systems to match, once deployed in real-world conditions, the performance they exhibit in testing.
Proof of AI’s shortcomings is plentiful. Large language models, which may be capable of writing a passable high school essay or engaging a user in a coherent conversation, consistently make highly unpredictable errors despite being trained on vast volumes of human language. In transportation, despite hundreds of billions of dollars in investment, autonomous vehicles have yet to be deployed at scale, as they remain prone to failure when encountering unfamiliar conditions and situations on the road. Uber, a presumed leader in the field, divested its self-driving vehicle division in 2020. Free-ranging autonomous robotic systems with no human aboard remain confined mostly to experimental settings.
In medicine, another highly touted critical application area, high-profile AI programs have so far yielded meagre performance gains compared with traditional systems or measures, and have in some cases resulted in excess harms. Several states seized on the onset of the COVID-19 pandemic to call for faster AI adoption, arguing that the technology could be useful in medical response and epidemiological forecasting, yet studies have shown that AI experiments related to COVID-19 largely fell short of expectations.
To be sure, in any given area to which AI is being applied, it is possible to point to experimental efforts that have shown encouraging early performance. It is also quite possible that AI will someday succeed in areas where it currently struggles. But we have no hard evidence as to which applications these will be, nor can we say with certainty when these barriers to AI’s success will fall. AI progress has never tracked a linear growth curve – rather, it has moved through a series of largely unpredictable ‘AI springs’ and ‘AI winters’. If we are still in the early days of a lengthy AI boom, these breakthroughs could be right around the corner. But if we are already witnessing the waning days of this AI boom cycle, they could still be a long way off.
Proven risks, conjectured rewards
Because the predominant understanding of AI assumes that significant advances are always imminent, there is a tendency in the AI policy literature to give the technology’s expected benefits the same weight as its known shortcomings and risks. For example, one strategy notes that ‘algorithmic risk assessment tools… have the potential to improve consistency and predictability’ in pre-trial detention, sentencing and bail decisions, while noting that ‘the use of AI within the justice sector also has considerable implications for ethics, human rights and the rule of law’. At face value, this would appear to be a balanced characterization. In fact, the strategy misrepresents the evidence: there is ample proof of the technology’s harms and only uneven evidence of its benefits. Achieving a true balance between the actual benefits and risks of such algorithmic tools would likely depend on technological breakthroughs that remain speculative.
The risks and benefits of AI also tend to operate on vastly different scales. An improvement in efficiency from using a fraud detection algorithm is in no way comparable to the harm that results when that system leads to a wrongful accusation. The gains that an HR department might see from using a hiring algorithm are moot if that algorithm systematically (and, from the user’s perspective, invisibly) privileges applicants from a particular demographic group. Treating risk as something that can simply be weighed against benefit is therefore misleading. At a practical level, it potentially undermines the capacity of the state mechanisms that sponsor or regulate AI development to differentiate the AI tools that should be pursued from those that absolutely should not.
In the absence of such discernment, the harms of misapplied AI are rarely distributed evenly. Because AI limitations often manifest themselves in ways that reflect bias, the use of AI in tasks for which it is technically ill-suited or in contexts lacking sufficient regulatory guardrails poses an elevated risk of harm to members of vulnerable or historically disadvantaged groups. In addition, yet-to-be-proven AI technologies are more likely to be used against populations with less policy leverage to advocate for protections or moratoriums (consider, for example, early experiments involving AI for welfare fraud detection, predictive policing and surveillance in public housing).
To engender the safest possible regulations, generalizations about the technology’s readiness would do better to weigh AI’s failures more heavily than its successes. This reframing is especially helpful when those failures follow consistent patterns, which might suggest that the technology’s ‘growing pains’ are in fact inherent limitations. Entities evaluating a proposed AI role might use the following rule of thumb to judge whether it is worth exploring further: if, to the best of one’s knowledge, the application can be accomplished without increasing harms, it has the potential to yield gains; if, however, there is evidence that the technology will increase risks to some groups, further study is necessary before any actual use or real-world testing of the technology.
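Expressed as a minimal, purely illustrative sketch – the names (`ProposedApplication`, `screen`) and the welfare-fraud example below are hypothetical and drawn from no cited strategy – the rule of thumb amounts to a simple screening function:

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    """Possible outcomes of the screening heuristic."""
    WORTH_EXPLORING = "may yield gains; exploration can proceed"
    FURTHER_STUDY = "further study needed before real-world use or testing"


@dataclass
class ProposedApplication:
    """Hypothetical summary of a proposed AI role."""
    name: str
    # Evidence, to the best of current knowledge, that deployment would
    # increase risks or harms to some group.
    evidence_of_increased_harm: bool
    # Groups most likely to bear those risks, if any have been identified.
    groups_at_risk: list = field(default_factory=list)


def screen(proposal: ProposedApplication) -> Verdict:
    """Weigh documented evidence of harm more heavily than conjectured gains."""
    if proposal.evidence_of_increased_harm:
        return Verdict.FURTHER_STUDY
    return Verdict.WORTH_EXPLORING


# Invented example: an application flagged for further study because there
# is evidence it would increase risks to a specific group.
proposal = ProposedApplication(
    name="automated welfare-fraud screening",
    evidence_of_increased_harm=True,
    groups_at_risk=["welfare recipients"],
)
print(f"{proposal.name}: {screen(proposal).value}")
```

The point of the sketch is only that any application with documented evidence of increased harm defaults to further study, rather than to deployment or real-world testing.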
In this spirit, some AI policies have proposed or enacted moratoriums on particular applications of AI, such as live biometric surveillance, given that (according to these policies) no conjectured benefits yielded by such applications could outweigh their amply demonstrated risks for the foreseeable future. At a minimum, policies could be transparent about their acceptance of a certain degree of risk in the pursuit of a particular application. In such cases, they should be specific about who will most likely be affected by these risks, and then seek the buy-in of those communities before proceeding.
Ethical unintelligence
The ‘intelligence’ framing of AI muddles the discourse on how to prevent and respond to AI harms. To assume that a computer could replicate human intelligence is also to assume that the system, like a human, could incorporate ethical mores into the parameters that guide its actions. It is certainly true that in some cases computation has fruitfully displaced human judgment (consider, for example, autopilots on aeroplanes). However, those systems generally mimic only a very narrow band of human intelligence related to information retrieval, rule-based processing and pattern recognition – none of which maps onto ethical reasoning. Meanwhile, when a sophisticated AI fails, it often does so in ways that no human possessing a modicum of genuine intelligence ever would. This makes it hard to rate and account for AI’s reliability using the metrics and tools that existing regulatory frameworks apply to humans and to predictable, non-probabilistic systems such as mechanical components. A more grounded strategy for implementing ‘ethical AI’ would acknowledge that AI systems themselves can never, for instance, be ‘held accountable’ or be ‘trustworthy’ in the human-ethics sense.
The predominant understanding of AI as ‘intelligent’ also assumes that present-day challenges for ethical AI, such as a lack of system predictability and transparency, can be engineered away by increasing the intelligence of these systems. In truth, increasing a system’s ‘intelligence’ (which can also increase its autonomy and open-endedness) may be more likely to compound ethical challenges. Large language models demonstrate this phenomenon: the popular AI-based chatbot ChatGPT, for example, has raised myriad questions about intellectual property and fair use, how to algorithmically balance safety against free speech, and where to assign responsibility for harms. An alternative strategy for ethical AI might therefore focus on non-technical (or even non-AI-related) measures to combat harms: for example, anti-corruption measures to improve institutional accountability or racial justice campaigns to improve societal equity (see Chapter 5).
Dollars and sense
The assumption that AI will be as transformative and essential to every aspect of modern life as electricity can stifle any measure that would place limits on AI use or capacity, even if those measures have other direct benefits. For example, a number of national AI strategies are explicit that their intent is not so much to explore the possibilities and limitations of AI as it is, as one strategy puts it, to ‘drive AI adoption in the private and public sectors’ and, to quote another, to ‘support the diffusion of AI across the whole economy’. One strategy even proposes ‘adding the requirement for AI-based solutions in the specifications of other strategic investments … financed from public funds’. That is, the state’s critical decisions on strategic investments will be informed, in part, by whether the proposed investment involves ‘AI’. The implication is that if the state is considering two proposed investments, one of which involves AI and one of which does not, the former will receive priority.
Putting aside the ethical perils of this mindset, it could also be economically risky. Citing forecasts that estimate trillions of dollars of positive economic impact from AI, governments have poured ever larger sums into AI research and development in recent years. Yet studies have shown that while adoption of AI across industries is growing, a significant proportion of enterprises that have embarked on AI projects have yet to see any substantial gains from the technology. Recent polling also indicates that a large share of the machine-learning models that are developed never reach deployment. To be sure, any government effort to make technological breakthroughs requires some tolerance for financial risk. But as the current AI boom cycle enters its second decade without having achieved the scale of systemic adoption that was once expected, it is worth asking whether these financial risks may be larger than previously assumed, and whether they will be fairly distributed across sectors and groups.
A safer investment policy might be one that accounts for the reality that AI development could continue to progress along a boom-and-bust cycle. However, national AI strategies generally do not include measures to reduce the state’s exposure to the financial losses that would be incurred if AI continues to fall short in real-world use. This could be a particular concern for low-income states, which might over-expose themselves to risky AI while under-investing in other technologies or areas that may be essential for prosperity and sustainable growth, such as agricultural technologies, clean energy, sanitation and public health. (If anything, a state may only be able to truly enjoy the benefits of AI once it has cleared other development milestones. For example, an AI system for better healthcare is unlikely to be of universal benefit to a population in which births go unregistered or in which non-male individuals are afforded fewer rights to government services.)
Seeking a new term
The aspirational view of AI outlined in this section is so dominant in the discourse that even when policy language does not provide a concrete definition, the mere use of the term ‘artificial intelligence’ connotes this view of what the technology is (or is imagined to be). As a result, there have been calls in recent years to replace the term ‘artificial intelligence’ with specific terminology that foregrounds the technical realities of the systems in question and the parties responsible for their use.
For example, the Center on Privacy and Technology at Georgetown Law in Washington, DC, has announced that it will cease to use ‘AI’, ‘artificial intelligence’ and ‘machine learning’ in its work and will instead use specific terminology. Rather than stating that ‘employers are using AI to analyze workers’ emotions’, the Center’s staff will now use language such as: ‘employers are using software advertised as having the ability to label workers’ emotions based on images of them from photographs and video. We don’t know how the labeling process works because the companies that sell these products claim that information as a trade secret.’
Another potential approach is to develop follow-up questions to accompany mention of AI in the policy sphere. These could include questions like ‘what type of AI?’; ‘whose AI?’; ‘who built this AI?’; ‘was this AI built for this specific purpose?’; and ‘is this AI deployed, under development, or at the concept stage?’.