Licensing is neither feasible nor effective for addressing AI risks
Non-proliferation only benefits incumbents
Many people and organizations have argued that one way to make AI safer is through non-proliferation, much as we restrict nuclear weapons and human cloning.
This would mean that only certain licensed companies and organizations could build state-of-the-art AI models. The argument is that this would allow the government to contain the spread of harmful AI models, maintain oversight over what AI tools are built and how they are deployed, and thereby guide the development of AI in a socially beneficial direction.
Similar arguments underlie other attempts to forestall AI development, such as the Future of Life Institute's letter calling for a six-month pause on training models more powerful than GPT-4.
Licensing is infeasible to enforce
If regulators commit to non-proliferation, how would they enforce it? It is not enough to require that developers obtain licenses, because malicious actors can simply ignore the licensing requirements and proceed with training AI models.
One approach that has been suggested is to surveil the data centers where models are trained and hosted. This proposal relies on the fact that training AI models requires large amounts of computing resources. In theory, data centers would be required to report when a customer's usage exceeds a certain level of computing resources, which could trigger an investigation.
But as algorithmic and hardware improvements reduce costs for training models at a given capability level, this approach would require increasingly draconian surveillance measures and an unprecedented level of international cooperation to be effective. Though training state-of-the-art models like GPT-4 is expensive, the cost is rapidly dropping, both because of the decrease in hardware costs and due to improvements in the algorithms used to train AI models.
To see the long-term trends, we can look at computer vision. Between 2012 and 2019, training an image recognition classifier with the same performance became 44 times cheaper.
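As a rough sanity check on that figure, a 44-fold drop over the seven years from 2012 to 2019 implies an annualized decline of

\[ 44^{1/7} \approx 1.72, \qquad \text{i.e., a cost halving roughly every } \frac{\ln 2}{\ln 1.72} \approx 1.3 \text{ years}. \]

This is a back-of-the-envelope rate, not a forecast, but it conveys how quickly the cost of reaching a fixed capability level falls.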
As of this writing, one of the most capable open-source language models, Falcon, required fewer than 400 GPUs and only two months to train. We estimate that such a model can be trained for less than USD 1 million.[1]
Further, the technical know-how required to build large language models is already widespread, and several open-source LLMs have been developed by organizations that share their entire code and training methodology. For these reasons, non-proliferation is infeasible from an enforcement perspective.
OpenAI and others have proposed that licenses would be required only for the most powerful models, above a certain training compute threshold. Perhaps that is more feasible — but again, unless all capable actors voluntarily comply, enforcement would be possible only by giving governments extraordinary powers.
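To make the idea of a compute threshold concrete, here is a minimal sketch in Python. It uses the common approximation that training compute is about 6 × parameters × training tokens; the threshold value and the license check are hypothetical and do not correspond to any actual proposal.

```python
# Minimal sketch of a compute-based licensing threshold check.
# Training FLOPs are approximated with the common 6 * parameters * tokens
# rule of thumb; the threshold below is purely hypothetical.

HYPOTHETICAL_THRESHOLD_FLOPS = 1e25  # illustrative only, not a real regulatory figure


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute in FLOPs."""
    return 6 * n_params * n_tokens


def requires_license(n_params: float, n_tokens: float) -> bool:
    """Would this training run fall under a hypothetical licensing regime?"""
    return training_flops(n_params, n_tokens) >= HYPOTHETICAL_THRESHOLD_FLOPS


# Example: a 40-billion-parameter model trained on 1 trillion tokens
# (roughly the published scale of Falcon-40B) uses about 2.4e23 FLOPs.
print(training_flops(40e9, 1e12))    # ~2.4e23
print(requires_license(40e9, 1e12))  # False
```

The difficulty is not computing such a number but verifying its inputs: regulators would need to know how many parameters and tokens every training run used, which leads straight back to the surveillance problem described above.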
And let's be clear about what it could accomplish: it would, at most, buy a few years of time. For any given capability threshold, training costs are likely to keep dropping due to both hardware and algorithmic improvements. Once the costs are low enough, the number of actors who can fund the development of such models will be too large to police.
Licensing will increase concentration and may worsen AI risks
Despite the proliferation of AI models themselves, a small number of big tech companies stand to profit greatly from the generative AI wave: whether by integrating it into their apps (e.g., Google, Microsoft, and Apple), by selling API access (e.g., OpenAI and Anthropic), or by selling hardware (NVIDIA).
Such concentration harms competition. Committing to non-proliferation would further increase this risk, because only a handful of companies would be able to develop state-of-the-art AI.
Further, decades of experience in information security suggest that it is better to address security risks openly rather than relying on "security through obscurity." In particular, non-proliferation and the resulting concentration of power would affect five major AI risks:
Monoculture may worsen security risks. When thousands of apps are all powered by the same model (GPT-3.5 is already in this position today), security vulnerabilities in this model can be exploited across all of these different applications.
Monoculture may lead to outcome homogenization. The use of the same AI model across different applications increases homogenization, including in consequential settings such as resume screening. If a candidate applies to multiple jobs, instead of being evaluated independently by each company, they could be rejected from all of them because every company uses the same AI tool for hiring (see the simulation sketch after this list).
Defining the boundaries of acceptable speech. In some ways, generative AI apps are similar to social media platforms where people generate and consume content. If most people use models created by a small group of providers, those developers gain outsized power to define the Overton window of acceptable speech by governing what is and isn't allowed in conversations.
Influencing attitudes and opinions. If people use chatbots as conversation partners, the "opinions" expressed by the chatbots could influence people's views at massive scale. A recent study provides evidence of this effect.
Regulatory capture. The ongoing lobbying for a licensing regime can be seen as regulatory capture. If it succeeds, it would give AI companies even more power in policy debates. Instead of engaging with the arguments, they could dismiss critics as uninformed outsiders lacking expertise on AI capabilities and risks.
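To illustrate the outcome-homogenization risk with the resume-screening example, here is a toy simulation sketch. The 50 percent pass rate, the five companies, and the assumption that a shared model behaves identically everywhere are invented purely for illustration.

```python
import random

# Toy simulation of outcome homogenization in hiring.
# Scenario A: five companies screen a borderline candidate independently.
# Scenario B: all five companies rely on the same model, so a single
# unfavorable score rejects the candidate everywhere.

random.seed(0)
N_CANDIDATES = 100_000
N_COMPANIES = 5
PASS_RATE = 0.5  # probability a given screen passes a borderline candidate

rejected_everywhere_independent = 0
rejected_everywhere_shared = 0

for _ in range(N_CANDIDATES):
    # Independent screens: rejection everywhere requires five separate failures.
    if all(random.random() > PASS_RATE for _ in range(N_COMPANIES)):
        rejected_everywhere_independent += 1
    # Shared model: one screen decides the outcome at every company.
    if random.random() > PASS_RATE:
        rejected_everywhere_shared += 1

print(rejected_everywhere_independent / N_CANDIDATES)  # ~0.03 (i.e., 0.5**5)
print(rejected_everywhere_shared / N_CANDIDATES)       # ~0.5
```

With independent screens, a borderline candidate is rejected everywhere only about 3 percent of the time; with a single shared model, one unfavorable decision becomes a rejection at every company.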
One way to avoid concentration is the development and evaluation of state-of-the-art models by a diverse group of academics, companies, and NGOs. We believe this would be a better way to uncover and address AI risks. Of course, open-source AI presents its own risks and requires guardrails. What might those guardrails look like? We plan to share a few ideas soon.
[1] The model was trained on 384 NVIDIA A100 GPUs for 60 days. The hourly rate for one such GPU on AWS ranges from USD 1.44 to USD 4.10; bulk pricing is generally lower than on-demand hourly rates.
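The cost estimate above can be reproduced with a few lines of arithmetic, using only the GPU count, duration, and AWS rate range from this footnote:

```python
# Reproducing the back-of-the-envelope training cost estimate from the footnote.
gpus = 384                # NVIDIA A100 GPUs
hours = 60 * 24           # 60 days of training, in hours
gpu_hours = gpus * hours  # 552,960 GPU-hours in total

low_rate, high_rate = 1.44, 4.10  # USD per A100-hour on AWS (footnote's range)

print(f"${gpu_hours * low_rate:,.0f} to ${gpu_hours * high_rate:,.0f}")
# -> $796,262 to $2,267,136
```

The low end of the on-demand range is already under USD 1 million, and bulk or reserved pricing, which is generally cheaper, pulls the figure down further.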
The most important part of this narrative for me is how licensing and regulation will cripple small businesses and empower big tech companies. It's already so hard and expensive to start a successful software business. As a working class, we're already at a severe disadvantage because so much of our digital lives and the internet is controlled by a few big tech companies and telecom providers. We need more startups and open-source competition in this space to help distribute power and wealth.
The important part of this for consumers to understand is that developing and distributing AI systems and ML models is going to keep getting (exponentially) easier and cheaper. It's becoming just as easy as building a website or mobile app, and think about how many of those exist.
Similarly, AI and ML systems can be built and deployed on any computer, and most people have several in their home. Computing is ubiquitous, and so these systems will be too. This also means anyone can build these systems and hide their development as easily as anything else you can hide on a computer.
Licensing or any other solution focused on the technology will not work, simply because the problem is not the technology but the economic system that requires it to be exploited. Witness the spectacle of the brightest minds researching AI simultaneously warning of the potential for extinction and begging the government to slow them down or even stop them. If AI is so dangerous, why not just stop? Because our economic system requires that they continue, regardless of the consequences. Until that problem is solved, nothing else will work.