Are open foundation models actually more risky than closed ones?
A policy brief on open foundation models
Some of the most pressing questions in artificial intelligence concern the future of open foundation models (FMs). Do these models pose risks so large that we must attempt to stop their proliferation? Or are the risks overstated and the benefits underemphasized?
Earlier this week, in collaboration with Stanford HAI, CRFM, and RegLab, we released a policy brief addressing these questions. The brief is based on lessons from a workshop we organized this September and our work since. It outlines the current evidence on the risks of open FMs and offers recommendations for policymakers on how to reason about them.
In the brief, we highlight the potential of open FMs in aiding the distribution of power and increasing innovation and transparency. We also highlight that the evidence for several of the purported risks of open FMs, such as biosecurity and cybersecurity risks, is overstated.
At the same time, open FMs have already led to harm in other domains. Notably, these models have been used to create vast amounts of non-consensual intimate imagery and child sexual abuse material.
We outline several considerations for informed policymaking, including the fact that policies requiring content provenance and placing liability for downstream harms onto open model developers would lead to a de facto ban on open FMs.
We also point out that these harms can be addressed at points downstream of the model itself, such as the platforms used to share AI-generated nonconsensual pornography. For example, CivitAI allowed users to post bounties for nonconsensual pornography about real people, with rewards for the developers of the best model. Such choke points are likely to be a more effective target for intervention.
One reason for the current focus on open FMs is the recent White House executive order (EO). Since the relative risk of open and closed FMs is an area of ongoing debate, the EO didn’t take a position on it; the White House instead directed the National Telecommunications and Information Administration (NTIA) to launch a public consultation on this question.
The NTIA kicked off this consultation earlier this week in collaboration with the Center for Democracy and Technology, at an event where one of us spoke.
While policies should be guided by empirical evidence, this doesn't mean we shouldn’t think about the risks that might arise in the future. In fact, we think investing in early warning indicators of risks of FMs (including open FMs) is important. But in the absence of such evidence, policymakers should be cautious about developing policies that curb the benefits of open FMs while doing nothing to reduce their harms.
Towards a better understanding of the risks of open models, we are currently working on a more in-depth paper analyzing the benefits and risks of open FMs with a broad group of experts. We hope that our policy brief, as well as the upcoming paper, will be useful in charting the path of policies on regulating FMs.
Having worked with licensing and commercial arts for the better part of my career, then digital innovation in the latter half, I see data provenance and IP as the major issue. LLMs and diffusion models are both stores of expressive content that, by design, have no capability for compliance with privacy or property law (no machine unlearning) and are intended, again by design, as direct market replacements. They are distributed with terms of use for supposed research use that are predictably and demonstrably broken instantly upon release. Or worse, with permissive commercial use despite not acquiring any license for the underlying property. All with proven, massive and ongoing market harm to the rights holders of the underlying works.
This fails the three-step test and the four US fair use factors by default, and is patently absurd even at face value: how can property, displayed for human consumption, in any way, shape, or form be seen as fair game for unlicensed for-profit use?
Moral rights to informed consent and attribution were violated. Exclusive commercial exploitation rights were violated. These are foundational rights across 193 countries.
It’s the intellectual property rights heist of the century, and every professional rights holder organization protests.
It is interesting that both the Biden Administration's EO and the EU's AI Act focus on computing power as a risk threshold. For now, that mostly points to closed foundation models being "riskier," but with improvements to training efficiency and parameter size we're likely going to see very advanced open models fall below those thresholds.