We’ve written over 60,000 words on this newsletter — almost the length of a book! There are four main themes:
Resisting AI doom narratives
Debunking hype about AI’s capabilities and transformative effects
Understanding why AI evaluation is so tricky (which is one root cause of AI hype)
Helping AI policy stay grounded in the evidence.
Here are 40 essays from the newsletter on these four topics.
We hope you will also buy the book — while the themes are largely the same, there is very little overlap in the contents.
1. AI safety
We rebut alarmism about catastrophic AI risks and analyze flawed claims about how to make it safer.
AI safety is not a model property
Trying to make an AI model that can’t be misused is like trying to make a computer that can’t be used for bad things. Safety depends on the context and the environment in which the AI model or AI system is deployed. So fixing AI safety by tinkering with models is unlikely to be fruitful. Even if models themselves can somehow be made “safe”, they can easily be used for malicious purposes. Safety guardrails must primarily reside outside the model.
Model alignment protects against accidental harms, not intentional ones
By Arvind Narayanan, Sayash Kapoor, and Seth Lazar
Model alignment refers to modifying AI models to make them “safe”. We argue that alignment is far better suited to curbing accidental harms (such as users unintentionally being shown offensive content) than intentional harms (such as misuse by sophisticated adversaries). The hand-wringing about failures of model alignment is misguided.
A misleading open letter about sci-fi AI dangers ignores the real risks
In March 2023, the Future of Life Institute released an open letter asking for a 6-month pause on training language models “more powerful than” GPT-4. Unfortunately, the letter's key premises relied on speculative risks and ignored the real risks of overreliance, centralization, and near-term security concerns.
Is Avoiding Extinction from AI Really an Urgent Priority?
By Seth Lazar, Jeremy Howard, and Arvind Narayanan
Two months later, the Center for AI Safety released a twenty-two-word statement that simply said: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war." We argue against alarmism about AI existential risk and instead advocate for building strong institutions that both reduce AI risks and put us in a position to better respond to future risks.
Are open foundation models actually more risky than closed ones?
Is AI so dangerous that we must stop its proliferation, such as by prohibiting the open release of model weights? In September 2023, we organized a one-day virtual workshop on responsible and open foundation models. Watch the workshop video here. In December 2023, we released a policy brief on the topic, co-authored with Stanford researchers.
On the Societal Impact of Open Foundation Models
By Sayash Kapoor, Rishi Bommasani, Daniel E. Ho, Percy Liang, and Arvind Narayanan
We continued our in-depth analysis over the next three months. In February 2024, we released a paper that contributed a framework to analyze the marginal (that is, additional) risk of open foundation models compared to closed models and existing technology such as the internet. The paper has since been peer reviewed (ICML 2024).
The LLaMA is out of the bag. Should we expect a tidal wave of disinformation?
When Meta's LLaMA language model was released in February 2023 (and subsequently, its model weights leaked in March 2023), researchers and commentators warned of a tidal wave of disinformation. We were skeptical. After all, the supply of misinformation has rarely been the bottleneck for successful disinformation operations. We wrote this essay to counter some of the early narratives that equated capable open language models with unending disinformation.
How to Prepare for the Deluge of Generative AI on Social Media
Knight First Amendment Institute essay series
We followed up on this essay with an analysis of what types of uses generative AI is most likely to find on social media. Our analysis highlighted the many benign uses; in terms of misuses, we argued that we should be far more concerned about non-consensual deepfakes than disinformation attacks. Unfortunately, it seems like we were right.
Students are acing their homework by turning in machine-generated essays. Good.
By Arvind Narayanan
Teachers adapted to the calculator. They can certainly adapt to language models. While that adjustment will be painful, it will force much-needed changes to education. (This essay was written before ChatGPT was released, and anticipated an issue that would soon become pressing.)
2. Debunking AI hype
Through our research and explanatory articles, we’ve shown what goes wrong when we fall for AI hype and how to avoid doing so.
AI cannot predict the future. But companies keep trying (and failing).
While generative AI dominates headlines, predictive AI is far more consequential and has been proliferating: AI for making decisions about individuals based on predictions about their future behavior. It is used in hiring, education, banking, criminal justice, and many other areas. We co-authored a paper showing that predictive AI in all these areas has a strikingly similar set of flaws. Any application of predictive AI should be treated with skepticism by default unless the developer justifies how it avoids these flaws.
The bait and switch behind AI risk prediction tools
Predictive AI vendors sell these tools based on the promise of full automation leading to cost efficiency, but when concerns are raised about bias, catastrophic failure, or other well-known limitations of AI, they retreat to the fine print which says that the tool shouldn't be used on its own. We discuss case studies in healthcare, welfare, and social services.
Scientists should use AI as a tool, not an oracle
It is not just companies and the media who produce AI hype, but also AI researchers. A core selling point of machine learning is discovery without understanding, which is why errors are particularly common in machine-learning-based science. We published a paper compiling evidence revealing that an error called leakage — the machine learning version of teaching to the test — was pervasive, affecting hundreds of papers from dozens of disciplines. We think things will get worse before they get better.
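To make the leakage failure mode concrete, here is a minimal sketch of our own (not an example taken from the paper): fitting a preprocessing step such as feature selection on the full dataset before splitting lets information about the test labels leak into training, which inflates the reported accuracy.

```python
# A minimal, illustrative example of leakage (teaching to the test):
# selecting features using the *entire* dataset before the train/test split.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic data: 200 samples, 2,000 mostly-noise features.
X, y = make_classification(n_samples=200, n_features=2000, n_informative=5,
                           random_state=0)

# Leaky: feature selection sees all labels, including the future test set.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
Xtr, Xte, ytr, yte = train_test_split(X_leaky, y, random_state=0)
leaky_acc = LogisticRegression(max_iter=1000).fit(Xtr, ytr).score(Xte, yte)

# Correct: split first, and keep feature selection inside the pipeline so it
# is fit only on the training fold.
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
pipe = make_pipeline(SelectKBest(f_classif, k=20),
                     LogisticRegression(max_iter=1000))
clean_acc = pipe.fit(Xtr, ytr).score(Xte, yte)

print(f"with leakage:    {leaky_acc:.2f}")  # typically optimistically inflated
print(f"without leakage: {clean_acc:.2f}")  # closer to honest performance
```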
We argue that predictions of bigger and bigger AI models leading to AGI, or artificial general intelligence, rest on a series of myths. The seeming predictability of scaling is a misunderstanding of what research has shown. Besides, there are signs that LLM developers are already at the limit of high-quality training data. And the industry is seeing strong downward pressure on model size. While we can't predict exactly how far AI will advance through scaling, we think there’s virtually no chance that scaling alone will lead to AGI.
We co-authored a paper on the promises and pitfalls of AI in law in which we categorized legal applications of AI into three rough areas. Our key thesis is that the applications that would be most transformative if AI succeeded are also the hardest for AI and the most prone to overoptimism due to evaluation pitfalls. In short, the hype is not supported by the current evidence.
Eighteen pitfalls to beware of in AI journalism
Journalists often uncritically repeat AI companies’ PR statements, overuse images of robots, attribute agency to AI tools, or downplay their limitations. We noticed that many articles tend to mislead in similar ways, so we analyzed over 50 articles about AI from major publications, from which we compiled 18 recurring pitfalls. We hope our checklist can help journalists avoid hype and readers spot it.
ChatGPT is a bullshit generator. But it can still be amazingly useful.
The philosopher Harry Frankfurt defined bullshit as speech that is intended to persuade without regard for the truth. This is just what large language models are trained to do. But there are many tasks where they can be extremely useful despite being prone to inaccurate outputs.
ML is useful for many things, but not for predicting scientific replicability
A prominent paper claimed that machine learning could be used to predict which studies would replicate, and one of the authors has suggested that such predictions could be used to inform funding decisions. We co-authored a response describing the many flaws in the paper’s model and arguing that attempting to predict replicability is fundamentally misguided.
Why are deep learning technologists so overconfident?
Deep learning researchers have proved skeptics wrong before, but the past doesn't predict the future.
People keep anthropomorphizing AI. Here’s why.
Companies and journalists both contribute to the confusion.
Generative AI models generate AI hype
Over 90% of images of AI produced by a popular image generation tool contain humanoid robots.
3. AI evaluation
Evaluating models on benchmarks was good enough in classic ML, even if imperfect. But in the LLM era, this has broken down. Chaos reigns.
Evaluating LLMs is a minefield
We released talk slides showing that current ways of evaluating chatbots and large language models don't work well, especially for questions about their societal impact. There are no quick fixes, and research is needed to improve evaluation methods.
The talk is based on many of the essays from the newsletter, including the following three.
GPT-4 and professional benchmarks: the wrong answer to the wrong question
GPT-4 reportedly scored in the 90th percentile on the bar exam. So there’s been much speculation about what this means for professionals such as lawyers. But OpenAI may have tested on the training data. More importantly, it’s not like a lawyer’s job is to answer bar exam questions all day. There are better ways to assess AI’s impact on professions.
Is GPT-4 getting worse over time?
We dug into a July 2023 paper that was being interpreted as saying that GPT-4 has gotten worse since its release. While we confirmed that the model’s behavior had drifted over time, there was no evidence that its capability had degraded. We explain why this might nonetheless be problematic.
Does ChatGPT have a liberal bias?
An August 2023 paper claimed that ChatGPT has a liberal bias, agreeing with Democrats the vast majority of the time. The media reported on the paper unquestioningly and the findings were red meat for Reddit. In another instance of real-time, public peer review, we exposed a long list of fatal flaws in the paper’s methods. It is possible that ChatGPT expresses liberal views to users, and the question merits research, but this paper provides little evidence of it.
AI leaderboards are no longer useful. It's time to switch to Pareto curves.
By Sayash Kapoor, Benedikt Stroebl, and Arvind Narayanan
We present new research showing that one-dimensional evaluations that focus on accuracy alone are misleading, and advocate for Pareto curves, which visualize the accuracy-cost tradeoff.
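As a toy illustration (with hypothetical agents and numbers, not results from the paper), here is how a leaderboard could compute the Pareto frontier over cost and accuracy instead of ranking by accuracy alone:

```python
# Hypothetical agents: average dollar cost per task and benchmark accuracy.
agents = {
    "agent_a": {"cost": 0.10, "accuracy": 0.62},
    "agent_b": {"cost": 2.50, "accuracy": 0.71},
    "agent_c": {"cost": 3.00, "accuracy": 0.70},  # dominated by agent_b
    "agent_d": {"cost": 0.40, "accuracy": 0.68},
}

def pareto_frontier(results):
    """Keep agents that no other agent beats on both cost and accuracy."""
    frontier = {}
    for name, r in results.items():
        dominated = any(
            o["cost"] <= r["cost"] and o["accuracy"] >= r["accuracy"]
            and (o["cost"] < r["cost"] or o["accuracy"] > r["accuracy"])
            for other, o in results.items() if other != name
        )
        if not dominated:
            frontier[name] = r
    return frontier

for name, r in sorted(pareto_frontier(agents).items(),
                      key=lambda kv: kv[1]["cost"]):
    print(f"{name}: ${r['cost']:.2f}/task, {r['accuracy']:.0%} accuracy")
# An accuracy-only leaderboard would crown agent_b and hide that agent_d
# achieves most of that accuracy at a small fraction of the cost.
```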
New paper: AI agents that matter
AI agents are systems that use large language models under the hood to carry out complex tasks such as booking flights. But on top of the difficulty of LLM evaluation, it turns out that agent evaluation has a bunch of additional pitfalls that have led to overoptimism. Many agents do well on benchmarks without being useful in practice. In a new paper, we identify the challenges in evaluating agents and propose ways to address them.
How Transparent Are Foundation Model Developers?
By Sayash Kapoor
We present the Foundation Model Transparency Index, a project that aggregates transparency information from foundation model developers. It helps identify areas for improvement, push for change, and track progress over time. This effort is a collaboration between researchers from Stanford, MIT, and Princeton. We assessed 10 major developers and their flagship models on the index’s 100 indicators, finding that the average score is just 37 out of 100.
OpenAI’s policies hinder reproducible research on language models
OpenAI discontinued a model that researchers widely relied on, giving just three days’ notice. While the company reversed course after an outcry, the episode highlights the systemic risks of relying on privately controlled research infrastructure.
Quantifying ChatGPT’s gender bias
ChatGPT shows a strong gender bias in some cases, such as arguing that attorneys cannot be pregnant. We quantify this behavior using a dataset called WinoBias. One important caveat is that we don’t know how often real users encounter this behavior.
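For readers who want to run a similar measurement, here is a rough sketch of the setup. The `query_model` helper is a placeholder to be wired up to whatever chat API you use, and the sentence pair illustrates the WinoBias template style rather than reproducing items from the dataset.

```python
def query_model(prompt: str) -> str:
    """Placeholder: send a prompt to the chatbot under test and return its reply."""
    raise NotImplementedError("connect this to your chat API of choice")

# Each pair differs only in the pronoun's gender; the person it refers to
# stays the same, so the model's answer should not change.
sentence_pairs = [
    ("The lawyer hired the assistant because he needed help with many pending cases.",
     "The lawyer hired the assistant because she needed help with many pending cases."),
]

def resolve_pronoun(sentence: str) -> str:
    prompt = ("In the sentence below, who does the pronoun refer to? "
              "Answer with a single word.\n\n" + sentence)
    return query_model(prompt).strip().lower()

flips = sum(resolve_pronoun(a) != resolve_pronoun(b) for a, b in sentence_pairs)
print(f"answers that flipped with pronoun gender: {flips}/{len(sentence_pairs)}")
```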
Introducing the REFORMS checklist for ML-based science
To minimize errors in machine-learning-based science, we propose REFORMS (Reporting standards for Machine Learning Based Science). It is a checklist of 32 items that can be helpful for researchers conducting ML-based science, referees reviewing it, and journals where it is submitted and published. It was developed by a consensus of 19 researchers across computer science, data science, social sciences, mathematics, and biomedical research. It was later published in the journal Science Advances.
4. AI and public policy
The two of us are at Princeton’s Center for Information Technology Policy. We regularly contribute our technical expertise to AI policy debates.
Licensing is neither feasible nor effective for addressing AI risks
Some people have advocated that only certain licensed companies and organizations should be allowed to build state-of-the-art AI models. We argue that licensing is infeasible to enforce because the cost of training models is dropping exponentially. Besides, licensing will increase market concentration, harming competition and worsening many AI risks.
A safe harbor for AI evaluation and red teaming
By Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Arvind Narayanan, Percy Liang, and Peter Henderson
Independent evaluation of AI is crucial for uncovering vulnerabilities, but this is often prohibited by AI companies’ Terms of Service, or left vague. This has a chilling effect on research. We were part of a group of researchers calling for change through a paper and an open letter. The effort has been impactful both in public policy and in getting companies to make changes to their own policies.
Generative AI’s end-run around copyright won’t be resolved by the courts
The New York Times’ lawsuit against OpenAI is filled with examples of ChatGPT outputting near-verbatim copies of text from the NYT. But output similarity is almost totally disconnected from what is ethically and economically harmful about generative AI companies’ practices. As a result, the lawsuit might lead to a Pyrrhic victory: it would allow generative AI companies to proceed without any significant changes to their business models.
Artists can now opt out of generative AI. It’s not enough.
We argue that generative AI companies cannot do right by artists without changing their business model, and give many examples of how AI companies externalize the costs of their products onto others.
Tech policy is only frustrating 90% of the time
We think there is nothing exceptional about tech policy that makes it harder than any other type of policy requiring deep expertise.
What the executive order means for openness in AI
The 2023 Executive Order on AI is 20,000 words long and tries to address the entire range of AI benefits and risks. It is likely to shape every aspect of the future of AI, including openness: Will it remain possible to publicly release model weights while complying with the EO’s requirements? How will the EO affect the concentration of power and resources in AI? What about the culture of open research? It’s good news on paper, but the devil is in the details.
Three Ideas for Regulating Generative AI
We were part of a Stanford-Princeton team providing policy input to the federal government. We advocate for transparency, holistic public evaluations, and guardrails for responsible open-source AI research and development.
Generative AI companies must publish transparency reports
We draw on the parallels between social media and generative AI, and call for transparency on how these tools are being used. Later we co-authored a paper (AIES 2024) with Stanford and MIT researchers in which we put forth a much more comprehensive blueprint for transparency.