There’s no question artificial intelligence technologies are becoming increasingly powerful; it’s clear to anyone who spends time on the internet. Given that, how freely available should this technology be?
Open-source advocates say more sharing boosts healthy competition and distributes power while furthering scientific research and collaboration. But open models also pose the risk of aiding nefarious uses of the tech, from non-consensual intimate images (NCII) to election interference.
A new Science paper from researchers at Stanford’s Institute for Human-Centered AI (HAI) aims to take a clear-eyed look at just how much marginal risk open models pose relative to their closed counterparts, as well as the benefits they offer and the policy considerations they raise.
We spoke with Rishi Bommasani, society lead at HAI’s Center for Research on Foundation Models and co-author of the paper, about where AI is actually proving most dangerous, why openness is important, and how regulators are thinking about the open-closed divide.
This conversation, the first of two parts, has been edited for length and clarity, and contains references to materials related to child abuse.
Since you released a draft version of this paper about a year ago, have you seen mounting evidence of risks? How has that equation changed?
One of the key things we’ve been doing is trying to have this disaggregated view across the different risk vectors of where there is more accruing evidence and where there is not. To me, CSAM [child sexual abuse material] and NCII stand alone for the most part, because we have incredibly comprehensive evidence of marginal risk. The [Internet Watch Foundation] has a statistic here, where they talk about how the volume of CSAM on the internet has gone up about tenfold in the past year, year and a half. And this has two harms. The first, which is the harm most people think about, is that we want to stop the dissemination of this content on the internet or its distribution by social media platforms or others. But there’s a second harm, in the CSAM case especially, which is that law enforcement is tasked with trying to rescue children who are the subjects of child sexual abuse, and, obviously, law enforcement capacity is not going to go up by a factor of 10…And so there is this paralysis where law enforcement is not able to identify which images contain real children being sexually abused versus synthetic children being sexually abused. And, of course, it only makes sense to try to rescue real people, so that is a very concrete harm. The other thing to say is that we have seen that a lot of the generated content is specifically from open models, like Stable Diffusion as an example.
Now that’s one extreme where I think the severity of the harm has increased, but the evidence of marginal risk was already there for a while. A harm that sits somewhere in between is voice cloning and other types of scams, where I think the capabilities of the models have gone up. There are still some questions (and the FTC is looking into this) about the extent to which the existing defenses we have in society against voice cloning address these new concerns from AI. And so there’s definitely more evidence than there was a few months ago.
And then the last is on bio. There are some recent developments there, too, but what we saw over the last six months or so was OpenAI, Rand, and the UK’s AI Safety Institute conducting a series of studies where they generally found models didn’t give a statistically significant edge over existing tools like a search engine for this kind of bioweapon concern. Now, more recently, there are some newer developments, with the release of o1 and things like this from OpenAI. So, right now, I haven’t seen anything that has materially increased the evidence base for marginal risk for bio, but I think it’s very much an active area where people are monitoring it.
Are there areas of risk where it’s more difficult to gauge evidence?
There are a few core challenges I’d name here. One is that when we’re thinking about malicious use, you have to sort of red-team things—you have to take on the perspective of an adversary. And our capacity to model the way malicious actors behave is pretty limited. Compared to other domains, like other elements of security and national security, where we’ve had decades to build more sophisticated models of adversaries, we’re still very much in our infancy.
The second is the resources of the adversary. Cybersecurity is a good example here. Basically, there is a suite of different things you might worry about. You might worry about a low-resourced actor doing low-effort attacks against cyber infrastructure that’s fairly weak, versus, at the other extreme, some kind of nation-state conducting a large-scale cyberattack. And the problem is that…even though we might group them both under cybersecurity, these are vastly different scenarios when we try to estimate what the relevant evidence is. For example, an open model may not pose much marginal risk in the nation-state case, because the nation-state may already be able to build such a model on its own.
And then the third…is that when we’re in the category of malicious use, you very rarely accrue clear evidence of harm. Disinformation is a good example—like, we might know election interference is happening, but it’s often very hard to know who the source of the election interference is and, specifically, whether they’re using an LLM or some other model; you don’t have that kind of attribution. So, in terms of building evidence of concrete harm, we might be able to observe that something is happening, but we might not be able to attribute why it’s happening or whether AI is involved.