Katja Grace on Slowing Down AI and Surveys
Katja runs AI Impacts, a research project trying to incrementally answer decision-relevant questions about the future of AI. She is well known for a survey published in 2017, "When Will AI Exceed Human Performance? Evidence From AI Experts", and recently published a new survey of AI experts, "What Do ML Researchers Think About AI in 2022". We start this episode by discussing what Katja is currently thinking about, namely an answer to Scott Alexander on slowing down AI progress.
(Our conversation is ~2h long, feel free to click on any sub-topic of your liking in the Outline below. At any point you can come back by clicking on the up-arrow ⬆ at the end of sections)
- Contra Scott Alexander On Slowing Down AI Progress
- Why Advocating For Slowing Down AI Might Be Net Bad
- Why Slowing Down AI Is Taboo
- Why Katja Is Not Currently Giving A Talk To The UN
- To Avoid An Arms Race, Do Not Accelerate Capabilities
- How To Cooperate And Implement Safety Measures
- Would AI Researchers Actually Accept Slowing Down AI?
- Common Arguments Against Slowing Down And Their Counterarguments
- To Go To The Stars, Build AGI Or Upload Your Mind
- How Katja Forecasts AI Risks
- What do ML researchers think about AI in 2022
- Katja’s Thoughts On Automation And Cognitive Labor
Contra Scott Alexander On Slowing Down AI Progress
Why Advocating For Slowing Down AI Might Be Net Bad
Michaël: You’re currently writing an answer to Scott Alexander. Can you explain who is Scott Alexander for people who don’t know him?
Katja: He’s a blogger. He writes about a wide variety of topics because he’s a very good writer and a good thinker.
Michaël: What’s his blog?
Michaël: What is Astral Codex Ten about?
Katja: Well, sometimes artificial intelligence, it seems like.
Michaël: I think he now has something called ML, it’s Machine Learning Alignment Weekly. [EDIT: It’s actually Machine Alignment Monday]
Katja: Really? I admit I’m not very up on reading blogs at the moment.
Michaël: Oh. But you’re still trying to answer one of his blogs?
Katja: Yes. This one seemed important.
Michaël: What’s the blog post you’re answering?
Katja: He wrote a blog post about whether, or why, we shouldn’t try to slow down artificial intelligence progress. He’s noticing that lately, various people have been saying that maybe we should slow down AI progress if AI progress is a threat to the existence of humanity. I guess he outlines some reasons you might not want to slow down AI progress.
Michaël: What do we mean by AI progress here?
Katja: I think it could mean a few different things. AI is very broad. I think the thing that is maybe a threat to humanity is a narrower kind of thing, sometimes called artificial general intelligence or more agentic AI or superintelligence. But basically, something that’s ultimately very powerful and probably somewhat agentic in the sense that it has goals and pursues them in the world. I guess it’s somewhat of an open question, which AI progress is helpful toward that. I think it would be plausible that you just want to slow down a fairly narrow set of AI progress that is leading that way.
Michaël: Are you saying that, basically, making progress in AI in general might result in us building artificial general intelligence and, to not have that happen, we should just slow down AI in general?
Katja: Plausibly. I’m also saying that perhaps to not have that happen, you don’t need to slow down AI in general, but just a subset of AI. Seems like an open question.
Michaël: I think this is important, because in the communities we are part of, the AI alignment community and the effective altruism community (sometimes called ‘EA’), people don’t like to talk about slowing down AI progress. They prefer to try to accelerate research in AI alignment and solve technical AI problems. Why do you think people have been tabooing slowing down AI progress for the past 15-ish years? ⬆
Why Slowing Down AI Is Taboo
Katja: I think lately people have been more interested in talking about it, to be clear. In the longer term, I’m actually pretty unsure why it’s been talked about so little. I think there is some sense that it’s uncooperative or also maybe just fraught. Because if you want to slow down AI progress, your goals are contrary to some other people’s goals perhaps.
Katja: That sort of thing is potentially political. And I think maybe this community is full of people who’d rather do things that are not conflict-y or political, that’s probably one part of it. I guess I’ve talked to people a bit lately about why they’ve not been that interested in this kind of thing. Someone said something that surprised me a little bit, just a sense that people want to be pro-technology. They do love technology and it feels bad to be on the side of anti-technology on some particular important technology.
Michaël: Even more so in the Bay Area, if you live in the Bay, it’s pretty hard to be anti-tech.
Katja: Or especially for people who came to all this maybe being excited about AI, or being excited about transhumanism or the future being amazing. I think relatedly, another one that surprised me was the thought that people are actually just pretty excited about a post-AI future that is really amazing. If we were to slow down AI successfully a lot, maybe we don’t get to that in our lifetimes. It seems nicer overall if we could aim for a solution where we get to a future full of flourishing quite quickly with the help of AI, but where the AI is also safe, rather than aiming for the AI being safe by sacrificing our own ability to live long and healthy lives, or that sort of thing that might be helped by AI.
Michaël: Are you basically saying that some people are older or have longer timelines and think that if we slow down AI too much, there’s a small chance that they will achieve longevity or be able to live hundreds of years, and so they’re pushing towards AGI sooner?
Katja: I think there’s at least a bit of that. I don’t know. My guess is that’s not the main motive for why it hasn’t been discussed more. These are some things that I think are coming up for people.
Michaël: You shared with me a draft of the post you’re going to write answering Scott Alexander. In your draft, you summarize the conversation between people who were very optimistic about technical AI alignment research 15 years ago and people wanting to slow things down right now. I found the summary fun, but I don’t know if you remember how the conversation went.
Katja: I think it was roughly something like 15 years ago. Some people were like, “Oh man, AI. This sounds terrifying. We need to build an even better AI that needs to take over the world and also be perfectly good.” Onlookers are sort of like, “Wow, that sounds optimistic.” The people who are in favor of this are like, “Yeah, but it’s really important. The world is at stake and there are many of us and we’re pretty competent, maybe we can do this.” That’s really great. Then, after 15 years or something, they’re like, “Ah, this is pretty hard.” Some of them are like, “We give up.”
Katja: Then people are like, “Ah, what about this AI taking over the world and killing everyone though? That sounds like a problem. Maybe we should try and not build that.” Then I feel like a similar set of people are like, “Whoa. Yes, we think we can build an AI to take over the world that is perfectly good. We’re ambitious, but we’re not delusional. You think we can coordinate with China? That’s not possible.” I guess I feel like there are just very different standards here for how ambitious to be about these two different projects, how much of a can-do attitude to have toward them.
Michaël: I think another post by Scott Alexander is about fighting Moloch, this invisible hand. I don’t know if it’s about capitalism, but it’s like humanity doing bad things without any actual agent doing them. I feel some people in the community think that fighting Moloch is harder than just solving math equations. I think I get what they mean, but I think it’s underappreciated how many things we can do in the political space to slow things down.
Katja: I think it also seems good to be clear on when things are Moloch or what Moloch looks like exactly. I think in this case, if you think you’re in the extreme scenario where AI is very likely to destroy humanity, I think it’s not in people’s interests to build it, for the most part. There may be corner cases where it is in their interests. I think it’s an easier situation to make it go well if it’s actually just not in people’s interests to do it. ⬆
Why Katja Is Not Currently Giving A Talk To The UN
Michaël: How do we convey this message to governments?
Katja: I don’t know. I’m not really an expert in that sort of thing. I’m assuming governments are made of people. I think the usual talking and stuff like that. That’s where I would start.
Michaël: Why do you think the, let’s say, longtermist effective altruism community requires such high standards for plans to convince people that we should slow down AI progress, but doesn’t have the same high standards for, let’s say, trying to build aligned AI? We can say to people, “Oh yeah, try to build aligned AI, that might work.” Whenever someone comes up with a plan to slow down progress, they’re like, “Oh no. You shouldn’t do this by yourself. Please have a much more rigorous plan for everything.”
Katja: I don’t know. I think that’s a really good question. I think maybe one thing going on is that if your plan is going to have to involve interacting with a bunch of other people, it’s easier for it to feel immediately like it’s in a normal reality where the normal rules of the genre apply. It doesn’t seem very plausible that your efforts to coordinate will work or something. Whereas, if it’s a kind of project you can do by yourself in your room, it’s easier to be really ambitious about it, I would guess. This is really off the top of my head.
Michaël: If it’s something you can do by yourself, like writing a paper or solving some technical problem, it seems like progress you can do by yourself that most people do. Whereas, if it’s something that’s never happened before and requires you to convince other people in the real world, then it seems more farfetched.
Katja: I don’t know if it’s about whether it’s happened before, but I’m thinking, for instance, if I set out to try and solve the alignment problem in my room, I feel like I could get somewhat enthusiastic about it. Maybe the fact that no one’s done it before isn’t that much of a hindrance to my enthusiasm. If you were like, “Hey, do you want to give a talk to the United Nations about this thing?” Then whether or not this is going to seem ridiculous to people, it weighs more.
Michaël: When is the Katja Grace talk to the United Nations?
Katja: It’s not on the agenda.
Michaël: Hopefully, we can do another episode after you talk to them. ⬆
To Avoid An Arms Race, Do Not Accelerate Capabilities
Michaël: Another thing you started writing about is the context of the arms race. This happened with the Cold War and might happen again with AI. How do you spread the message that there’s an arms race happening?
Katja: I think you mostly shouldn’t spread the message that there’s an arms race happening. That’s an easy communication problem, I guess.
Michaël: You don’t talk about it, rule number one.
Katja: Well, I do. Maybe you should talk about it. But the first question is: are you actually in an arms race? It seems not obvious to me that the AI situation at present is an arms race or likely to be an arms race. I think in the classic arms race scenario, it’s not the case that when you win the arms race, you get destroyed.
Katja: In the very simplified version of this arms race, it’s quite unlike a usual arms race. The usual arms race is a prisoner’s dilemma, basically. You always have an incentive to build more arms regardless of what the other player does. That’s not true if you get killed in the end. Then you can build more complicated models, and sometimes you should race perhaps depending on the details of it, but often, not.
Michaël: I guess every time you build a new nuclear weapon, you increase your offensive power, your retaliation power against, I don’t know, Russia or China. You also increase the strength or the impact of the war if it happens. When you’re building AGI, there’s some chance, depending on how optimistic you are, of building misaligned AGI that could kill most people on the surface of the earth. This is a risk. I guess maybe governments just think this risk is quite small, so they’re not thinking it will literally kill them. They’re thinking there’s a small risk.
Katja: Whereas it seems like a common view in the AI safety community would be that there’s maybe more than 50% risk or something.
Michaël: Is that the common view?
Katja: I guess that’s the common view. I haven’t checked carefully.
Michaël: I think, for people who are building huge language models and working in machine learning, I think they would put it at less than 50%. Maybe they would put it between 0.1% and 20%. That’s what I get from them. I guess people who are on the doomer side would put it at higher than 50%, even closer to 90%.
Katja: That sounds right.
Michaël: I think they’re different risk assessments. I agree with your view, that it’s more risky than not risky.
Katja: In recent evenings I have made an elaborate spreadsheet model of arms races between people building different AIs, where the model is kind of like, well, if you can put your effort into winning the race or into more safety. I think this is unlike the normal arms race situation in that, why do you want to win rather than the other side? It’s partly because you think that you’re more likely to have an aligned AI. ⬆
How To Cooperate And Implement Safety Measures
Katja: If you’re slowing things down and adding to alignment, it’s not obvious that the other side won’t end up with a more aligned AI than you would’ve had. If you’re racing and therefore not having much effort to put into safety, then that could end up being a worse AI than if you contribute to safety, which then helps the team that does get ahead to be safe. Also, they’re slower because you’re not contributing to progress. I think depending on the details of the model, it can go either way on whether just unilaterally you should want to slow down, I think.
Michaël: You wouldn’t want to slow down everything and maximize safety or at least maximize safety overall. Maybe if some company is building AGI first, you might want to join them and help them make it safe.
Katja: I’m imagining that you don’t even necessarily have to join them if you’re doing safety research. Either you’re doing safety research that is generally helpful to other people who can read it, presumably you’re not keeping your safety research secret. Or you’re not contributing to progress. Progress is slower and whatever safety research is happening, there’s more time for it. It seems like either of those is helpful for whatever project finally gets there being more safe.
Michaël: That’s assuming that the first company to build AGI will read the paper and implement it. Let’s say if I’m running a safety blog post on lesswrong.com, I’m assuming that either other people will read it and it’ll be useful for other safety researchers or that someone at Baidu will end up reading the LessWrong post and be like, “Oh yeah, I should probably write this down.”
Katja: I guess I agree that the safety research does need to get to the people. In my model I was assuming that maybe 50% of it gets to them, or that was a parameter you could vary. If it’s down at 10%, then more often it becomes worth racing.
Michaël: I don’t know if you’ve played the game ‘Intelligence Rising’. It’s this role-play game by Shahar Avin at CSER. Where you’re like, every actor is a member of a company or represents a company.
Katja: I’ve played some sort of role-playing game passionately.
Michaël: I think it’s probably this one. I think it’s similar to the thing you were playing a few evenings ago, where every turn you need to either build a new safety measure, do research and safety, or research and capabilities, or you can attack and spy on other people. Hopefully, my episode with Shahar will be up before this one, so it will make sense for people.
Michaël: About coordination, I guess if I were to role-play as someone disagreeing with you, I would say, “What about China? Has China ever agreed on anything at all?” If we were trying to slow down AI progress for everyone at the same time and China was not cooperating and just trying to speed-run AGI, how would it play out in your scenario?
Katja: I guess I think they have agreed to other things. I don’t really know the details. I think also, often the things that people want others to agree to slow down on probably do net benefit the person or country who wants to go fast. For instance, climate change. My guess is that it’s net good for China in particular to release more carbon dioxide because it’s a tragedy-of-the-commons type situation where the harms go to everyone. I think if the AI safety community is right that AI is that bad, it’s much less clear that it’s worth it for China to do that. I think there’s also, separately, hope from them realizing that.
Michaël: So we need to convince them that if they build it, it might explode in their hands?
Katja: Yeah. If that’s true.
Michaël: Well, it seems plausible to me. Otherwise, I wouldn’t be doing this podcast.
Katja: But it seems like if, I don’t know, if you really think you can’t convince anyone in the world of a thing, you should probably wonder whether you’re right about it.
Michaël: How did we end up convincing China to slow down carbon dioxide emissions? If we want to replicate this, maybe it’s worth thinking about, how long did it take? Maybe it took 20 years to convince them. I’m not really sure about the history.
Katja: I’m afraid I also don’t know much about the history of such things. ⬆
Would AI Researchers Actually Accept Slowing Down AI?
Michaël: I’m generally curious about what we could be doing. Another argument that you mentioned in the draft is that, maybe it’s an argument from Scott, that we’re friends with a bunch of people in the AI community and we don’t want to upset them. If we were to slow down progress in AI in general, we might slow down progress at companies like DeepMind, OpenAI, Anthropic, in the US or the UK. And this might annoy people who are working hard to make progress on safe AI at those companies. What do you think about this argument from Scott? Do you think it’s true? Do you think it’s important to not annoy these people?
Katja: I think I disagree with the argument. I think there’s some value to not annoying people. I think it’s probably wrong to think of it as an alliance where, if we were to pursue slowing down dangerous AI, then that would be like stabbing them in the back or something. I think, for one thing, a lot of people who work on AI capabilities do think there are serious concerns with safety. So I think it’s very unclear what different people want here, and probably there are a lot of different views. It seems like, among people working on AI capabilities, I guess in this recent survey I worked on, a lot of them said they thought there was more than a 10% chance of an extremely bad outcome.
Michaël: More than half or something said there was.
Katja: I think it was like 48% who said there was at least a 10% chance. We asked a few different questions about extinction because we were unsure what this kind of ambiguous wording about badness, rather than extinction, really meant. I think the wording was “e.g., human extinction”, but maybe they didn’t mean human extinction. We asked some more questions specifically about human extinction or permanently disempowering humans, and the numbers were also 5% or 10% as the median.
Michaël: The people you’ve interviewed were mostly doing research in deep learning, machine learning, and they were authors at NeurIPS, ICML, those kind of conferences, right?
Katja: They were all authors at NeurIPS or ICML. We just wrote to 50% of the authors at those conferences.
Michaël: They’re in this pace of doing actual research, they’re like researchers. I guess one of the concerns about slowing progress is to convince people to do this. One thing is convincing researchers, and the other thing is convincing regulators or government people. I guess one argument could be like, how do we even convince those people to implement such drastic measures? In the sense that we haven’t managed to convince a lot of people yet. They’ve tried to convince people that AI risk was important. Technical AI alignment was important, and they have not succeeded after 15 years of effort. I guess that the main argument is, if they cannot be convinced of AI alignment, how do you convince them of shutting down their company?
Katja: I guess, in what sense has it been unsuccessful in convincing people of this? It seems like, among these AI researchers, a lot of them think that these are important problems. Also, comparing the last survey to this survey, I think there was a big movement in the direction of thinking that something like the alignment problem is important and hard and that sort of thing.
Michaël: I guess we can disagree on exactly where those guys need to be. I think the first survey was published in 2017, with the questions maybe asked in 2016.
Katja: That’s right.
Michaël: Now we’re in 2022. Depending on how long are your timelines. I think we should have 50% of people thinking there’s a 50% chance that this is bad. There are at 10%. If there were a 10% chance of something bad happening, I wouldn’t be that worried about it.
Katja: Really? Like a 10% chance of the world being destroyed.
Michaël: It’s different from a 50% chance or a 90% chance.
Katja: It’s different in some sense, but I think on the question of, should we be putting huge amounts of effort into this? I think most people would agree that I don’t know, if you thought there was a 10% chance that the illness you just got is going to cause you to die, most people would freak out, I would say.
Michaël: Well, I don’t think they’re freaking out enough, otherwise, they wouldn’t be publishing NeurIPS or ICML papers on AI capabilities.
Katja: That’s probably right. I feel like getting people to agree to a thing is a pretty good step toward getting them to then take action on it. I think maybe we haven’t so much presented things for them to maybe do. Maybe they’re worried about it in some sense and there is potential for something like that.
Michaël: In your ideal world, they would sign something like, “Oh, this is an important problem.” Then we get those 500 researchers signing something and we just go to the UN and say, “Hey, this is important.”
Katja: I don’t have a good sense of whether that’s the best kind of diplomatic solution now. In general, I would think having a lot of the relevant scientists think that the technology they’re working on is dangerous is pretty helpful for regulating it in a useful way.
Michaël: Do you have any other ideas for lobbying? I have the idea of Greta Thunberg lobbying for climate change. It worked pretty well, right?
Michaël: Some people on Twitter have been doing this for ‘stop building AGI’. There’s this guy called Kerry who keeps tweeting something like “Stop building AGI” on every OpenAI thread. He’s getting some attention. Everyone I meet knows about this guy. I don’t know if you’ve ever seen one of his tweets.
Katja: Is this Kerry Vaughan?
Katja: I think I’ve seen a couple of them. They’re maybe not the ones in that style.
Michaël: Do you think people should go more towards that direction?
Katja: My tentative guess, if I had to come up with one without thinking any more about it, is yes. But the thing I’m more advocating for is thinking about this more seriously. My sense is that people mostly dismiss anything under the heading of ‘Slow down AI’ as soon as they can think of roughly one obstacle to it. And I mostly object to that and think that people should think more about it.
Michaël: The default position of like, let’s just err on the side of caution and think more about it and have more time and consider things more carefully.
Katja: Something like that, yeah. And including the default not being like, “Yeah, let’s just rush into a suicide arms race with China or something”. I feel like people too readily are like, “Oh yeah, that’s what we’re doing. We’ll probably die. But in the off chance that alignment isn’t important, maybe we’ll win instead of China. So all good”. ⬆
Common Arguments Against Slowing Down And Their Counterarguments
People Discuss It But Not Under This Uncooperative Title
Michaël: I’m just going to present some of the other arguments I saw in your draft that were kind of interesting. One is, “Maybe people are thinking about this plenty, but just not under the potentially uncooperative seeming title ‘Slowing down AI’”. Do you think that people will think about this but just not talk about it in terms of slowing down?
Katja: Seems plausible. But I don’t think there’s enough thinking about it going on under some other heading that this would explain the lack of talking about it. Over the years there have been a lot of people not thinking about it, as far as I can tell, and I think it would make sense for them to think about it. That’s just based on my own talking to people and being around them.
Michaël: Your experience from actually communicating with people outside of your room.
Katja: Sometimes. I used to. ⬆
The Faster AI Capabilities Work Happens, The Smoother AI Progress Will Be
Michaël: One other idea you’ve been hearing about is, “The faster AI capabilities work happens, the smoother AI progress will be.” And this is kind of good.
Katja: I haven’t thought through that a lot. But I think it does seem plausible to me that the major scenarios where someone grabs a lot of power are ones where someone, potentially an AI system, is a lot more capable than anyone else around. And so, if progress happens very gradually, then maybe you manage to get to quite advanced AI without there having been a step at any point where anyone was vastly more powerful than those around them and could take control of things. It’s all very abstract.
Michaël: Yeah, seems plausible. Another argument that people say a lot is that “By working on AI, you’re kind of making productive progress on alignment, and if you’re not building AI at all then you’re not making progress on alignment. And so you’re just buying time. But at the end of the day, you’re just dead”.
Katja: I guess my impression is that you can work on alignment without forwarding AI progress very much, or at least, that there’s a lot of space for varying levels of forwarding AI progress. With different alignment plans. But I don’t know the details of this very well.
Michaël: Some people think you would make most of the progress about AI alignment in the last few months before you build AGI. If this is true, then you could try to make a lot of progress but it won’t be the full progress you need. But I agree that generally, it will be better to have more time and not accelerate things too much.
Katja: Yeah. I guess this thought that most of the safety work will happen in the last few months comes from some sense that you can’t really prepare for things ahead of time or make progress on any kind of safety thing well in advance. I just don’t know how good the empirical evidence for that is. It seems to me like people very rarely try; I see less of people trying and definitely failing.
Michaël: I think, if you’re trying to plan for the next 10 years and making concrete proposals for like, how to slow things down and what to present to the UN, it seems like a harder problem than just writing some equations about AI alignment.
Katja: Sorry, I think I didn’t follow that.
Michaël: Oh, sorry. I’m just saying that it seems harder to write down regulations and plans for the next 10 years than actually solve a technical equation.
Katja: Yeah. As in, there are some kinds of preparation you can do that are more helpful across a range of circumstances. Maybe they’re easier to do ahead of time. And other ones that are more dependent on circumstance, so it’s harder. ⬆
It’s Like Discussing Terrorism
Michaël: There’s a few reasons why people are not trying to slow down the AI progress. So one, we’ve already talked about is most people are pro-technology at heart. Especially in Silicon Valley. Another one is that when you talk about slowing down AGI, it makes people think about political moves, like some kind of terrorism or other non-cooperative conflicts.
Katja: Yeah, that’s my vague sense.
Michaël: Do you think that’s the main reason people avoid thinking about those things?
Katja: I think it might be a big reason. More generally, perhaps, people are thinking of slowing down AI as a relatively radical move. Whereas, I think technology has been slowed down very often, and scientific progress has been slowed down. All of medical research is substantially slowed down by efforts to make it safer.
Michaël: We’ve seen this by how long it took to get some vaccines approved by the FDA.
Katja: For instance, yeah.
Michaël: So we should make AI regulation closer to medical regulation.
Katja: Maybe. ⬆
You Can Still Slow Down With Economic Or Political Pressure
Michaël: Another point you make is that the external pressures to build AI, maybe economic pressure or political pressure because it might be useful for countries in the future, are not infinite. Those forces have limits, as we see with weapons and other kinds of regulated things. So, do you think people are wrong in thinking that economic pressure is impossible to resist? We could have other incentives that are stronger than economic incentives.
Katja: My sense is that people do treat such things as infinitely strong, as if, when there’s a strong economic motive for something happening, you can’t fight it. Empirically, medicine is a good example, where there’s a huge amount of economic value we could get from medical things happening faster. I recently learned that strep A kills half a million people a year, and human trials on a vaccine for it, I think, were just banned for 30 years. You might think, “Wow, there’s just so much value on the table, surely that would happen.” But no.
Michaël: What’s strep-A?
Katja: It’s a kind of infection. I think it’s actually a thing that can give you strep throat here, which isn’t so bad, but a lot of people elsewhere in the world die from it as a result of other complications. I think heart complications. But I looked into this very briefly, so I could be getting things wrong here. ⬆
To Go To The Stars, Build AGI Or Upload Your Mind
Michaël: Someone who has been very vocal about this is Sam Altman. He’s been a contrarian to the slowing-down-AI movement, or just the longtermism movement. He’s been saying, “Either we figure out how to make AGI go well or we wait for the asteroid to hit.” And then he said, “I’m optimistic we can do the former and I do not believe we can colonize space without AGI.” Do you think we should go for the stars and try to build AGI in our lifetime or not?
Katja: I don’t think that going for the stars requires building AGI in our lifetimes. The deadline for colonizing the stars seems like it’s not very pressing, as far as I know. I also don’t actually know the argument he has in mind for, “We need to have AGI to colonize the stars”. It seems hard to send humans out to the stars. So it seems like we need to have some sort of non-human version of something like ourselves. I guess something like mind uploads seem like a sort of AGI that we might eventually get that is more transportable. Speculatively.
Michaël: So, speculatively, you’re more excited about mind uploads?
Katja: So I think probably ultimately, I’m pretty excited about AGI and I don’t imagine that you would need to delay it for more than a hundred years or something. So I think it’s not really relevant. If it was important to get AGI to go to the stars then we could do that, probably. If, somehow, that was impossible and we couldn’t ever build aligned AI or something. Then I don’t think that would stop us from going to the stars. Or I guess I’m imagining that mind uploads would be aligned. As each one would be roughly like a human.
Michaël: So you build a bunch of mind uploads, and they’re roughly aligned, and you have many of them. So you ask them to solve alignment or you have much more power now.
Katja: I was imagining you just send them to space. But yeah. Probably, also they can do a lot of thinking. I guess I was imagining a scenario where, for some reason, it’s impossible to ever solve alignment. But I think that doesn’t seem likely to me.
Michaël: So you just send Katja Grace to space?
Katja: Perhaps, yeah. I hope so.
Michaël: What happens with those mind uploads? Do they just stay on the ship and then go on some planet and try to get more energy and travel more?
Katja: I imagine something like that. I haven’t thought through it much. But I imagine something similar to whatever other space colonizing creatures do, which I hope involves some sort of happy lives as well as aggressive colonization and grabbing of resources for more colonization.
Michaël: Is this where you see yourself in 100 years?
Katja: In space? I guess I hope. I’m not sure. ⬆
How Katja Forecasts AI Risk
Why Katja Thinks There Is A 7% Chance AI Destroys The World
Michaël: One thing you ask, I think at the end of your draft, maybe it will be at the end of the post when it’s published, is to have people think about it for a week. People have a bunch of justifications in their heads, some knee-jerk reaction to all of this. They have arguments, but they haven’t thought about it enough. So you kind of request they think about it for a week. How do we have people actually think about this for any amount of time?
Katja: I don’t know, that seems hard. I can’t reliably sit down and think about something for a week. But, maybe, desist from having strong opinions until you’ve thought about it enough to be more confident about them.
Michaël: So ask people to be more reasonable.
Katja: That seems generally good. I’d be in favor of everyone being more reasonable about everything.
Michaël: Is slowing down progress your main source of optimism?
Katja: No, I think my main source of optimism is probably that the AI safety community is mistaken about how bad this is. AGI is not going to destroy the world, for the sort of mundane reasons that other things don’t destroy the world.
Michaël: Wait, do you have more credence or optimism from people being wrong about AI?
Michaël: So, it means you’re not optimistic at all.
Katja: Wait, no, I’m pretty optimistic about people being wrong. People are wrong all the time.
Michaël: What would you say is the probability of people being wrong? I don’t think you want to say that on an affiliated podcast. But… Very likely, somewhat likely?
Katja: The likelihood that everyone’s wrong? People have different views about how likely doom is. My probability on, “AI destroys the world”, is probably something like 7%. So that’s a relatively high probability on some other people around being fairly wrong.
Michaël: So your probability is 7%? This is surprisingly precise. How did you come up with that number?
Katja: I wrote down a lot of arguments, put probabilities on them, and then combined them. But I also often make predictions about things for fun and put them in a spreadsheet. I recently got to 1,000. I put very precise probabilities on things. It’s better.
Michaël: I think it’s difficult to make predictions of less than 1% or more than 99%. It is good to stay within 1% precision.
Katja: As in, you think it shouldn’t be more precise than 1%?
Michaël: Yeah. I think when you’re trying to guess stuff on Metaculus, which is a website where you can make predictions, you cannot go below 1% or above 99%. Because a human brain cannot possibly be more than 99% confident about things.
Michaël: On topics on Metaculus.
Katja: I’m 99% confident about all kinds of things. For instance, when I eat a sandwich I’m more than 99% confident that it won’t kill me. Otherwise, I wouldn’t eat sandwiches.
Michaël: I guess, most of the Metaculus questions are kind of speculative, right?
Katja: Yeah, that’s fair.
Michaël: They’re pretty hard to predict. So you predict most of your stuff like eating sandwiches and will I eat this sandwich and then will this kill me?
Katja: I don’t have that one in my spreadsheet. It’s a lot of mundane things. Most of them are not really in the less than 1% or more than 99% range, though sometimes they are.
Michaël: I’ve tried doing those predictions as well. I wrote a short program on the computer where I can just type a letter and then write a prediction. But the question is, when do you actually review them? Do you do this on a daily basis in the morning?
Katja: I don’t have a particular habit, I sometimes get into it more or less. I’ve been doing it on and off since 2018, and there would probably be gaps where I didn’t do it at all.
Michaël: Do you actually recalibrate yourself? Things like, “Oh, every time I say ‘60%’, it means like ‘70%’”. So you try to go in the other direction?
Katja: I try to, yes. For things that aren’t my own behavior, I’m just surprisingly well calibrated, for the most part though.
Michaël: For things that are not your behavior?
Katja: Yeah. For things that are my own behavior, I’m not very well calibrated at all. For things that are my own behavior, I’m not very good around 80%, but for the other ones… I think my average distance is about 3% off of the actual frequency.
Michaël: That’s pretty good. Is this your Brier score or something?
Katja: No. Brier score takes into account whether you’re actually right or not. Whereas for instance, you can be well calibrated by thinking things are 60% and then they actually happen 60% of the time. Whereas, you would have a better Brier score if you actually thought things were 99% when they were true, for instance.
Michaël: You’re just well calibrated on how likely stuff is going to happen or not.
Michaël: This is pretty good. I think most people are not that well calibrated. The 3% is the average distance for every percentile or something?
Katja: Or for dividing it into, like, 11 buckets. Take things that are around 70%, say: if the average probability I gave for them was 70%, then the real frequency of them happening is within about 3% of that, less than 73%, say.
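The two metrics being contrasted here, bucketed calibration distance and the Brier score, can be sketched in a few lines of Python. The prediction log below is invented for illustration (it is not Katja's spreadsheet); only the shape of the two computations is the point.

```python
# Hypothetical prediction log: (stated probability, did it happen?)
predictions = [
    (0.9, True), (0.9, True), (0.9, False),
    (0.7, True), (0.7, True), (0.7, False), (0.7, True),
    (0.3, False), (0.3, False), (0.3, True),
]

def brier_score(preds):
    """Mean squared error between stated probability and outcome.

    Lower is better; it rewards being both calibrated AND confident,
    e.g. saying 99% for things that are true beats saying 60%.
    """
    return sum((p - float(happened)) ** 2 for p, happened in preds) / len(preds)

def calibration_gaps(preds, n_buckets=11):
    """Group predictions into ~10%-wide buckets and compare the average
    stated probability in each bucket with the actual frequency of the
    events happening. A gap of 0.03 is what "3% off" means above."""
    buckets = {}
    for p, happened in preds:
        b = round(p * (n_buckets - 1))  # bucket index 0..10
        buckets.setdefault(b, []).append((p, happened))
    gaps = {}
    for b, items in buckets.items():
        avg_p = sum(p for p, _ in items) / len(items)
        freq = sum(h for _, h in items) / len(items)
        gaps[b] = abs(avg_p - freq)
    return gaps
```

With this toy data, the 70% bucket resolves true 3 times out of 4, so its calibration gap is 0.05, even though the Brier score also penalizes the forecaster for not having been more confident on the true items.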
Michaël: This is pretty crazy. But I guess there’s a bias of those are things resolving in your lifetime or at least since we’ve made this, right? So we have no guarantee that…
Katja: You should listen to my 7% of AI Risk.
Michaël: 7% plus minus three.
Katja: Yeah, it’s true. I have had no opportunity to test it on things that happen outside of my lifetime, alas.
Michaël: Alas. But it’s kind of interesting that you have a precise model. Do you remember the kind of things that explain this number? Is it like… what are the conditional probabilities? Do you remember off the top of your head?
Katja: That went into the 7%? I don’t remember what the numbers were, I remember somewhat what the questions were, yeah. It was the basic argument for AI risk, to my mind, and then I was putting probabilities on the different parts of it. I take the basic argument to be something like, “There’s very likely to be powerful AI sometime soon. It’s very likely to be agentic”. As in, there will be instances of it that have goals and pursue them. “If there are agentic, powerful AIs, their goals are likely to be bad.” And then, “If all of that happens, they’re likely to carry out their bad goals and destroy the world”. ⬆
Why We Might End Up Building Agents
Michaël: I don’t even know how to start deciding whether an AI will be agentic or not.
Katja: For these different parts, if you broke it down into more things, so it’s like, “Well, are there reasons to think it will be agentic?” I think one reason is that agentic things seem pretty useful. So for instance, if you could have a tool… Suppose you want to make some kind of marketing material or something. If you have a tool that you can use to make the marketing material, that’s kind of good. But if you can have a thing where you just set it free and it goes and knows what needs doing and does it, that’s more helpful.
Michaël: So it’s generally more useful to have general AI that does things for you instead of just doing one virtual thing.
Katja: Something like that, yeah. Or, it seems like the task of knowing what the goal is and then figuring out what things should be done for that goal is just a task. And if you don’t have to do that task, that’s better. So the economic reasons for people to make things that are goal-directed-ish. I think there are also some other reasons around for expecting things to be goal-directed. There are these coherence arguments that say, “If you’re not very goal-directed, then from the perspective of if you were goal-directed, it would be good to be more goal-directed”.
Katja: Maybe that’s an uncharitable description. An example of this, that’s simple, is imagine you have circular preferences, for instance. You like apples more than pears, pears more than bananas, and bananas more than apples. And for any of these trades, you’re willing to pay a dollar for the one that you like more. Then, I can take your money by continually trading these things with you, and maybe you end up with an apple and I got $3, pushing you around the circle.
Michaël: Can I have your banana? I have an apple.
Katja: Yeah. So you might say, well if you realize that you are a creature like this, who’s going to go around in the circle, then you should prefer to change yourself into one who is not like that so that you don’t keep losing these dollars. Because the dollars may be helpful for you getting any of these fruit that you like.
Michaël: If I’m not agentic, I should self-modify and become an agent. Or, at least, some agent with normal utility function that doesn’t suffer from those traits that extract money from me.
Katja: That’s the thought, yeah. I guess I don’t quite follow how this works for things that are not really agents. For instance, if I’m a creature who wants to go around in the circle and lose money, you could say that, “I would prefer to not lose the money”. You could also say that, “I like losing the money and I would prefer to be changed in some other way where I lose more money. Because losing money is equivalent to going around this circle, which I like”. Once you’re incoherent, in some sense, everything is equally good. As far as I can tell, it seems like you need a somewhat different argument to say that, “In practice, a creature should change into a particular different creature”.
Michaël: I feel like this creature who just prefers banana to apple and apple to pear, would just be so happy to do trades all the time. Doesn’t really have a concept of, “Having more money is good”.
Katja: Right, yeah.
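The circular-preferences money pump described above can be written as a toy simulation. The `prefers` table and the dollar-per-trade price are just the fruit example from the conversation; nothing here is a claim about real agents.

```python
# Circular preferences from the example:
# apples > pears, pears > bananas, bananas > apples.
# So the agent will pay $1 to swap its current fruit for the one it prefers.
prefers = {"banana": "pear", "pear": "apple", "apple": "banana"}

def run_money_pump(start_fruit, trades):
    """Offer the agent its preferred fruit `trades` times, charging $1 each.

    Returns (final fruit, total dollars extracted)."""
    fruit, paid = start_fruit, 0
    for _ in range(trades):
        fruit = prefers[fruit]  # trade "up" to the preferred fruit...
        paid += 1               # ...paying a dollar for the privilege
    return fruit, paid
```

Starting from an apple, three trades take the agent around the full circle: it ends up holding an apple again and is $3 poorer, which is exactly the "you end up with an apple and I got $3" outcome in the example.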
Michaël: In my mind, if we build machines with deep learning, we will have machines that are of normal utility functions and not weird creatures invented on LessWrong.com. But I guess I might be wrong about all of this.
Katja: I think that’s another class of reasons to expect that things will be agentic, it’s just that if you make things using the current techniques, they’re likely to become agents. It seems like there are some arguments along those lines.
Katja: I don’t know enough about them to explain.
Michaël: Okay. My guess would be, that we might want them to have plans or use reinforcement learning or things that-
Katja: Yeah, I think that’s true.
Michaël: … end up building agents. I’ll be super down to have all your numbers and know your actual probability, but I don’t think you brought it and maybe that’s not the plan for now.
Katja: That’s true. ⬆
AI Impacts Answer Empirical Questions To Help Solve Important Ones
Michaël: The important part, apart from slowing down AI progress in general, is that you wrote this very cool survey and it’s not really published yet. It’s on the AI Impacts website. But maybe soon there’ll be a paper?
Katja: At least the paper on the arXiv, or I guess we might try and actually get it published somewhere.
Michaël: Maybe a few words on AI Impacts, because this is what you do, right? I’ve listened to other podcasts you’ve done, one with 80,000 Hours and one with Daniel Filan, where you explain a bit of the philosophy of AI Impacts. So we won’t go over it at length, but in short, what is AI Impacts about, and what do you think about when you work there?
Katja: The basic goal is to try and answer these questions about the future of AI that are important. Like, “Does AI pose an existential risk to humanity?”
Michaël: Yes or no?
Katja: Or, “What does it look like?” Is it going to be like a very fast intelligence explosion, such that suddenly out of the blue a single agentic AI is taking over the world? Or is the risk more like a giant economy slowly going off the rails and no longer being under human control in any way? Is it something else? Is it kind of like, misuse, accidents, that sort of thing? I guess the hope in general is to answer these questions via answering easier questions that bear on those things. And I guess empirically… We’re pretty interested in empirical things and often try and find things where there’s evidence from like AI that’s happened already or past technologies. That sort of thing. In practice, the research we end up doing is often about things other than AI.
Michaël: I remember there was something about cotton.
Katja: Yeah, it’s true. Where we put too much effort into learning about cotton gins in the 1700s at one point.
Michaël: More generally, when we’re trying to think about whether we will have slow or fast takeoff scenarios, you might want to look at trajectories of other technologies in the past and see how it has happened.
Katja: Yeah, but less because you expect that trajectory to be just like the other trajectories in the past, necessarily. Because maybe there are reasons to expect this to be different, somehow. But also to check those reasons for why you might expect it to be different. Like if people say, “This is much more likely to have giant discontinuous jumps in it because it’s a kind of software thing and software can just suddenly get much better”. Well then you can look at past software things and say, “Did they get continuously better?” That sort of thing. So it’s often more complicated.
Michaël: One thing I disagree with Gwern, and I want to get this out in the podcast, is that I think it’s likely that we’re going to get some Manhattan Project investments on AGI. Like some governments spending some marginal fraction of their GDP on building AGI. And he thinks that Manhattan Project was something very special. Like the US thought Germany had it or something. So do you think we can look at past examples, from the nuclear war or Cold War or something, to see what would an arms race look like or budgets from governments?
Katja: I definitely think looking at things that happened in the past is a great start for guessing what will happen. I think in general, looking at nuclear weapons as an example is a thing you want to do knowingly and thoughtfully. Because it seems like nuclear weapons are one of the more surprising technological things that have happened. It was a huge discontinuity. It was like we discovered a new form of energy, that doesn’t happen that often. And so I think there’s a risk just always looking at nuclear weapons as your one example of what technology looks like and then being wrong because it’s unusual in many ways.
Michaël: But maybe that’s the right reference class for AGI, right? This is something unusual, like a new form of energy.
Katja: Maybe, but I guess it seems good to actually check that that’s true. I feel like there’s maybe a circular discussion that happens where you’re like, “AI, it’s going to be totally different from anything,” and you’re like, “Why do you think that?” “Well, look at nuclear weapons.” You’re like, “Wait, why are we looking at nuclear weapons?” “Well, because it was crazy and different from everything, like AI will be.” And I guess it seems good to do more object-level checks than just saying that’s the best analogy.
Michaël: Yeah. Look at cotton…
Katja: Cotton gin.
Michaël: Cotton gin, yeah. And I think those are, at least, very useful estimates to have. If you’re trying to estimate future AI hardware spending, it’s worth thinking about how much spending happened in the past, or extrapolating some trends in GPU prices, that kind of thing. And if you have one of those questions, in general AI Impacts is the right place to start.
Katja: Maybe. There are a lot of those questions to have, and we’ve only covered a very tiny corner, but you might check it. ⬆
What do ML researchers think about AI in 2022
The 2022 Expert Survey on AI Progress
Michaël: But yeah, I think the thing I’m most excited about right now is that you have this new survey from 2022 on your website and you have this summary of results. Maybe just start with some context from these questions like what were the questions you asked in 2016 and are they the same questions you asked this year?
Katja: The questions are very nearly the same. They’re all the same except I changed a couple of questions slightly because they were actively misleading now. As in they implied that things were maybe impossible that were no longer impossible or something, so I wanted to stop them from implying that. And, I also added two questions about human extinction explicitly because previously, we had this question, there was sort of “divide the probability between very good, good, mediocre, bad, very bad outcomes” and in retrospect, it was somewhat too ambiguous what people actually meant by that, so I wanted to check.
Katja: Part of my goal in doing the survey was to make it extremely similar to the 2016 one. Both of them had really a lot of questions. Each person got a randomized subset of a very large number of questions, so the headline questions were about when there’ll be something like AI that can do everything that humans do and how good or bad that might be. There are also questions about intelligence explosions, whether things will go very fast after human level AI, some about what inputs matter for AI progress like to what extent it’s mostly hardware that has helped versus other things? Various questions about safety and whether people like safety or want more safety.
Michaël: Do you like safe AI?
Katja: Yeah, exactly.
Michaël: If you could have safe AGI or unsafe AGI.
Katja: Yeah, it’s important to ask questions. You really find out what people think. And then I guess there are 32 narrow capabilities that people are asked about, like when will AI be able to write a song that’s as good as a Taylor Swift song, or something like that. Then some more as well that I’m forgetting.
Michaël: Some of the questions were trying to ask for timelines-
Katja: That’s right.
Michaël: … For when we will get some things or other things. So some were like, “When will we get good at Go, the game of Go?” I think this was 2016, right?
Michaël: With the same training as a human being. I think this was funny because people cared a lot about Go in 2016, maybe less now. And other things were just an average or maybe a different question about… Is it human level machine intelligence, HLMI?
Katja: High level machine intelligence. ⬆
High Level Machine Intelligence
Michaël: High level machine intelligence. What is high level machine intelligence?
Katja: When unaided machines can accomplish every task better and more cheaply than human workers, ignoring things where it’s intrinsically advantageous to be a human, for instance, being on a jury, and you know, where maybe you have to be a human or something, but it’s not like you have better skills as a result of being a human.
Michaël: So it’s advantageous to be a jury because we have laws for this?
Katja: Well, I guess at the moment we probably don’t have AI that’s good enough to be helpful on a jury as far as I know, but you could imagine that in the future, maybe a jury is required to be humans. In which case, if we had an AI that was good enough to do that task, but it wasn’t allowed to then, we’re not interested in that for this question. That would still count as AI can do everything even if it wasn’t allowed to do that task. And for all of this, the issue is when it’s feasible, not when it’s actually adopted in the world.
Michaël: I don’t know when we’ll get AIs as a jury, but the questions you asked a few years ago, and maybe you asked them this year again, were about writing a New York Times bestseller, being a surgeon, automating all jobs, like full automation of labor, and automating the job of an AI researcher. As I remember, the funny part was that automating the job of an AI researcher was sometimes predicted to be more difficult than automating everything. I’m not really sure, because now I’m looking at the graphs and it seems like automating everything is 125 years median whereas AI research is 80-ish years.
Katja: Yeah, at least the medians make sense.
Michaël: So what did not make sense? Was it the mean?
Katja: I actually don’t remember. Sorry.
Michaël: No worries. So about this human, sorry, it’s not about humans at all, high level machine intelligence, what kind of predictions did you ask? Was it, when will we get high level machine intelligence?
Katja: We gave them three different probabilities, 10%, 50% and 90%, and we asked them in what years that amount of probability would’ve been reached basically. And then, a different set of people, we gave them years and we asked them what the probability was that we would have high level machine intelligence by those years. And then, a different set of people again, we asked about not high level machine intelligence but full automation of labor, which is when AI can do all of the occupations. Which, to my mind is quite similar to when will AI be able to do all of the tasks. But it turned out to be much later.
Michaël: When you say all of the tasks, do you mean economically viable tasks like in people’s definitions of AGI sometimes?
Katja: I think I just meant all of the tasks.
Michaël: All of the tasks. Yeah, maybe you just disregard things that would require a high level of dexterity from robots, like, I don’t know, walking like a human. That kind of thing is kind of useless.
Katja: Maybe they’re kind of useless, but I think still the definition we’re talking about is just, it’s feasible to do them if you wanted to. But note that it’s not like there’s one particular system that can do all of them. It’s more like, there’s some system that can do any task that a human can do.
Michaël: Right. Okay, that’s one thing I was not understanding correctly. It’s not one general thing. It’s, as a whole, the entirety of things we can do with AI is more than what humans can do and cheaper for all of them.
Michaël: Do you think there are things that will never be cheaper? Okay, I imagine a world where we have those massive neural networks and maybe, they can do whatever humans can do possibly, but having robots that walk like humans costs, I don’t know, a hundred K. Whereas, for a human it costs nothing to just walk.
Katja: It costs nothing? It costs something to create a human and feed them or something.
Michaël: It costs something to feed a human.
Michaël: Yeah. You just pay someone to walk for you, it’s very cheap, right? And I imagine a world where writing code is increasingly cheap, or doing all white-collar jobs is extremely cheap, but having actual robots carrying out things for us and being very flexible is expensive. So yeah, I just think we’re going to reach recursively self-improving AI sooner than HLMI.
Katja: Right, because you think robotics might be substantially later.
Michaël: Yeah. And just like every task that seems high demand, yeah.
Katja: Yeah, and I think that seems pretty plausible that recursively self-improving AI… I don’t know. It seems like technology, in general, is recursively self-improving, so I imagine that AI is already recursively self-improving. Surely, there’s AI that is helpful for other AI but a much tighter feedback loop of that seems quite likely to happen before AI can do all human tasks. I agree. Yeah, this is not necessarily the first time that AI is important or something. It’s just a somewhat definable line.
Michaël: Yeah, and that’s something people can understand without reading a few paragraphs about what transformative AI is, or something more. People have a knee-jerk reaction to AGI. If you ask them about high level machine intelligence, you don’t use the word human or general, and maybe people have a better reaction.
Katja: I think it’s fairly hard to define any of these things in a really good and clear way that might point at the thing you want.
Michaël: That’s true. I feel like, for AGI, people say something similar to what you said, but without the cheaper part, and they say “when AI can do all economically viable tasks”.
Michaël: I think that’s a fairly useful definition and also easy to define.
Katja: Right. I feel like it has a very similar problem that the important thing might happen well before that. I think all economically useful tasks are going to be fairly similar to all tasks.
Michaël: That’s right. That’s right. We can get the recursively self-improving part way before. ⬆
Running A Survey That Actually Collects Data
Michaël: But I think, in general, it’s a good thing to write precise questions and ask people about it. Yeah, I think that’s a huge effort into asking hundreds or thousands of researchers about it. How do you go about running a survey like this? Do you just write a script that grabs all those emails and just massively email those people?
Katja: I actually asked my friend Ben Weinstein-Raun to do that and he did, so that was great. Or to collect all the emails, which, I don’t want to say was just writing a script; I think it ended up being somewhat tricky, but it was a write-a-script-based strategy, not a manually-grab-them strategy. Yeah, and then write emails to them. I wrote them emails in a number of different bouts, where I experimented with saying different things in the invitation and that sort of thing, in the hope of getting a higher response rate, because it was pretty low to begin with.
Michaël: How low?
Katja: I feel like initially, it looked like it was going to be 2 or 3%, whereas in the 2016 survey, it was, I think, between 20 and 25% or something.
Michaël: That’s huge.
Katja: Yeah but sending out some reminders and waiting and stuff got it up to the 10% vicinity, but still, I was kind of hopeful for getting it higher. I think in the end it was, maybe roughly 15% or something like that, so not too bad.
Michaël: Yeah, I think 15% for cold-emailing researchers is pretty good.
Katja: I also offered them money often.
Michaël: Money? How much?
Katja: It varied.
Michaël: In the paper, it’s very good. Offer of a thousand.
Katja: It was more that I wasn’t sure what was appealing to them, and I tried different things. I guess I initially tried offering what we did last time, which was a 10% chance of hundreds of dollars. And then I thought maybe that wasn’t actually working very well, so I tried just a guaranteed small amount of money. That sort of thing.
Michaël: Yeah, I don’t think people enjoy losing 90% of the time. But maybe the people you select with a 10% chance of winning some money will be biased towards thinking there’s a 10% chance of building HLMI. If they’re very confident, in one way.
Katja: Yeah. I guess I wasn’t thinking about anything like that. I think part of the reason for doing this is that there’s just some logistical hassle in sending someone $25 or something. If you only have to do that logistical hassle for one person out of 10, for the same expected number of dollars, that’s nice. But we ended up doing it non-probabilistically for most people, so now we have a lot of logistical hassle sending money out to people.
Michaël: Yeah. One question I’ve been seeing a lot on Twitter recently is would you rather have either 100K a year for the rest of your life or a 50% chance of I think, $3 million right now?
Katja: Wait, sorry.
Michaël: Something like 100K for the rest of your life, or a 50% chance of three million right now. I guess it depends on what income people have right now. But I guess, as a human being, you don’t really want a 50% chance of losing everything and having zero.
Katja: Yeah, that does seem what humans are often like.
Michaël: Yeah, so I kind of understand why people were not keen on taking the survey initially. But after you did a lot of work and they answered, how many answers did you get? Are these more in the hundreds or in the thousands of people answering?
Katja: If I recall, I think it was roughly 700-ish.
Michaël: Roughly 700-ish? That’s a lot. That’s a lot of people. Do you think it’s a good sample size to do useful statistics on this?
Katja: I would guess so, but I haven’t actually done the statistics so.
Michaël: Oh, okay. But you’ve done the statistics five years ago or six years ago?
Katja: I didn’t do it on that either. On neither occasion, have I had much to do with the statistics. ⬆
How AI Timelines Have Become Shorter Since 2016
Michaël: If you have the number off the top of your head or written down, do you know the average or aggregate forecast for a 50% chance of high level machine intelligence? What do people say, as an aggregate, for having a 50% chance of HLMI?
Katja: I think it was 37 years.
Katja: I think 37.
Michaël: This is a long time. This is a long time, but I also feel like you were asking for something very hard. One interesting thing is you can compare this survey with the survey you ran before, right? And you can know whether people have shortened their timelines or not, and they did, right?
Katja: Yeah, I think it was previously 2061 and now it’s down to 2059. So very slightly shorter.
Michaël: So slightly shorter. I thought that it was something like… Oh yeah, right because the time has passed as well, right?
Michaël: So six years after, or five years after is eight years less and so, it is just a two years difference.
Katja: Yeah, that’s right.
Michaël: It’s still some difference. And Gwern commented on LessWrong on the result, and I think this is a tongue-in-cheek remark, saying, “This is striking because if you buy the supposed Maes-Garreau Law, that AGI is always predicted to happen a safe distance in time after one’s retirement, then the estimate should’ve moved from 2061 to 2067. In terms of net change, they didn’t move merely the naive change of two years, but a full eight years.” I wonder if it was a one-time shock due to the past few years of scaling.
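Gwern's back-of-the-envelope arithmetic, spelled out (survey years and medians are those mentioned in the conversation):

```python
# Under a Maes-Garreau-style constant horizon, the 2016 median of 2061
# should have drifted forward roughly in step with the clock.
elapsed = 2022 - 2016                    # years between the two surveys
old_median, new_median = 2061, 2059
expected_under_law = old_median + elapsed     # 2067 if "always ~45 years away"
naive_shift = old_median - new_median         # the headline 2-year drop
net_shift = expected_under_law - new_median   # 8 years relative to the constant horizon
```

So the two-year drop in the calendar-year median hides a larger shift: relative to a forecast that merely recedes with time, timelines shortened by about eight years.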
Michaël: Do you have any thoughts on the Maes-Garreau Law, that people only predict these things to happen once they’re retired?
Katja: I do have some thoughts. I guess I don’t buy it. Partly because, I guess in 2014 or ‘15 or something, probably 2015, I looked into this. We had a big data set of predictions people had made about AI over the years, and I checked whether the Maes-Garreau Law appeared to have any support. It didn’t really seem like it in this data set. I think maybe the correlation was very slight in that direction, but it was extremely slight.
Michaël: Yeah. And I think there’s another correlation, people have made on LessWrong by looking at the data you released, saying that the more years people have been in the field, so the more experience you have in AI, the longer your timelines were. Very slight correlation, so I guess it’s the boomer correlation of the older you are, the longer your timelines are.
Katja: Yeah. I wonder if that’s because things have been faster recently, so if we’ve been around for longer… If you just average the rate over your time being around, the older people have seen slow times and recent fast times and they expect more slow times again or something.
Michaël: Yeah, your grandfather has seen slower progress in technology in general, so he just updates more slowly because he has seen more slow stuff in general.
Michaël: You also asked what’s the probability that the longer-run effects of advanced AI will be extremely bad, meaning human extinction, and I think the result was something like 5%, as I just said, so people are somewhat pessimistic about all of this. And I think what is surprising is that the level of doomerism or pessimism was constant throughout the years.
Katja: Or at least it had the same median between 2016 and now. There was this other survey in between, done by folks at GovAI, where they got an answer of 2%, I think. It seems like their question was slightly different, so I don’t know whether that affected it. But yeah, overall, it seems like it’s constant.
Michaël: Yeah, and I think we should point out that for some of those questions, there’s some conditioning going on. When you’re asking about high-level machine intelligence, HLMI, you ask them to predict this conditional on something else being true, which is human scientific activity continuing without major negative disruption. You’re kind of assuming that business goes on as usual.
Katja: Yeah, that’s right.
Michaël: What do you think would be the difference if you just asked them, “HLMI? Go”.
Katja: I guess there’s not very much difference and I think in retrospect, we probably shouldn’t have included that caveat. I don’t actually remember now what we were thinking in 2016. I could speculate.
Michaël: Yeah, I think some people put some probability on AI progress slowing down, or some totalitarian regime happening where we stop building AI in general.
Katja: My guess is that why we put it in is partly because we expected some portion of people to have views along the lines of some random “other bad thing will happen”. Like climate change will destroy everything or something-
Michaël: Or nuclear war.
Katja: … Nuclear war. I guess there’s some general principle in asking people questions that you should try and ask them things that they know about. We wanted to not be asking them about their views on other world events and to try and keep it to, how’s the AI going?
Michaël: Yeah, so you try to ask nuclear experts about nuclear war and AI experts about AI. Seems reasonable. ⬆
Are AI Researchers Still Too Optimistic?
Michaël: I guess I have a more devil’s advocate question, which is: if the 5% median is still the same, have we failed as a community to communicate to those researchers that the AI alignment problem is hard?
Katja: It seems, if the AI alignment problem is hard… I think a 5% chance of doom could be what it looks like if the alignment problem is hard.
Michaël: Oh, sorry. Yeah, you’re at 7%. You think that the 5% is pretty good.
Katja: Yeah, maybe. But I guess I also think it’s not clear that the community knows how hard the alignment problem is and it seems like there are a variety of views on it. I also don’t know that the community has been trying that much to communicate how hard the alignment problem is.
Michaël: I guess, if the 5% or 7% numbers are closer to the truth, and you wanted people to learn more about something, you should expect their probability distribution to converge, to be narrower around the number. Is the standard deviation basically the same? I know you don’t have the results on top of your head, but I guess I would be more optimistic if the answers were all around 5%.
Katja: I don’t know about that. It does seem like a good thing to check, but I did actually look at the… I guess we have these other questions. We describe something like the alignment problem and ask, “Do you think this is an important problem? Is it a hard problem? Is it a valuable problem to work on at the moment?” And I think, for all of those answers, the distribution shifted toward it being important and valuable and hard.
Michaël: I think the question is, should society prioritize AI safety research more than it is currently prioritized? I guess people can choose between more or much more and if you take the number of people saying much more, I think it’s a 49% increase from 2016. I do have the number.
Katja: Yeah. That seems like something. There are actually two different sets of safety questions, though. That is one of them, which seems pretty promising for having opinions move on these kinds of things. But then separately, there were these other questions that were about a more specific description of what the problem is, where I think the answers were also promising regarding people’s concern for safety.
Michaël: Do you remember those other questions?
Katja: I don’t know what the exact questions were, but I could tell you some of the numbers.
Michaël: Go for the numbers.
Katja: All right. For each one, I think people could put a probability on different… Oh no, sorry. They had five options for how important it is, say. For importance, the top evaluation of importance went from 5% to 20% of people who thought it was the most important. For the value of working on it today, the top category went from 1% to 8%. And for how hard it is, much harder than other things, went from 9% to 26%.
Michaël: That’s pretty convincing. You have good data to support your claims. Yeah, I feel like people have started caring more. Do you think it’s because of more AI progress, with people taking short or medium timelines more seriously, or do you think they’ve read books from Toby Ord?
Katja: I don’t know, because it seems like their timelines did not actually change very much. I don’t think it’s that one, but I don’t know what has changed the opinions there.
Michaël: Maybe they’ve started watching Robert Miles. ⬆
AI Experts Seem To Believe In Slower Takeoffs
Michaël: Okay. Another good question is takeoff speeds. You’ve talked a lot about takeoff speeds on Daniel Filan’s podcast, and you’ve tried to ask people what they thought would happen in the 30 years after HLMI, whether things would go very fast or less fast. I think the median respondent says there’s like a 60% chance that the system will be vastly better than humans at all professions and an 80% chance that it will increase capabilities dramatically. So they’re kind of thinking that slow takeoff, like a 30-year takeoff, is kind of likely.
Katja: I mean, those numbers were saying that takeoff before that time is likely, so that would also include fast takeoff.
Michaël: So at least slow takeoff is likely.
Katja: Yeah, that sounds right.
Michaël: I would give it like 95% or 99%.
Katja: That by 30 years after HLMI, things are way better?
Michaël: Yeah. It seems like HLMI is pretty impressive, right. It can do like anything that humans can do and cheaper. I just don’t know what happens for 30 years. I don’t know why people say something like 60%, like 80%, what are the other people thinking?
Katja: Yeah. I mean, I guess they could be imagining that progress in everything kind of proceeds at familiar rates. If we look at progress in chess AI, say, it seems like it took many decades to go from novice level to Grandmaster level. So if things are going at that rate, maybe once you’re past novice level, nearly 30 years later you’re nowhere near Grandmaster level, so maybe you’re not vastly better than novice level yet. It seems like you might imagine that progress would go much faster than how fast it was in the 20th century for chess AI, but maybe there is room for disagreement.
Michaël: I guess the results from AlphaGo playing with itself, going in three days from not knowing the rules to being much better than any human, without any cap, because it can play with itself, right? If the thing we build is close to a self-play agent, then we don’t have any cap like humans do, right? If the thing we build requires some human data, and we’re actually bottlenecked by compute or human data, then maybe we might be stuck for ages.
Katja: I don’t know if the argument would involve there being any particular importance to ‘human level’. It’s just that you might imagine progress overall doesn’t just go arbitrarily fast. And so if we’re measuring from when it passes human level, it’s sort of not obvious that for any arbitrary amount better, that will happen in some short period.
Michaël: I guess the basic argument is just that if you can run on some hardware faster than humans, then if you double the compute, or some hardware is able to do stuff twice as fast, then you can have double the human intelligence. So I guess the argument would be that for 30 years, people have not thought, “Oh, maybe you should add another GPU”.
Katja: I see. As in, if you would count having enough hardware that you’re running twice humanity’s worth of thinking as vastly better than human level, then it’s like, well that’s pretty easy, we just get a bunch more GPUs and probably can do that easily.
Michaël: Yeah or just that computers are not bottlenecked by working memory or speed or being in some kind of meat bag and so I guess if they can do any task that humans can do without any of the requirements we have, then I don’t really see any reasons why we should wait 30 years. But I guess those people were just people giving their rough guess at something in five minutes.
Katja: They answered these questions very fast.
Michaël: Then there was a question about fast takeoff, so two years after HLMI, and the actual answer was a 20% chance. I feel like maybe this is just what the median guy would say, 20% chance? So do you think those people have really small credence in fast takeoff, or were they just answering fast? Like, seems likely, seems less likely?
Katja: I think I don’t have any reason to particularly doubt their answer there in either direction. I think my prior guess would’ve been that they don’t have that high a chance on very fast takeoff.
Michaël: Mm, okay. So we previously talked about aggregate forecasts, right? So I guess the question is, how do you aggregate the things? You explained that you do something weird with a gamma distribution, fitting people’s distributions with a gamma distribution. So why do you do that? What’s the gamma distribution?
Katja: I wasn’t very involved in choosing the gamma distribution in particular. I think the important thing about the gamma distribution here is that it doesn’t put any weight on numbers below zero, so it’s often used for modeling how long until a thing happens, because the thing probably didn’t happen in the past. I don’t know that much about why my colleagues chose it over other possible curves for modeling this, but broadly it’s a thing that looks like there’s more probability of the thing happening in some period, and then it tapers off, which perhaps intuitively fits the kind of distribution we’re looking for here. There might have been other things going into the decision. ⬆
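A minimal sketch of the kind of procedure being described, assuming a lot: this is my own illustration, not the survey’s actual pipeline, and the respondent numbers, the scipy-based fitting routine, and the mixture aggregation are all assumptions. The idea is to fit a gamma distribution (support on years ≥ 0) to each respondent’s stated probability points, then aggregate by averaging the fitted CDFs.

```python
# Sketch: fit a gamma distribution to each respondent's stated
# P(HLMI within N years) points, then aggregate by averaging CDFs.
import numpy as np
from scipy import stats, optimize

def fit_gamma(years, cdf_probs):
    """Find gamma shape/scale whose CDF passes near the stated points."""
    def loss(log_params):
        shape, scale = np.exp(log_params)  # keep both parameters positive
        return np.sum((stats.gamma.cdf(years, shape, scale=scale) - cdf_probs) ** 2)
    res = optimize.minimize(loss, x0=[0.0, 3.0], method="Nelder-Mead")
    shape, scale = np.exp(res.x)
    return shape, scale

# Hypothetical answers: P(HLMI within 10/25/50 years) for two respondents.
respondents = [
    (np.array([10, 25, 50]), np.array([0.10, 0.50, 0.90])),
    (np.array([10, 25, 50]), np.array([0.05, 0.30, 0.70])),
]

fits = [fit_gamma(years, probs) for years, probs in respondents]

# Aggregate: average the individual CDFs (a mixture), then read off
# the year by which the aggregate probability crosses 50%.
grid = np.linspace(0.1, 200, 4000)
mixture_cdf = np.mean(
    [stats.gamma.cdf(grid, shape, scale=scale) for shape, scale in fits], axis=0
)
median_years = grid[np.searchsorted(mixture_cdf, 0.5)]
```

The zero lower bound is the property Katja mentions: unlike a normal distribution, no probability mass lands on “HLMI already happened”.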
Katja’s Thoughts On Automation And Cognitive Labor
Automation and the Unequal Distribution of Cognitive Power
Michaël: So, as you mentioned, you were not responsible for everything in the survey, and the survey is a small part of your work. You run AI Impacts, but you’re also trying to conceptualize AI risk more broadly and how it will evolve as a technology. I think this is something you have some thoughts about, and you might write a blog post about it? Do you want to go into this: how do you conceptualize AI risk right now?
Katja: Yeah, I guess lately I’ve been kind of thinking about it in a different way than previously. Lately I’ve been thinking about it as kind of like, that there are two important things happening with sort of human level-ish AI, where one of them is that there’s this giant new pile of intellectual labor happening in the world or cognitive labor. So previously, throughout history, there’s been a sort of allotment of cognitive labor per person, sort of like each person has a brain that they can do a certain amount of thinking with. That both sort of means the amount of cognitive labor per person has been pretty similar and also means that it’s distributed fairly equally. You can pay other people to do work and thereby some people end up being able to control more cognitive labor than others, but it’s hard for someone to just like sell all of their thinking time. So usually, everyone ends up with some decent fraction of what they started out with and things aren’t as wildly unequal as they can be for other things.
Katja: But with AI, there’s going to be just a fire hose of new cognitive labor and it’s sort of unclear where it will go. It probably won’t be distributed equally among people, so there could be much more inequality in using this cognitive labor. Cognitive labor is useful for all kinds of things like strategizing somewhat in contests between different people, for instance in politics. Seems like if you have a lot more thinking efforts to put into things, you can do better as well at things that are less competitive. It could be very disruptive for large fractions of it to be going to some particular narrow set of goals or either one person or one company or something like that.
Katja: I think the other thing that is sort of happening at the same time is that there are new agents in the world. So far, most of the creatures around with goals, doing things, have been humans. There are also animals, but they don’t have that much power in the world. With advanced AI, it seems like we’ll basically make something sort of like new alien minds that will be not very human-like and be able to have goals and do things in the world. It seems like these two different things will happen at the same time, which means that they’ll kind of be combined, so that it could be that huge piles of cognitive labor go to these new minds, and I think that’s where the giant risk is, which the AI safety people are most worried about. That it’s not even sort of human inequality, it’s just that all the power goes to these new creatures.
Katja: Either of the things on their own, I think, would already be large problems. And are somewhat problems that other people might be worrying about. I think, if you just had the huge amount of inequality among humans and existing organizations, that would be potentially pretty bad. Or if you just have like a huge influx of new kinds of potentially psychopathic aliens or something. I guess if they don’t have any power, maybe that’s fine. But it’s already a kind of alarming and new thing in the world. Both of these are relatively new things.
Michaël: So the basic idea is that we’ll have a new species, or new digital minds, coming that might have a lot of cognitive power and also have goals, and we need to either live with them, or all the power will go to them, or one agent might have all the power. I think there are a bunch of different problems here. What is the one you’re mostly interested in?
Katja: Hmm. I think probably the most interesting one is the problem that maybe some of them get a lot of the power. Though, I think if instead some humans got a lot of the power, for instance, if a lot of the AI ends up being not in the form of sort of agentic systems, if they’re more kind of tool AI, it could be that just some people get a lot of that. That, I don’t know, that could still be potentially bad depending on what those people are trying to do or at least be a situation where a lot of people are overpowered in some way, and it’s not very fair or good for people.
Michaël: So how is that different from the basic argument of “Superintelligence”? Like, we might have systems that have much more cognitive power than us and that therefore have much more power than us. How is that different from just thinking about systems that might be more powerful than us in general?
Katja: I think it’s helpful to me to sort of break it up into the two different things going on, which I think are separately novel. Being like cognitive labor and sort of new entities.
Michaël: So the new entities being not humans, and cognitive labor that companies could use to produce more outputs?
Katja: Yeah, or I guess cognitive labor is sort of used for all sorts of goals all the time. Like people use cognitive labor for figuring out what to do in their lives or making any kind of choice.
Michaël: Yeah, I guess the main difference is that you need bags of meat, you need like humans to do it right? So you need to ask humans to do it and they go to sleep and you need to wait. But if you can just invest all your money in like cognitive labor and capital then you don’t have dependencies on humans.
Katja: Yeah, that seems right and that seems like a big deal.
Michaël: So is that basically why you think AI is kind of a different technology? That there’s this whole part of cognitive labor that other tech before didn’t have. I don’t know, like steam engines or something that didn’t automate any cognitive labor. And now, with AI, we can automate all of this?
Katja: I think that is kind of novel. Though, it seems like the steam engine did sort of automate different stuff that humans did in other ways before. So there have been kind of analogous, exciting new things. I guess a long time ago, maybe most physical labor was done by humans and at some point, we sort of disconnected that from humans and now you can have huge amounts of physical strength without having to get any humans to go along with you and that was important, maybe to a similar extent, I don’t know.
Michaël: When do you think your job will be automated?
Katja: Mm, in 25 years.
Michaël: 25 years. 2047, hmm, seems coherent. It goes well with the prediction of a singularity. Yeah. And do you think you’re going to still be alive?
Michaël: So if people are interested in why AI is important as a new form of labor, and the different singularities, sorry, different shapes of trajectory, singularity type one, singularity type two, and the different interactions between human labor and AI labor, watch my video with Phil Trammell on the economics of transformative AI. It’s great.
Michaël: But as of right now, I do have questions I need to ask you because I posted stuff on Twitter being like, “Oh I’m interviewing Katja Grace about AI forecasting”. And I got a few questions that I find interesting and fun. So one, you need to imagine people posting stuff on Twitter. So one is just like a meme of a picture of monkeys discussing in that TV panel and the text says, “When AGI?”. So basically monkeys discussing when AGI will happen. So yeah, when AGI?
Katja: Okay, so be clear this is not the result of careful research on my part or something, this is my made-up number at this moment. 35 years.
Michaël: Just add 10 more years after Katja Grace is automated.
Katja: My job doesn’t seem that hard. ⬆
The Least Impressive Thing that Cannot Be Done in 2 years
Michaël: Going on podcasts and emailing people. Sorry, if you’ve seen Yudkowsky’s recent tweets, you might have stumbled upon one of his questions: “What is the least impressive feat that you would bet big money at nine to one odds cannot possibly be done in two years?” So basically you get the favorable odds if the thing doesn’t happen in two years, “done” meaning because of AI, not something else.
Katja: Right, so sorry. This is, what is the least impressive thing that I think there’s a 90% chance that is impossible, regardless of how hard I try or something? I guess I was a bit confused about the combination of ‘cannot happen’ and ‘90%’.
Michaël: Yeah, so 90% chance it cannot happen.
Katja: Okay, yeah.
Michaël: So I guess he tries to make it in form of odds because you would take a bet or something.
Katja: Right. I think it’s pretty hard to answer questions of the form ‘what is the least’ because then I have to think through every possible one potentially.
Michaël: Maybe just your number one.
Katja: Oh, a relatively unimpressive thing that I think won’t happen in two years, at least at 90%, is writing-assistance software that makes blogging better, such that either my blog is notably better written, as judged by someone reading it (so if they look at the 2022 and the 2024 blog, they’re like, wow, the 2024 blog is substantially better written, and that’s due to the writing tools), or it’s like 20% faster for me to write. So yeah.
Michaël: Okay, I would take the bet for the 20% faster. I think we have tools that could help you write 20% faster.
Katja: What are they?
Michaël: I don’t know, they’re being developed by companies doing natural language processing and large language models.
Katja: You think I will voluntarily use them in 2024 and they’ll be that good?
Michaël: Do you use the auto-complete of Gmail?
Michaël: I just think that by July 2024, the auto-complete of Gmail or those kinds of things for writing blogs will be insane. I don’t know, I’ve looked at your blog a little bit. It’s not always thorough research about AI. So sometimes it’s just like you blogging about things, right?
Katja: Almost always just me blogging.
Michaël: So if you blog about things, I know some people who would just use GPT-3 to automate some of their writing, like, “What are some good prompts?” Sorry, “What are some good follow-ups for my story?” If they’re writing a novel, they could even just go to AI Dungeon, have it auto-complete the thing, and get new ideas. This is kind of hard, right? But it’s still a 90% chance. I would take the odds. How much money are we betting? A hundred dollars? I would take a 10% chance of winning 1k on this.
Katja: Yeah, all right.
Katja: So the question is like-
Michaël: No, there’s COVID.
Katja: It’s fine, but to be clear, it has to be one that I voluntarily use, and I won’t intentionally avoid using one just to win this bet, right?
Michaël: Yeah, I think it’s too hard, because how would I know, right? How would I know you’re not lying?
Katja: Well, it seems like if it was, in fact, that useful for my blog, I’m not going to not use it over a thousand dollars. I have a lot of blogging to do.
Michaël: Or we could just look at how productive is your blog and how much more blogging has happened.
Katja: Well, I’ll be honest. ⬆
Michaël: Cool. So I think this kind of question leads us to the idea that forecasting is hard in general.
Michaël: So forecasting is hard, and something Yudkowsky points out in his tweet is that if you want to predict something that will not happen in two years, it’s very hard to come up with concrete examples. At least when you talk to machine learning researchers and ask something like, “Hey, what will not happen in two years? What benchmark will not be saturated by new research?”, it’s very hard to come up with one. But your research is about AI forecasting, and a question I have, which people on Twitter asked about, is: do you communicate forecast timelines and surveys to governments or policymakers? Do people read your papers?
Katja: I don’t know. I guess I don’t personally try to communicate with governments directly in some sort of directed fashion, I mostly just put things on the internet and hope that they find their way to the people who might make use of them. There are a bunch of people who work in AI governance, who I hope are more on that, sometimes I talk to them about what sorts of things they would like to know.
Michaël: The great thing is that those people have been having conversations with me, so I have new stuff on AI Policy and governance coming up.
Katja: Oh, great.
Michaël: About the things you release on the internet, do you have specific things you think people should read, like your blog, your website, or papers people should read first when looking at AI Impacts?
Katja: I’m currently working on trying to put the best case I can for being worried about AI risk up on AI Impacts in a kind of organized fashion. I’m hoping that will be a good thing to look at, but it’s not up yet.
Michaël: Hopefully, it’ll be up when the podcast is up. About your answer to Scott Alexander on slowing down AI progress, where should we go if we want to look at it? Is it mostly on LessWrong on AI Impacts or on your blog?
Katja: My blog, “World Spirit Sock Puppet”, is the most likely place that it will be and my guess is that I will also crosspost it to AI Impacts and LessWrong and the EA Forum.
Michaël: Awesome. So read Katja Grace, she’s great. It’s been a pleasure to have you, and hopefully, we can do one more before your job gets automated.