Siméon Campos on Short Timelines
Siméon Campos is the founder of EffiSciences and SaferAI, mostly focusing on alignment field building and AI Governance. More recently, he started the newsletter Navigating AI Risk on AI Governance, with a first post on slowing down AI.
In the tweet below, Siméon summarizes the main claims he made in this episode and which ones turned out to be wrong:
Here's a summary of takes from this podcast which I think are relevant:
1) If you're both technically very competent & socially skilled, you're very rare and cd probably have a very large impact.
2) A world post-aligned AGI poses big power asymmetries & misuse issues.
3) If… https://t.co/hlRgzR4jZP
Siméon (@Simeon_Cps), April 29, 2023
Note: this episode was recorded in October 2022 so a lot of the content being discussed references only what was known at the time. For instance, we discuss GPT-3 (instead of GPT-4) or ACT-1 (instead of more recent things like AutoGPT).
(Our discussion is 2h long–you can click on any sub-topic of your liking in the outline below and then come back to the outline by clicking on the green arrow ⬆)
- Why Is Simeon Working On Alignment Fieldbuilding
- Coding AIs Enable Feedback Loops In AI Research
- Technical Talent Is The Bottleneck In AI Research
- ‘Fast Takeoff’ Is Asymptotic Improvement In AI Capabilities
- Bear Market Can Somewhat Delay The Arrival Of AGI
- AGI Need Not Require Much Intelligence To Do Damage
- Putting Numbers On Confidence
- RL On Top Of Coding AIs
- Betting On Arrival Of AGI
- Power-Seeking AIs Are The Objects Of Concern
- Concluding Thoughts
Why Is Simeon Working On Alignment Fieldbuilding
EffiSciences And SaferAI
Michaël: What are EffiSciences and SaferAI?
Siméon: EffiSciences is a field-building organization for AI alignment, which aims at taking people from universities and, if they want to work on AI alignment, empowering them and helping them do that, and more generally helping them do impactful research. So that’s what EffiSciences is mostly about. SaferAI is an AI auditing organization focused on general purpose systems, so large language models, etc.
Michaël: How did you create both of the organizations? How many people are working on it so far?
Siméon: The first organization, EffiSciences, was built as an opportunity-based organization, I’d say. I had a network among French schools called ENS, and so I just gathered people to create impactful research and we started that. SaferAI was more in terms of macro strategy. I think AI auditing is one of the most promising ways to ensure that if at some point there are existing alignment tools, they will actually be implemented by the top organizations and top labs that are developing AGI. So that’s why I started building it. ⬆
Concrete AI Auditing Proposals
Michaël: What AI auditing do you have in mind?
Siméon: What I have in mind is at some point we’ll have some tools like interpretability tools or maybe adversarial-robustness tools and maybe even an alignment scheme, which is a full scheme for ensuring that your model is safe. And what we’ll want to do is ensure that the top labs such as Meta or Google Brain or DeepMind or OpenAI use these tools. And the aim of SaferAI is to be in the AI space and build some expertise and reputation in order to have the capacity to ensure that these organizations use these tools.
Michaël: Do you think in 2025 DeepMind will be audited by SaferAI?
Siméon: That’s the tail of the impact of SaferAI. It would be something like that.
Michaël: But recently you’ve been tweeting a lot about short timelines. So do you think your organization will grow as fast as progress in AI?
Siméon: So that’s conditional on fundraising, etc., but the aim is at least to grow in terms of impact at that pace. And I think organizations can grow really fast in terms of impact if they take the right opportunities. And I think right now we’re in this window where it’s very feasible to push for more auditing on general purpose systems. So we want to seize the opportunity.
Michaël: And the goal of your other organization EffiSciences is to spread efficiency and science and do some community-building and outreach for EA organizations. So your goal is to grow the AI alignment movement and the EA movement to a reasonable size so we’re able to solve it.
Siméon: The goal is not to grow the EA movement. The goal is to just take important cause areas such as alignment, but also other risks and cause areas, and bring talented people who have the required skills to work on these topics. And indeed alignment is an important part of that just because alignment is important and neglected. ⬆
We Need 10K People Working On Alignment
Michaël: How many people do you think need to be working on alignment so that we solve alignment with a decent chance?
Siméon: I think that in general significant fields of science comprise about 10K people. So I would expect that 10K people would be required. And the way I expect these 10K to be split is probably about 100 to 500 people who are very insightful and generate many ideas, and probably 9K people who are very talented in terms of engineering skills and who have the ability to do a lot of interpretability research and to implement the ideas. I know that people right now who have plenty of ideas, such as Chris Olah, are bottlenecked by their engineering abilities.
Michaël: Why is Chris Olah bottlenecked by engineering? Can’t they afford to hire more engineers at Anthropic?
Siméon: My understanding is that it’s fairly hard: you need some middle management which is technical, and that’s very hard to find. And it’s just second-hand information I heard.
Michaël: How do you find more talented management? Do you just need to hire people from other organizations?
Siméon: The problem is people in AI safety are usually pretty nerdy and not very socially skilled on average. So I guess it’s just pretty rare to have people who are both good at AI safety and good enough to manage a team. And so you just need to have the luck of hitting that intersection of two tails.
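To put a rough number on the "intersection of two tails" point, here is a minimal sketch. It assumes, purely for illustration, that "technically strong" and "able to manage a team" each mean being in the top 5% of some pool, and that the two traits are statistically independent. Neither figure comes from the episode, and the function name `joint_tail_fraction` is hypothetical. If the traits are negatively correlated, as the remark about nerdiness suggests, the real fraction would be smaller still.

```python
def joint_tail_fraction(p_technical: float, p_social: float) -> float:
    """Fraction of a pool that sits in both tails.

    Assumes the two traits are independent, so the joint
    probability is just the product of the two tail probabilities.
    """
    return p_technical * p_social


# Hypothetical thresholds: top 5% on each trait.
fraction = joint_tail_fraction(0.05, 0.05)
people_per_10k = fraction * 10_000

print(f"{fraction:.2%} of the pool, roughly {people_per_10k:.0f} people per 10,000")
```

Under these assumptions only about one person in 400 qualifies, which is the sense in which finding technical managers comes down to luck.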
Michaël: Your community building with EffiSciences will not be able to reach those people. You will reach more junior people and those junior people will not scale up fast enough.
Siméon: Actually, in some of the first programs we launched, such as the bootcamp, there were some very promising people. One in particular was very agentic and very good in AI and coding and maths, and also very into alignment. I think that it’s more something characteristic-based than experience-based. It’s definitely useful to have some experience to do some management, but if you are charismatic, I think it helps a lot and you probably just need to receive some training to do it well.
Michaël: Your organization aims to do bootcamps and sometimes conferences to inform people about AI alignment, and in the bootcamps you also aim to make people level up in AI alignment, sometimes by doing Richard Ngo’s AGI Safety Fundamentals course. Have you had some success with people reaching a decent understanding of AI alignment after a week or two of doing the program?
Siméon: We’re not exactly doing the AGISF, but something pretty close. We definitely had some success. At the end of our bootcamps, we asked them to read and do presentations on Infra-Bayesianism and Shard Theory. I think people who have a good enough understanding to be quite worried and have some ideas, but don’t know what could be the next steps to contribute in the field, would benefit from this.
Michaël: What’s the good next step?
Siméon: If you’re a very conceptual person read a lot of the literature, try to generate ideas, get a better inside view.
Michaël: A better inside view is very good. To get a better inside view, listen to this podcast.
Siméon: The third thing is upskilling for people who are more interested in interpretability so we have an incubator at the end of these bootcamps, which is aimed at providing these paths with some mentoring for people who want to keep going.
Michaël: And the goal is to target people from top universities. Your network is mostly in France. But at the end of the day, the goal would be to reach top universities in the entire world. So maybe the next step is the US?
Siméon: Not necessarily only top universities, anywhere where there are people who are able to contribute from CS and maths usually. We are probably going to launch some collaborations with some US universities pretty soon.
Michaël: How different is it from the community-building efforts that were launched, for instance, at MIT recently?
Siméon: It’s basically the same thing, except that we are doing a bootcamp that mixes theoretical stuff and applied technical skills, and currently others are not doing that. I think it’s pretty good because you get someone who’s good at alignment and who doesn’t go work on capabilities. ⬆
What is ‘Alignment’?
Michaël: We’ve mentioned ‘alignment’ multiple times. I think most people listening to this might not know what alignment is or have a very vague definition. How would you define alignment quickly?
Siméon: The simple version is, an AI is aligned if it does what its operator wants it to do. The sophisticated version is having an AI that doesn’t cause net harm for the world.
Michaël: So it’s more defining in terms of consequences.
Siméon: Yes. And that matters a lot because you can have huge misuses if you just have the first definition of alignment. There’s this problem that once people have an AGI, a personal AGI, they can probably kill hundreds of thousands of people if they want to, especially if not everyone has an AGI. And so you need to manage by some mechanisms this asymmetry of powers between those who have one and those who haven’t.
Michaël: Wait, you don’t have your personal AGI?
Siméon: Yeah, I’m so sorry. I’m old school. I still use the iPhone six.
Michaël: So when do you think we’ll be able to have our personal AGIs?
Siméon: It’s very hard because the world post AGI is a big mess. You suddenly have someone who has so much more power than everyone else. Everything is conditioned on what this person wants to do with it. So there’s a huge variance on what will happen after that point. But if I just have to say something very vague, I’d say, okay, it’s likely that within five years, in the best case scenarios, there’ll be some democratization of AGI tools. Probably five years after AGI in this conditional, so my timeline is like 2032 or something like this. ⬆
GPT-3 Is Already Decent At Reasoning
Michaël: What’s AGI?
Siméon: AGI, usually when people talk about it, is defined as an AI which has all the capabilities, and in particular the reasoning capabilities, that humans have.
Michaël: If we have a language model that is capable of reasoning as well as Siméon, that I can just ask questions, would that count as AGI or do you need the robotics part?
Siméon: No, I think it would count as AGI at least. Maybe some people would disagree, but I think that the part of “general” which refers to modality is not very interesting because it’s not what matters. Everything that makes humans much more powerful than other animals is nothing physical, it’s just their ability to build things, to reason and to have conceptual views on the world.
Michaël: So the real concept people should be focusing on is some general reasoner that we can use to automate our economy or build new tools, such as better AI.
Siméon: I expect most of the value to come from this.
Michaël: You’ve been tweeting a lot about GPT-4 as well.
Siméon: Yeah. I did tweet about it.
Michaël: So you’ve done some polls trying to ask what capabilities GPT-4 would have. Do you think it’ll be able to do some reasoning?
Siméon: GPT-3 is able to do some reasoning, so that’s a trivial question. In GPT-3, what’s interesting is that it was very good at doing a wide range of reasoning. What it’s not very good at is going in depth in terms of reasoning. One interesting behavior is, if you ask it a question and there is an obvious argument and a not very obvious argument, it will first state the obvious argument. But even if you ask it for another argument, it will keep repeating that, for instance. And so there’s this limited depth of reasoning that GPT-3 currently has, and I expect scaling to partly solve that. So I expect GPT-4 to be able to manipulate different concepts of different importance, weighing them pretty well while still not forgetting them.
Michaël: Isn’t that a good thing in the end? Because if humans were able to just keep the key arguments, we’d have much shorter debates and not record two hours of podcast.
Siméon: I guess probably. But for general reasoning, you probably need to have the ability to anticipate more subtle arguments that can have second order consequences to weigh things and come up with good strategies.
Michaël: It’s ironic that you’re saying this because you’ve done some very quick takes on Twitter about short timelines and instead of developing all your arguments, you just gave quick takes that were under 280 characters.
Siméon: That’s right. I’m optimizing the Twitter constraints.
Michaël: Why are you optimizing Twitter constraints?
Siméon: Because Twitter is just the optimal way of communicating for me. I often have plenty of ideas and I’d love to have some quick feedback on them and Twitter is the optimal way to do that. ⬆
AI Regulation Is Easier In Short Timelines
Michaël: When you tweeted about short timelines, what was the feedback you got?
Siméon: I guess I got fairly interesting arguments from some people that’s 10% of answers and then 90% is like, I don’t know, not very interesting arguments or comments.
Michaël: Do you remember any of the arguments that were interesting?
Siméon: I think there was something… There was an argument from Peter Wildeford that caused me to go more in depth into Ajeya Cotra’s report, which is a report that tries to answer the question, “when is it most likely to have a transformative AI?” I don’t remember what she uses, but usually Transformative AI is defined economically. So it says something like you have a 50% increase in GDP the first year and then doubling over the next year. Probably not exactly that.
Michaël: I think she describes it as having the same impact as the industrial revolution, an easier, maybe more messy definition. And I guess this thing you said is a possible consequence of this. So you’ve been tweeting about short timelines. What do people mean when they say short? Is it five years, two years, 20 years?
Siméon: So personally, I mean something earlier than 2030.
Michaël: In seven years.
Siméon: Yeah, seven years. And I think this makes sense because there are pretty different optimizations between five to seven years, maybe even to 10 years and 20 and more years.
Michaël: What do you mean by optimization?
Siméon: What I mean is trying to solve alignment under short timelines is very different from solving alignment under long timelines. The governance dynamics and the landscape of AGI development will not look the same. Your strategy must rely much more on existing actors and existing resources. You can’t afford to lose time in the short timelines.
Michaël: How would people lose time?
Siméon: I think on governance, there are a few things. The first thing is, affecting regulation or policy takes a long time. So these are probably pretty useless in the five-year timeline. Another thing that goes in that direction is that governments currently are not involved in the AGI race, but in 15 years they probably will be. So right now you probably don’t want to spend too many resources on governance and you want to put a lot more of your resources on corporate governance and ensure that within the main labs that are developing AGI, people are taking this very seriously and they have heard about it and they know the arguments and they will take any signal seriously. So on the governance side, these are the main differences. On the alignment side, what it changes is that you will optimize for very different scenarios. So for instance, I think that the ideal alignment solution is a theoretical solution that maybe involves something like infra-Bayesianism, which is a mathematical framework that tries to characterize things in reinforcement learning agents, to be potentially able to bound the type of actions they take. And I think that these frameworks are doomed in the five-year timeline: you probably can’t build a math theory which is sufficiently strong in five years. But in the 15-year timelines they are much stronger. And on the other hand, there is ML safety stuff like adversarial robustness, trojan attacks, interpretability. All this is a lot more interesting because in the worlds where AGI is five years away, you want to optimize for the scenarios where you have a chance to solve alignment.
Michaël: How confident are you that we can just solve AI alignment, all the technical problems, or is it about buying some time through AI governance measures?
Siméon: I think unfortunately we probably need both. I don’t know if giving it a probability would make a lot of sense. There’s a really high variance because currently there are very few stakeholders, so a big move can change a lot of the landscape. But I’d say that it’s probably something feasible. In the five-year timeline, what is costly is time. So we really need to buy as much time as possible. So potentially try to talk to actors that are racing right now and try to make them go more slowly.
Michaël: Just go and ask China, “Hey, can you go less fast? Can you just slow down so we can build AGI faster than you?”
Siméon: When you look at the field, most of the huge things that happen come from a few labs. If these labs were putting more of their resources into alignment it would slightly slow down the race and it would be hugely beneficial for alignment.
Michaël: So how do we convince them to agree on this?
Siméon: I guess the first thing is just make the arguments. The problem you have with alignment is that it’s a very tough problem and you have a lot of inferential steps to go from like, oh, AGI will be cool to like, oh, actually AGI can be bad. And so I would expect most people to not be able to make a lot of the arguments. And so you just need to have some of their time to make the important arguments. So for instance, I think that inner alignment is something which is very abstract and very confusing for a lot of people. And if inner alignment is a real problem, then that’s probably the most likely way we will die. And so you have to make the case for inner alignment. ⬆
Why Is Awareness About Alignment Not Widespread?
Michaël: What is inner alignment?
Siméon: It’s this notion that, okay, maybe a metaphor is more useful. When you go to get a job, you have an interview and usually you know the objectives of your interviewer. And so you, as an optimizer, are able to act as if you were aligned with the interviewer, as long as the interviewer is supervising you. And then after you have been hired, you can just start pursuing whatever goal was the reason why you wanted to get hired. So if the interviewer asks you why you want this job, you’ll say, oh my god, that’s amazing. That’s really great. And once you have the job, you do nothing and you just start chilling, just because they can’t lay you off or something like that.
Michaël: Isn’t this what everyone does?
Siméon: That’s common behavior among humans. And so what you expect is that advanced AI might have similar behaviors where, when you test it, it behaves as if it were aligned with your goals. And when it’s deployed and less supervised, it acts according to its real objective function.
Michaël: Why do you think people are not convinced by AI alignment arguments? Because now there are plenty of resources online, many videos from Rob Miles, a lot of introductions. You think people are just not able to go through those resources, they don’t know about those resources or they don’t have the time to read them? Or is it that there’s something about those resources that means they cannot read them: it’s not technical enough, it’s not an arXiv paper? Why would giving them the arguments in person change anything?
Siméon: First, the people you want to convince are very busy and so they don’t have a lot of time to learn, so they have to be extremely picky about what they look at. Second, arguments for alignment are usually pretty theoretical and not very grounded in deep learning currently. People who think very abstractly easily buy them, but most people don’t. It’s an interesting problem, which is not trivial to solve. You just have to make all the abstract arguments way more concrete so that some people start being able to better understand what a misaligned deep learning model looks like. And finally, the content which is being produced on alignment right now is shared in not very high-status formats. So LessWrong posts, Alignment Forum posts, YouTube videos, these are not very legible formats. And so having more literature from professors such as Dylan at MIT or David at Cambridge is very valuable to make it more legible and more readable by ML experts. Dan Hendrycks is doing great stuff in this area as well.
Michaël: So do you think the goal is to have more technical versions of those arguments?
Siméon: I think that it has huge value to just take some concepts that are explained a bit abstractly on the Alignment Forum and try to make them somewhat concrete in an arXiv paper.
Michaël: Have you seen some evidence from talking to students and friends that they receive arguments better if they’re in paper format or scientific way?
Siméon: Definitely. We did a hackathon and we had free conferences, and when we presented the conference people were receptive but a bit skeptical, and we didn’t feel like we could easily talk about existential risk explicitly because it was a bit too far from what they could hear. And when Rohin came (Rohin Shah from DeepMind), he talked about existential risks, and because he was from DeepMind, head of safety, people were like, “Oh wow, this very high-status person is talking about existential risk. He seems very smart, so maybe I should interact and try to understand better why he thinks that.”
Michaël: So the bottleneck is finding more people who are heads of some teams at some companies, and then we just ask, “can you go and give some talks at some hackathon?”
Siméon: Something like that. And also papers and just things that are serious. Just a useful mental model I have is most people don’t have strong inside view on things because it takes time to have inside views.
Michaël: In this podcast, everyone has strong inside views.
Siméon: When people don’t have strong inside views, the best replacement for that is credentials. You just trust people according to their credentials. And so once you have that in mind, it makes sense that for a lot of people, credentials matter a lot.
Michaël: What do you mean by inside views?
Siméon: Having an inside view means having a gears-level model of how something works, or a view on the causal mechanisms that underlie some beliefs. So having all the machinery which is the basis for a belief.
Michaël: And the opposite would be the outside view. When you just look at on average, how many people from top institutions say false things compared to the average person in the street. And you say according to the outside view, if you just do numbers, this person from the street is probably wrong.
Siméon: You take what someone says for granted and you just include it in your model, but you don’t have underlying mechanisms. ⬆
Coding AIs Enable Feedback Loops In AI Research
Michaël: I don’t really know how you manage to have short timelines right now. Because I’ve known you for maybe three or four years now, and at the beginning, you didn’t really consider AI alignment to be a huge issue. And maybe your timelines were 20 or 30 years, I don’t know, there was no Cotra report at that time. But right now, you’re one of the people who have the shortest timelines on the planet. What made you update?
Siméon: At first I was into climate change, so I think my timeline was unspecified. And then I got more into AI and heard more of the arguments. And in particular, what changed I think was I looked a lot more at the empirical literature. So I got a lot more views on how fast things were happening, how surprised I was at the pace of progress. So, I guess it was the beginning of me caring about timelines.
Michaël: How surprised were you?
Siméon: Things were going so fast and I played with GPT-3. The reasoning of GPT-3 is, I’d say, very good, probably on par with some young children’s reasoning. And so I was like, “Oh, I would have expected that the hardest part would have been this one rather than going from the five-year-old child to an intelligent human.”
Michaël: To be more precise, you were impressed by InstructGPT, the latest version. And I think you wrote multiple blog posts on LessWrong about the performance of those models, and you’ve shown how much better it’s become since 2020. Now it’s much better. And you’ve shown some examples of how it was presenting different arguments. So was this the moment when you decided that you would just have the highest impact in the short timeline scenario?
Siméon: There are other reasons that made me converge more and more towards short timelines. There are also coding AIs. I updated quite late because it was pretty obvious I think, if you had the strong inside view, that coding was not too hard once you have good text models.
Siméon: But I updated on AlphaCode. I shouldn’t have if I had been better. I was just like, “Oh, this is not huge. This is not rocket science, and it’s already decent.” And so if you optimize hard for coding, which we will because there are very strong incentives to provide very good coding AIs, why wouldn’t we solve it in two to three years? That’s another line of work I’m worried about, because with coding there’s a whole new world which is opening. There’s also the self-improving mechanism where ML engineers are working faster, so they develop better models, and so they can then develop better coding AIs, which then makes a loop.
Michaël: Aren’t there some diminishing returns at some point? It doesn’t matter if your AI makes you 10% faster if you’re not able to get to 11% faster as quickly. So if the increased performance is more and more costly, then it doesn’t matter if the thing makes you more effective. The question is: is it making you more productive at the rate required to keep up with some exponential growth? Otherwise it’s just, like I said, maybe a power law or something, but it doesn’t keep up with, maybe, the rate of progress in hardware. ⬆
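The disagreement here, a self-reinforcing loop versus diminishing returns, can be illustrated with a toy model. The sketch below is not from the episode. It assumes each round of better coding tools multiplies research speed by `1 + gain`, and that the gain of each later round shrinks by a constant factor `decay`. The function name and all numbers are hypothetical.

```python
def cumulative_speedup(initial_gain: float, decay: float, rounds: int) -> float:
    """Total speedup after several rounds of tool-assisted improvement.

    Each round multiplies speed by (1 + gain). After each round the
    gain itself is multiplied by `decay`:
      decay = 1.0 -> every round is as productive as the first (compounding)
      decay < 1.0 -> each round helps less than the previous one
    """
    speedup = 1.0
    gain = initial_gain
    for _ in range(rounds):
        speedup *= 1.0 + gain
        gain *= decay
    return speedup


# Hypothetical numbers: a 10% first-round gain, 20 rounds.
constant_returns = cumulative_speedup(0.10, decay=1.0, rounds=20)
diminishing_returns = cumulative_speedup(0.10, decay=0.5, rounds=20)

print(f"constant returns:    {constant_returns:.2f}x")
print(f"diminishing returns: {diminishing_returns:.2f}x")
```

With constant returns the speedup keeps compounding, while with halving gains it levels off close to where it started. Which regime applies to AI research is exactly the open question in this exchange.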
Technical Talent Is The Bottleneck In AI Research
Siméon: I expect the progress to be linear or exponential at first, before being logarithmic, and I just think that engineering will likely remain the bottleneck as long as we’re in the scale paradigm.
Michaël: You don’t think we have enough talented engineers in the world?
Siméon: My understanding is that currently most top labs are-
Michaël: Have the same management problem you mentioned for Anthropic?
Siméon: Not sure about that, but my understanding is that what is scarce is engineering ability for very large language models, because there’s no way to train them on your personal computer. So very few people have the experience, and there are a lot of accumulated tricks and experience that you need to get it right, and I think that partly explains why Meta’s chatbots suck so much. That’s just because they probably don’t have the same engineering skills as OpenAI.
Michaël: If your company has not done large language models before, they will not be able to do it now because they don’t have the past experience. But surely after one or two years they will have done some errors and learned from them.
Siméon: Probably after a few years they will have better experience.
Michaël: But maybe if they fail at doing the chatbot, maybe they just give up and go somewhere else and stop doing large language models.
Michaël: On top of just coding AIs and large language models, you’ve also updated because of how we could plug in RL on top of large language models or entire systems of large language models. ⬆
‘Fast Takeoff’ Is Asymptotic Improvement In AI Capabilities
Siméon: A few months ago, I did update on fast takeoff and some other things.
Michaël: What’s fast takeoff?
Siméon: Fast takeoff is the notion that you might have at some point a new learning mechanism that suddenly produces an asymptotic improvement in the model’s capability. So you go from a pretty straight line of improvement and then boom, suddenly you have an inflection point and you just go into the singularity-type scenario with something which is much smarter on a very short timescale.
Michaël: So what happens is that it’s just able to self-improve, or it just has a huge peak in performance across all tasks?
Siméon: If you get a new learning mechanism, you just get so much more from each single datapoint you’ve seen so far that suddenly you have lots of capabilities that make you go from subhuman to superhuman. The best example of this is when you go from genes as a learning mechanism to culture: the pace of improvement is so fast that it’s a game changer, and that’s probably why humans were much more powerful than apes: they developed this cultural mechanism, which is a much faster learning algorithm.
Michaël: So the idea for systems of large language models is that they’re going to have culture like humans have and they’ll be able to just exchange some information very quickly and thus have this group or collective intelligence.
Siméon: I don’t have any certainty that you can’t learn a lot just by exchanging prompts at a fast pace. Models already have this mechanism of in-context learning, and so if you had a long enough context length, you could just have hundreds of models that are interacting, with some models that are specialized in subtasks, some that are involved in information gathering, others that are summarizing the information, and that way you may be able to learn pretty quickly and to interact with the internet and get more experience, more data, and iterate. You could have some models that generate some proposals, let’s say of planning, for instance, and another one that evaluates and says, “oh, this is bad and you should improve”. I’m slightly worried that this can be a very fast learning mechanism.
Michaël: Wasn’t this the mechanism described in the “Comprehensive AI Services” proposal by Eric Drexler?
Siméon: I’m not sure because I don’t think he was thinking about language models in particular.
Michaël: Language models are a subcase of this.
Siméon: Potentially, I don’t have in mind very clearly what he was describing. My understanding is that he was describing narrow AIs that were interacting to build a bigger thing. Large language models will keep being pretty general, but if you just make them interact, I wonder whether you can’t have a new learning mechanism.
Michaël: So what we’ve seen so far is that it costs a lot of money to train a large language model, and possibly the front runner will have their model that they train, and possibly they will deploy it in different ways to do inference, maybe for some API. So we will probably have the same model deployed in… So there will not be some culture of different actors with some diversity. Mostly, for me, the first person to develop something close to AGI will maybe make copies of it, but it’ll be the same thing, right?
Siméon: In the context you describe, yes, that’s right. But the way I envision it is, when you have a model which is pretty good, it takes a lot more compute to train it than to do inference. And so when you train a model, you can run 10 to a hundred instances of this model. And so what you can do is potentially finetune some submodels on some specific stuff and just create a system where the model calls itself iteratively to do some tasks that are more complex than what a single inference allows.
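To give a sense of scale for the claim that training costs far more compute than inference, here is a back-of-the-envelope sketch. It relies on two common rules of thumb for dense transformers that are not stated in the episode: training costs roughly 6 FLOPs per parameter per training token, and generating one token costs roughly 2 FLOPs per parameter. The GPT-3 figures (175 billion parameters, 300 billion training tokens) are the publicly reported ones; the function names are hypothetical.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Rule-of-thumb training cost: ~6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens


def inference_flops_per_token(n_params: float) -> float:
    """Rule-of-thumb forward-pass cost: ~2 FLOPs per parameter per token."""
    return 2.0 * n_params


n_params = 175e9  # GPT-3 parameter count
n_tokens = 300e9  # tokens GPT-3 was trained on

budget = training_flops(n_params, n_tokens)
tokens_generated = budget / inference_flops_per_token(n_params)

print(f"training compute:            {budget:.2e} FLOPs")
print(f"tokens for the same compute: {tokens_generated:.2e}")
```

Under these approximations, the compute spent on one training run would pay for generating about three times as many tokens as the model was trained on. This only compares total compute; how many instances can run in parallel on the training cluster also depends on memory and serving details that the sketch ignores.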
Michaël: The team of language models is able to code by some teamwork.
Siméon: Exactly. Everyone develops sub components.
Michaël: And do you have any paper for this or is it mostly your intuitions?
Siméon: It’s intuitions. You might have a model which is capable of evaluating whether a claim is racist or not, for instance, but which is not capable of reliably avoiding generating racist things. So you can have two models: one generates, the other evaluates, and then you improve the generator model with the evaluation from the evaluator. And you can imagine this mechanism deployed for a very, very wide range of tasks. ⬆
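The generate-then-evaluate mechanism described above can be sketched roughly as follows. This is a toy illustration only: `generate`, `evaluate` and `collect_accepted` are hypothetical stand-ins (in the setup discussed, both roles would be played by language models), and keeping only the accepted samples as fine-tuning data is one assumed way of "improving the generator with the evaluator".

```python
import random

# Toy stand-ins: in the setup discussed above, both roles would be language models.
CANDIDATES = ["polite reply", "rude reply", "helpful reply", "insulting reply"]

def generate(prompt: str, rng: random.Random) -> str:
    """Hypothetical generator: samples a reply without checking its quality."""
    return rng.choice(CANDIDATES)

def evaluate(reply: str) -> bool:
    """Hypothetical evaluator: judges a reply, which is easier than writing a good one."""
    return not any(word in reply for word in ("rude", "insulting"))

def collect_accepted(prompt: str, n_samples: int, seed: int = 0) -> list[str]:
    """Sample n_samples replies and keep only those the evaluator accepts.

    The accepted replies could then serve as fine-tuning data for the generator.
    """
    rng = random.Random(seed)
    samples = [generate(prompt, rng) for _ in range(n_samples)]
    return [s for s in samples if evaluate(s)]

accepted = collect_accepted("Answer the customer", n_samples=20)
```

The point of the sketch is the asymmetry: the evaluator only has to recognize a bad output, so it can filter the outputs of a generator that cannot yet avoid producing them.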
Bear Market Can Somewhat Delay The Arrival Of AGI
Michaël: So it’s similar to what we had with augmented conditioning from Paul Christiano where you have some supervisor to create other copies. So another reason why I think people are getting more and more bullish about short timelines is that there are massive investments. We haven’t yet seen the investment top out and we’re still much lower than what GDP allows us. So is that one of the reasons why you think we might get AGI in 2025?
Siméon: So maybe bear markets will change that, but I think that typically Alphabet or Google could probably scale their models by three orders of magnitude over a few years, probably three years, just because they can throw 10 billion at a single training run if they want, if they think it’s likely to lead to AGI.
Michaël: Could they just get 10 billion in cash or…
Siméon: I’m not a hundred percent confident, I’m at 60%, but I think they have a hundred billion. There’s this myth, maybe not a myth, that they have a hundred billion that they can put wherever they want.
Michaël: If they invest that much money into just scaling current language models, we could get something that is much better at reasoning than GPT-3 or maybe GPT-4.
Siméon: When you go three orders of magnitude smaller than GPT-3, you have very dumb things, even dumber than BERT, which is already pretty dumb. If scaling laws work qualitatively, that is, if you have the same difference between GPT-3 and a 1000x GPT-3 as you have between BERT and GPT-3, I’m not sure we are still in the realm of subhuman AI. ⬆
AGI Need Not Require Much Intelligence To Do Damage
Michaël: Do you think we need to reach superhuman AI to kill humans or to have an impact, or can we just have subhuman AI or just human-level AI that can kill us?
Siméon: Unfortunately, subhuman AI, median-human AI, maybe 25th-percentile-human AI can probably kill us. And that essentially comes from the fact that the feature that makes AI much more powerful than humans is not only that you can train something more intelligent, but that you can literally copy-paste it. So if I imagine taking Michaël and copy-pasting him a thousand times, we wouldn’t need any more podcasters. We could literally have thousands of podcasts all around the world and have the best podcaster everywhere. And that would be so much more powerful than needing to retrain a human for 20 years every time to get a new podcaster. So you can imagine having an AI which becomes pretty good at maths, PhD level, and you can have a thousand PhD-level math students, so the system literally becomes probably the most powerful math person on earth.
Michaël: That supposes that you can just supervise a thousand agents and use them productively. I’m not sure you can just become 100x smarter by creating a dozen copies of yourself, or that it’s even tractable in terms of compute.
Siméon: As we were describing, if you’re in the realm of large language models, you can probably prompt them to do certain tasks and just distribute the tasks to a thousand models via prompting.
Michaël: So that assumes the tasks are somehow parallelizable or can be distributed more efficiently.
Michaël: And I think it’s not true for most tasks. If I give you some hard equation, I don’t think being able to split yourself into a thousand copies of yourself-
Siméon: I think you will be able to do some search which is pretty distributed. I think that you will need some synthesis at some point, and certainly there exist math problems such that the single system which has to synthesize the information must be beyond a certain level of intelligence. But I think that splitting helps for any type of search you might want to do.
Michaël: Is search the answer for most problems?
Siméon: A good AGI is search, good planning, and execution.
Michaël: Well, I would argue that you need good algorithms as well. If you just have a lot of search but the problem scales exponentially, then you cannot get an exponentially high number of copies of yourself. If you’re trying to solve a travelling salesman problem or something NP-complete, then I don’t know, you need to do something smarter than search.
Siméon: By search I mean a very large class of algorithms. You can search very intelligently. So for instance, better RL algorithms such as AlphaTensor are just better at doing search in a smart way, not exploring too many branches of the tree. ⬆
Putting Numbers On Confidence
Michaël: Was AlphaTensor a big update for you?
Siméon: No, it wasn’t an update at all because it was a narrow system, just a bit better than AlphaGo, but no, it wasn’t really an update. I mean it’s-
Michaël: AlphaTensor is the thing that does matrix multiplication.
Siméon: I mean it’s a more interesting problem and they developed some subcomponents that were very interesting, but it doesn’t seem like a push towards AGI particularly.
Michaël: It shows that you can get some algorithmic improvements in how much compute you use, and matrix multiplication is the core of what is used for deep learning. So even for this core thing, we can still make it more efficient. Of course there’s some nuance here about how practical it would be to implement this, but it still shows some evidence of AI helping us with those basic things.
Michaël: So when we talk about updating, we talk about updating your timelines, updating your models. We’ve been talking a lot about AGI, but I guess most people want to know what’s your 10% chance of AGI, 50% chance of AGI or 90% chance of AGI. So if you have those numbers in mind, or maybe, most interestingly, how you reason about those things, could you give us your 10%, 50%, 90% chance of it happening for both AGI and Transformative AI, if it’s any different? Or I think for me it would be more interesting to know your estimate for self-improving AI, because when we reach self-improving AI, we could get this fast takeoff scenario.
Siméon: I feel like the important part is feeling the probabilities. So 90% basically means certainty in terms of subjective feeling.
Michaël: It’s more like 99%?
Siméon: Concretely, because of mistakes that you might make, just being overconfident, forgetting factors, unknown unknowns. In general, for me certainty looks more like 90%, so 90% is probably 2100 or something like that. It’s not very relevant. Certainty is just difficult to reach. 50% is this weird number where you need to be wrong half of the time. So you need to really constrain your subjective feeling to just literally jump off a cliff and be wrong half of the time. And where my subjective feeling plus arguments land for this is before 2030, mostly because of the things I said and also because when I look at the pace of deep learning and what progress we have made within 10 years, I have a hard time believing that we are not beyond the halfway point on the path to AGI.
Michaël: So if I flip a coin half the time, we will live in a world where there’s no AGI after 2030, but half of the time every time it drops on heads, there’s AGI before 2030. So in the next seven years there’s AGI.
Siméon: I came up with this number because I updated on the fact that I have very smart friends that I respect who have long timelines. My own strong inside view is probably even shorter than that.
Michaël: This is the perfect podcast to talk about your strong inside view.
Siméon: When I just try to forecast capabilities and when I try to think concretely about what is the weakest thing that can potentially take over the world, I’m more around 2025. Mostly because I-
Michaël: In two years, right? Because we’re at the end of 2022.
Siméon: Three. Three.
Michaël: We’re at the end of 2022, so 2025. Oh, you mean the end of 2025?
Siméon: A few years. The main reason is that text is enough to have a system which is capable of taking over the world if it wants to. And I think that most of the problems that are unsolved will probably get solved in the next few years with text. So long context length is something people are working on, and it’s expected to be most likely solved by next year. Long-term memory, learning on the fly and being able to leverage what you learned in the past for your new inferences: I expect this to be solved in two years. Coding AI, like superhuman coding AI: I expect this to also happen with 60% chance within two years. ⬆
RL On Top Of Coding AIs
Michaël: Why two years?
Siméon: I guess mostly because there’s a lot of investment right now in coding AIs. There are strong incentives to get it right, as I said earlier, and we haven’t started scaling models on coding a lot. We haven’t tried Chinchilla-optimal scaling laws. We haven’t tried RL yet, at least on the biggest models. So there are some low-hanging fruits around that people will pick, and I expect this to have a very non-trivial effect, so I put 60% on this allowing models to become superhuman. Also, I think that coding is a fairly good domain because it has two properties. One is that you can verify whether a result is true or false pretty easily, so it seems feasible to build RL rewards for it, and RL is usually what allows models to become superhuman. The other is the GitHub data, which allows models to reach a fairly good base level. So I expect that the combination of the two has a chance of leading to superhuman coding AIs within two years.
Michaël: You said something about people not having tried RL. I would disagree. I think people know about RL.
Siméon: No, RL on coding.
Michaël: So something like-
Siméon: So something like AlphaCode, but with RL on top of the large language model.
Michaël: Something where you have self-play.
Siméon: Basically you define a metric which might be the length of your program times whether it solves the problem, or something like that. You try to define a well-specified reward and just make it iterate on programs very fast.
Michaël: Don’t you think people have already tried something with GitHub? Microsoft has the entire GitHub data, so don’t you think OpenAI has already trained on the entirety of GitHub?
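The kind of reward Siméon describes, combining program length with whether the program solves the problem, could be sketched as follows. This is only one assumed reading: the reward is zero when any test fails and otherwise shrinks with program length. The `reward` function, the `solve` naming convention and the test format are all hypothetical, and a real system would run candidates in a sandbox rather than with a bare `exec`.

```python
def reward(source: str, tests: list, func_name: str = "solve") -> float:
    """Return 0.0 if the candidate fails any test, else 1 / (length in characters).

    `tests` is a list of (args_tuple, expected_result) pairs.
    WARNING: exec on untrusted code is unsafe; this is for illustration only.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)              # define the candidate function
        func = namespace[func_name]
        for args, expected in tests:
            if func(*args) != expected:
                return 0.0                   # wrong answer: no reward
    except Exception:
        return 0.0                           # crash or syntax error: no reward
    return 1.0 / len(source)                 # correct: shorter programs score higher

tests = [((2, 3), 5), ((0, 0), 0)]
short = "def solve(a, b):\n    return a + b\n"
long_ = "def solve(a, b):\n    total = a\n    total = total + b\n    return total\n"
wrong = "def solve(a, b):\n    return a - b\n"

scores = {name: reward(src, tests) for name, src in
          [("short", short), ("long", long_), ("wrong", wrong)]}
```

Because correctness is checked mechanically by the tests, the reward is well specified, which is the property that makes coding a natural target for RL in this argument.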
Siméon: No. Sure. Even AlphaCode I think has been trained on GitHub. It’s just-
Michaël: I mean also the private code on GitHub.
Siméon: It’s just that they haven’t scaled models using GitHub as much. They haven’t tried to do RL on top of these models, or maybe it doesn’t work, and that would be good news.
Michaël: Do you have any paper you’ve seen about using RL to build better coding models or is it mostly intuition for now?
Siméon: Mostly intuition for now, I think.
Michaël: We haven’t talked about 10%, so what’s your 10% chance of AGI?
Siméon: Probably 2024 or something like that.
Michaël: Because you said something about 90% being certainty, but 10% is the opposite of certainty, so you’re like, I’m almost certain that it doesn’t happen before the end of 2024.
Siméon: I’m almost certain here, but yeah.
Siméon: The cognitive biases that you need to include in your mental model of how you should predict things are not symmetric.
Michaël: So if I ask you what is the 90% chance of not having AGI?
Siméon: I’m tempted to say more like 2023, but I think it’s more 2024.
Michaël: So end of 2023 and beginning of 2024?
Siméon: Something like that. ⬆
Betting On Arrival Of AGI
Michaël: A year and a few months, hopefully after this podcast I’ll be able to bet at one to ten odds with a lot of people and maybe make some money. I hope not, because if you make money then it’s going to be like if you win, then…
Siméon: I wish I don’t win.
Michaël: Do you think there’s any way to extract money from having short timelines except just investing in stocks?
Siméon: Probably if you were very confident, if you’re 90% you would take huge loans and stuff like that.
Michaël: Well, you don’t need to be extra confident because you’re talking about something in the range of one or five years, and loans or mortgages are like 30 years.
Siméon: You could probably also short certain stocks and long other stocks. So you’d probably short stocks that rely on labor a lot, that sell labor for instance, or things like that. You could probably long stocks that involve compute, or…
Michaël: But maybe the stocks that involve labor will only go down once you automate most jobs, and this will take more time than getting the capacity to automate all jobs. So between having some AGI and actually deploying this AGI to automate everyone, there’s a lag, right?
Siméon: There are such strong incentives to be better that I don’t expect it to take ages. I just think that what might happen is that there will be at first a few organizations that implement AI managers or things like that, and it will work much better. So people will start implementing that, and then those who don’t will just fail.
Michaël: The other concept is Transformative AI. Do you think it’s basically the same timeline or is it a different concept?
Siméon: It’s a different concept, but I don’t think it helps a lot. I think it’s probably similar timelines. It’s also because it’s not clear what the effect of a lot more R&D on growth is. It’s more fuzzy because you have this additional step in your model. I don’t like this concept that much.
Michaël: So you don’t think there’s additional growth from having better AI that could help you reshape the economy or having higher growth rate of GDP?
Siméon: It’s just pretty unclear because it depends a lot on your measure of inflation, and when you start having a lot of R&D, prices of everything will go down very fast. So depending on how you measure inflation, you might not count a lot of growth.
Michaël: Isn’t there a measure that takes into account inflation?
Siméon: There is, but it relies on what you count as inflation and what you don’t count as inflation, and that’s what is fuzzy. What’s progress versus… What’s improvement of the product versus what’s just an increase or decrease in price?
Michaël: If we just measure it as the percentage of our economy that is automated by AI, do you think it’s a good definition, if it’s like 90% of our economy?
Siméon: I guess that’s probably much less fuzzy. ⬆
Power-Seeking AIs Are The Objects Of Concern
Michaël: And do you think this would be the same time as AGI or this would happen later because of the problems you mentioned about deploying the models?
Siméon: Probably later. That’s right. I also think that it’s a distraction when you’re concerned about existential risk, because it’s just not such a good proxy. You might be very good at doing research, that’s Transformative AI, but be unable to take over the world just because you’re not general enough. AlphaTensor, for instance, is I think some evidence that you can build models that are fairly good locally and that can have a significant impact without being general. And on the other hand, you might have very general models that are power seeking but are not good enough yet to do Einstein-level research.
Michaël: So you’re mostly interested in artificial power-seeking intelligence, those models that could have goals to grab more power, grab more resources. And in your opinion, this is the thing that we should look out for: not just whether models are general, but whether they are motivated to grab more resources or have some planning where they could see the different paths to achieve a goal through different actions?
Siméon: I think I found that concept most useful to focus my attention on what matters most. When you think about this, you start to think about the minimal skills that are required to be power seeking. You need to be stronger than humanity if you want to be power seeking successfully at scale, you need to be good at planning, you need to be agentic somehow, so comfortable taking actions and executing them well, and being comfortable in a world out of distribution. These are some features that you can then think more about, and figure out when they happen. And I feel like that’s much better than thinking about the proxy, which is TAI, then thinking about what would be needed for this proxy, and then forecasting on that.
Michaël: In your model, when we get to something like power-seeking AI or something that’s reasoning, would we get superintelligence by default, and would we get something that could just read all of Wikipedia, copy itself onto different servers, achieve some decisive strategic advantage and be impossible to control anymore?
Siméon: I think you mentioned a few things. Superintelligence: you probably get it pretty quickly after. And there’s this other important feature, which is that you can shape the architecture of your AI. An AI can have 30 gigabytes of RAM, and Wikipedia weighs 20 gigabytes. So basically an AI can have Wikipedia in its RAM, and if you imagine a pretty smart human with Wikipedia in his short-term memory, he would probably be one of the best scientists. He could do things that no one else has done and reason very well. I think that’s one thing which is likely to be decisive in the ability of AI to quickly become more powerful than humanity combined in a short amount of time.
Michaël: Do you think we would get to observe power-seeking behavior as we make models more general, or is power seeking somehow orthogonal to-
Siméon: That’s a very good question.
Michaël: …I would guess like Eliezer said that as models get more general, they would have the same way of grabbing power that humans would.
Siméon: Unfortunately, power-seeking strategies form a basin of attraction. They’re optimal in the sense that for any objective, at some point, if you become capable enough, it’s better to just get more power, get more energy, get more resources and self-improve before exploiting. I expect that as models scale, they will become better and better at finding the best strategies, and thus they will stumble upon power-seeking strategies and potentially be able to execute them. ⬆
Scenarios & Probability Of Longer Timelines
Michaël: Other people were curious about your timelines and your short timelines, so now they have an answer, but they were also curious on Twitter about what would make you change your mind. So what do you think would make your timelines longer?
Siméon: Two main things, I guess. One is if in the next two years the acceleration stopped and most of the forecasts I made here, that is to say long-term memory, longer context and coding AI, happened to be very wrong.
Michaël: How would you know if it’s very wrong?
Siméon: If actually coding stops scaling or if the best coding AIs are not very interesting for coders.
Michaël: How likely do you think that is?
Siméon: Probably 20%, something like this.
Michaël: So I would bet at 1 to 5 odds that in 2024, I would have a meeting with you and I would look at coding AIs and you will not say, “Oh, this is not very interesting.”
Siméon: So are you saying that it’s way less than 20%?
Siméon: Okay. I guess I accept this bet, but this number was a bit pulled out of…
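The relationship between betting odds and probabilities in this exchange can be made concrete with a small sketch. The `break_even_probability` helper is hypothetical, and since the transcript does not say which side stakes 1 and which stakes 5, both directions are shown.

```python
from fractions import Fraction

def break_even_probability(risk: int, win: int) -> Fraction:
    """Probability of winning at which risking `risk` to win `win` has zero expected value.

    Expected value is p * win - (1 - p) * risk, which is zero at p = risk / (risk + win).
    """
    return Fraction(risk, risk + win)

risk_one_to_win_five = break_even_probability(1, 5)   # 1/6, about 16.7%
risk_five_to_win_one = break_even_probability(5, 1)   # 5/6, about 83.3%
matches_twenty_percent = break_even_probability(1, 4) # 1/5, exactly 20%
```

On this reading, 1 to 5 odds sit close to the 20% Siméon quoted, since a 20% probability corresponds exactly to 1 to 4 odds.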
Michaël: The probability then of the three predictions being wrong: do you require one of them being wrong to lengthen your timelines, or all three?
Siméon: I guess the problem is that they are pretty correlated. Maybe there is a reason I don’t see right now which means that these events won’t happen. So I guess I’d say a 10% probability that they are all wrong at the same time, maybe 15 even. The other thing that made me-
Michaël: Wait, so you went from 20% to 50%?
Michaël: Oh 15, right, sorry.
Siméon: And the other thing which makes me very uncertain is if there’s a huge financial market crash or-
Michaël: Like we have now?
Siméon: But right now we haven’t had a major systemic bank go bankrupt, which almost happened in early October with Credit Suisse. If that happens, I think that could be very bad, and it could probably delay timelines.
Michaël: By how much? If tomorrow there’s an actual bank going bankrupt.
Siméon: It’s hard to say because it depends on the financial crisis, but potentially one year, two years, I don’t know. If a huge bank goes bankrupt and there’s a very, very significant crash and VCs don’t have money anymore to put in scaling up.
Michaël: Because there was the Covid, let’s say, recession in 2020. I don’t know if it was a recession, but there was…
Siméon: A very small crash.
Michaël: Very small and it went up again. Did that delay timelines a little bit or?
Siméon: No, I think Covid delayed timelines a bit. There were supply chain problems with compute. And the second thing is if there’s World War III or something else like this, which would obviously be terrible. Then people may stop caring about AGI.
Michaël: Are you secretly vouching for a World War III?
Siméon: No, I’m not.
Michaël: I tried to make you say something controversial. Another question we had on Twitter was how did you form your views about alignment and timelines? How did you shape your reasoning to get where you are today?
Siméon: Usually the way I form inside views is that first I get into a field. I try to get some information, usually by reading. Then I form insights and try to make hypotheses, and then I talk to someone, correct these hypotheses and update based on the new information and feedback I get. And then I repeat, so I reread things, generate new insights and talk to someone. On alignment, I did that with a PhD student from UC Berkeley who was very well read on the alignment literature, and that was great because once you reach the state of the art of the debate, you have a pretty detailed inside view and you are still able to generate new insights on what’s plausible in this debate.
Michaël: So investigating, it’s like you read about AI alignment or the state of the art and then you try to come up with the next steps or some questions you have about those problems and you have this other person criticize your thoughts and be like, “Oh, this is wrong.” And then you update and you read stuff again.
Siméon: Rather than a question, it’s more of a hypothesis.
Michaël: And right now you’re doing the same on Twitter, where you just make tweets and if people criticize your tweets you just update directly.
Siméon: There are areas where I’ve heard things that I’m not a hundred percent sure about. And then update based on what people say. ⬆
Michaël: One way in which the difference between having long timelines or short timelines will make a difference is for AI regulations. Originally AI governance plans could take much longer. So what are plans you consider that you think could only work if long timelines are true? Which I think you mean more than 2035 or more than 2040.
Siméon: What’s hard about long timelines is that governments are probably in the race. Coordination between corporations is almost impossible because there are probably many corporations that are racing. China is much more likely to be leading. So these are, I think, some of the features of long timelines that are worth having in mind, and thus the challenge is trying to make sure that China cares about safety. That’s a really tough thing because doing anything in China is not trivial, even more so in the current climate. The second thing is that governments are in the race, and the very hard thing about governments is that they basically don’t have the same language as alignment people. They have a very, very different ontology, so it’s very hard to explain to them why people are worried about this, and it’s a very tough challenge. Also, governments have a very bad track record at coordinating with other governments, so it’s just very hard to solve this problem.
Michaël: So do we need to have governments cooperate with each other or just implement regulations at the international level or at least have the key players, let’s say US, Europe, China, Russia, to agree on basic things?
Siméon: Not basic things, but we have to get them to agree on things if you want to avoid the strong race.
Michaël: What things? Like please do not build general systems?
Siméon: “Please implement XYZ safety mechanism on your hardware or please red team your model a lot. Please do interpretability when you’re training your model.”
Michaël: And don’t you think there’s a chance of them Goodharting those measures? So just doing the minimum. You mentioned that you were into climate change before. Have we seen any benefits from COP 21 or those things where China tried to implement a long-term plan for reducing CO2 emissions, or did they just do the minimum and not much happened?
Siméon: On average, I expect most people to do the minimum, but still the minimum is very positive for COP 21, for instance.
Michaël: Does it put us into a position where we don’t have terrible consequences from climate change in 20 years?
Siméon: Okay, that’s a whole other debate, but it depends on technological progress. Many more governments care now, and they made significant investments that increase the likelihood that in 20 years the world will still be okay with regard to climate change.
Michaël: But it took them 20 years or more to agree on those things.
Siméon: That’s what is very difficult with long timelines is that you have to solve an unusually hard coordination problem.
Michaël: Wait, it’s hard in the case of short timelines, but it’s also hard in the case of long timelines, because even if timelines are thirty years long, you still need to solve this problem.
Siméon: I think governance is easier in the short timelines than the long timelines.
Michaël: I don’t get it.
Siméon: I think that in short timelines, the governance actors that matter are corporate actors. And what’s cool about these people is that their ontology is much closer to yours. So you can talk to them and potentially share your model of why alignment is difficult. And you have basically four or five top labs that are most likely to develop AGI, so you just have to coordinate these five actors. They can move fast because they’re corporate, they understand your language, you can talk to them. So that’s why I think that it’s slightly easier than having five major governments with millions of people inside them. They don’t understand your language, they hate each other.
Michaël: In some way, having shorter timelines is good to reach some cooperation and maybe some international treaties or between different actors to reach some agreements, but it’ll lead to some much harder technical problems because we want to have a lot of time to solve alignment. So maybe something like, “Hey, let’s just not build AGI,” and there are five actors. It seems like in our community there’s a clear divide between people with longer timelines and shorter timelines. Do you think people with longer timelines are just wrong and have just not thought about all the arguments you mentioned before?
Siméon: I wouldn’t frame it that way. I think they’re wrong, otherwise I wouldn’t have the belief I have, but as I said, I updated a bit on the fact that I know very smart people with whom I often talk who have much longer timelines than I do. I think the main crux is usually how much weight you give to current progress and to a very inside view of things, that is to say, this capability might lead to that, and so we have capabilities that make timelines pretty short, versus people who are more like, let’s look at compute, let’s look at the general trends, and that seems much further away than five years or something like that. I think that the fact that at least most of the funders that are interested in alignment have long timelines is fairly bad, because basically there are nontrivial trends that will die because of that. Basically because, as I said, the optimization under long timelines and short timelines is not exactly the same.
Michaël: When you mentioned optimization is more like trying to have the highest impacts with your career or taking the most impactful actions given possible outcomes.
Siméon: I mean as a funder, what should you fund to maximize your impact?
Michaël: And on a personal level, if you’re trying to have a career in governance and maybe become a policymaker or influence policymakers, is there any career plan where you can have an impact under both short timelines and long timelines? Can you take your camp and just play the 20 years game?
Siméon: Definitely. So I think it’s not available for everyone, but some interventions I’m excited about are cultural interventions, and what’s cool about the policy world is that currently it is the only one which is thinking about risks. So there’s the NIST risk management framework, which is a risk management framework being developed for AI. And there is the EU AI Act, which is a regulation to regulate AI. These two lead to some debate around risks, and that will probably have a lasting impact on corporate governance as well and on how people think about risks. And I think that as a policy person or governance person, you can both build career capital, to be in a good position in the government, and have this cultural theory of change where if I talk correctly about risks right now, if I mention power-seeking risks and these slightly longer-term risks than what most people are worried about, I might actually have an impact on the way people generally think about the problem and thus potentially on corporate governance people, and thus on what actions AGI labs will take if at some point there are problems.
Michaël: If the timelines are as short as you mentioned before, pushing for some EU AI Act might just distract us from the actual risk, and maybe governments will have a wrong understanding of what the risks associated with AI are and not understand the existential threats. And if we try to convince them later about some more existential risk, they’ll be like, “Oh no, I don’t understand this. People are concerned about other things.” So maybe just everything is a fluke and you need to focus on the actual message you want to pass. If timelines are three or five years, maybe you just want to go and talk to Joe Biden about it and you’re like, “Hey, have you heard of AGI?” I don’t know, maybe there are two camps. Maybe the longer your timelines are, the more you should envision some smoother transition from the EU AI Act to AGI, and if timelines are very short, maybe just go abruptly.
Siméon: I’d say that anyway, abrupt messaging usually doesn’t work. I think there are definitely potentially different framings according to how worried you are about the short term, but I still think that without losing too much social credit, which you would obviously lose if you were like, “AGI,” you can still do a good job at giving people intuitions for why advanced systems might be bad. ⬆
Compute Governance Seems Relatively Feasible
Michaël: So you mentioned compute governance and I think this is some promising path to impact, at least in effective altruism or people thinking about how to regulate AI. What is compute governance and why is it so important?
Siméon: Compute governance is this idea that if compute is so important in AI progress, maybe you can leverage compute hardware as a means to cause actors to implement safety measures. And maybe by putting frameworks around how compute is distributed, you might push for more responsible behaviors. And why it’s so important is that compute is unusually good for doing that, because currently the compute supply chain is very monopolistic. So at several nodes in the compute supply chain, you have one big actor which has most of the market. And why is that good? It’s because if you want to force downstream actors to implement safety measures, the best way to do that is to be a monopoly. Because if you’re not a monopoly and you put in place some measures that are constraining for the downstream actors, then your competition will just say, “We don’t do that, you can come with us and you’ll just move faster.” So what’s good is that in the compute supply chain right now, there’s ASML, which is the single company producing the machines that allow you to produce very small chips.
Michaël: It’s a company based in Netherlands?
Siméon: Netherlands. Right.
Michaël: So do manufacturers in China use products from ASML?
Siméon: Definitely. It was even the biggest customer for ASML.
Michaël: So it’s like they build machines to build other machines?
Siméon: Potentially you could leverage that debate. And then you have TSMC also, which is assembling the chips, which is like 60% of the global market, and then there’s Nvidia, which is pretty dominant in the high-end GPU market. And all these nodes potentially, if they wanted to, could suggest safety measures and could have a decent chance of getting them accepted because their customers need their product. And so there could be potentially here a public-private partnership where some public actors push these actors to try to make their customers safer and limit the risk, leveraging their market power.
Michaël: So who would tell TSMC to only give compute to certain actors? Who would be the influence? Would it be some other actor, the US government or, I guess, the Chinese government? Who would actually have the power to tell them what to do?
Siméon: You often need actors everywhere. So I would expect potentially one big company which is caring. Like OpenAI wants to show it cares and it actually cares more than most other labs. And you could imagine having the OpenAI governance team saying, “Yeah, we should probably put a switch on GPUs that allows you to turn them off.”
Michaël: Do you think it’s realistic for them to accept those things?
Siméon: There’s something like 30 to 40 percent chance. It’d be realistic.
Michaël: I disagree. I think it’s lower than this. I don’t see. I have a hard time imagining a world where TSMC is like, “Yeah, I’m going to put some off switch on my own GPUs.”
Siméon: TSMC I think is not the most likely, but ASML or Nvidia might be more likely. I expect that the world in which we succeed is a world where we have one big company, some researchers from academia, one country which is on board and probably some non-profits around that are pushing for that.
Michaël: At what point do we start producing chips with off-switch, remote-shutdown features? If we want to convince them before 2025, this whole negotiation should happen by 2023 and maybe we get those chips produced by late 2024, in the best case.
Siméon: I think the switch for instance is probably not too hard to implement. If you’re back onto 2025 timelines, we’d need to move very fast on that, which is not completely implausible, but quite implausible.
Michaël: Something people are excited about when you mention governance is not only thinking about this whole supply chain and trying to get them to have some requirements on the hardware, but also some requirements on the AI part: asking models to be more interpretable, to have some metrics that show that they’re doing the right thing. Are you bullish on just asking Google Brain to have some interpretability measures for any model that requires more than 10 to the 15 flops or something?
Siméon: That’s definitely something I’m quite bullish on. So as you said, you define probably a level of compute and above that it could become a norm to run some additional safety tests compared to what we do. And I think once again, the version of that I’m most optimistic about is a version in which people are doing this voluntarily just because some very high status organizations started doing it and implemented it as a norm. One antecedent, one example of that, is DeepMind starting to do their section seven on ethical impacts and bias.
Michaël: What’s the section seven?
Siméon: So in their papers, their new papers on state-of-the-art models, they started systematically putting a last section on bias and ethical harms of these models. And just after that, a few other organizations also put similar sections at the end of their papers.
Michaël: Isn’t it a distraction from actual alignment concerns?
Siméon: I won’t dive into that. I think if I weren’t worried about alignment, I would be very worried about this. So just as a weight, I’m more worried about alignment, but I think it’s still a fan of even when you don’t buy alignment. The point I was making is more that top organizations have potentially an ability to implement norms and these norms can be used by other actors as well. If some organization started saying, “For state-of-the-art models we always run interpretability checks and red teaming before deploying them,” that could be followed by other organizations.
Michaël: And I think it’s a good stepping stone towards more alignment measures. Maybe at the end we could have a section eight about alignment.
Michaël: Is compute a good metric for governance when new models can become more efficient and use less compute? So maybe something like AlphaTensor were to use less flop for the same matrix multiplication. If we say your model cannot use more than 10 to the 20 flops and at the end we discover a new architecture that is capable of running on 10 to 15 flop and they just have AGI who’s out of compute requirement?
Siméon: I think compute will keep playing a role. You can probably always do more with more compute, but it might not become the main driver.
Michaël: It seems like with scaling laws, people are finding ways to scale more efficiently, or in a compute-optimal way, and there have been new discoveries this year, so I wouldn’t be surprised if we discover better ways of scaling in the next few years.
Siméon: I agree, but still you can… I agree that it may not become the main way of scaling more, but I just think that more compute will often lead to more capabilities, even if it’s not the first order mechanism anymore. ⬆
The Recent Ban On Chips Export To China
Michaël: So recently we’ve had some ban of, I think it’s exports of GPUs, so Nvidia cannot export GPUs to China?
Siméon: It’s a bunch of things. A ban on some critical semiconductor equipment, and American employees are forbidden to work on anything related to semiconductor supply chains in China.
Michaël: So do you think Joe Biden gets the compute governance game and just trying to limit all the action from China?
Siméon: I guess he’s not thinking that way, but he’s probably thinking about compute seems important. China is struggling to reach our level, so let’s kill its ability to keep up with current technologies, just killing its semiconductor industry.
Michaël: Isn’t it like a risk in the long term as it’ll increase tension between different countries?
Siméon: I’ve only read about it a little bit. It looks like it’s a really huge move and people tend to say it will have very long lasting consequences. Some people were hypothesizing that it’ll strongly increase the pace at which China will be willing to invade Taiwan.
Michaël: What do you think is the chance of China invading Taiwan?
Siméon: I’m like probably 20% by 10 years or something.
Michaël: So pretty low.
Siméon: Pretty low, but I didn’t really update on that, I didn’t take the time to update on that.
Michaël: So this would be in the case where they’re not able to catch up with high end GPUs hardware, like H100. Do you think there’s a way in which they could still get H100 from the other distributors or build their own or maybe just buy other hardware? I think Huawei has SN110 that is comparable to maybe older GPUs?
Siméon: My understanding is that it will be very hard because of the properties I mentioned. That is to say that ASML is a monopoly and it’s based in Europe. Nvidia is in the US. TSMC is in Taiwan. And so I would be very surprised if the chip, like the GPU you mentioned from Huawei, required none of these components, and basically it necessarily requires ASML machines at some point.
Michaël: What’s the incentive for ASML to agree on all the US measures?
Siméon: They have no incentive.
Michaël: Agree on all the US measures.
Siméon: They have no incentives. If you look at the story around this, there’s always like ASML CEO, “Please can we sell to China” and the US knocking at their doors every six months. “What’s the deal with China? Are you trying to sell to them?” Like, “No. We don’t do anything.” Over and over again. The US is very caring about this. They really put a lot of pressure on ASML. Like ASML is trying to do things but they can’t.
Michaël: So is the idea that if ASML doesn’t agree, then the US would stop buying stuff from ASML?
Siméon: The chain of command is probably Netherlands willing to keep very good relations with the US and the US putting a lot of pressure on the Netherlands to put a lot of pressure on ASML, and this ASML being dependent on Netherlands, not willing to disobey.
Michaël: Right, so they’re dependent on the Netherlands and not the US?
Michaël: Then the question is about pressure balance between China putting pressure on ASML versus the US. So if China demand increases more than the US demand and they say, “Hey, we can pay for twice as much.”
Siméon: But I think they won’t be able to have a lot of bargaining power there. ASML is currently very unlikely to move on China. On the China side, the current state of affairs is that ASML is allowed to sell machines that are three generations old. Still, China is their biggest customer and that’s why they were very worried about losing this customer. I think China is like 50% of their revenue or something like that.
Michaël: How can they accept to lose 50% of their revenue? How is it even economically possible?
Siméon: I think currently they don’t lose these revenues, but at some point they might, but they are very profitable. They do huge margins because they’re in the monopoly.
Michaël: They don’t lose all the revenue because they’re still selling the older versions?
Siméon: That’s right. ⬆
AI Governance & Fieldbuilding Were Very Neglected
Michaël: Since those new measures from the US, people have been getting more excited about compute governance and AI governance in general. The other intervention we discussed as well is doing AI alignment research to make sure that our systems are more aligned, and those governance projects are just buying us more time or making sure the correct actors are in the lead. Other interventions include having more people interested in AI alignment. Maybe it’s through watching some podcasts or some universities, and you’ve decided to do the community building or just spreading AI alignment ideas through auditing or the bootcamps. Why do you think this is the most high-value thing to do and why did you decide to do it?
Siméon: Governance and more fieldbuilding. Both were very neglected, I think. We haven’t tried hard to share the arguments in an understandable way, convincing way. And so I’m just very excited. I think we should try much harder to share this idea with existing actors and also bring more people in the field and ensure that people who are likely to work in deep learning have heard about this, have taken it seriously. So that’s why I’m working in these two areas.
Michaël: What would it look like if people in deep learning took this seriously, would they change their career because one can buy the arguments but not want to change their career. There are not enough jobs or the market is not efficient in a way where you can just quit my job and get the same pay at some alignment organization.
Siméon: I think 50% of the people who went through our bootcamp will likely do work related to alignment.
Michaël: I think that’s very optimistic.
Siméon: I don’t think it is. I just think it will strongly decrease with the marginal bootcamp because here we just picked the low-hanging fruits of people who are both very altruistic, very motivated by the topic and good enough in maths and CS to go into alignment. We had 25 people in the bootcamp and I think that about 12 of them will likely work in alignment. I think that there will be decreasing marginal returns. So the next bootcamps will be less promising than this one. But I just found that people who are curious are happy to engage with the topic and understand what might be right, what might be wrong. And some people, especially those who go into research, are thinking deeply about how to do good. And so if you provide them with some ways that seem to be both interesting and impactful to work on, they will probably take it.
Michaël: So is the idea that we just need to give them opportunities in terms of learning and credentials and money?
Siméon: Yes, definitely. So why people go to DeepMind? Because it’s exciting, because there’s money, there’s a good work environment and there are smart people over there.
Michaël: I think it’s pretty hard to achieve all those three goals at the same time.
Siméon: I think it’s feasible. We could have the money. Alignment problems are fairly interesting.
Michaël: I guess the problem is that there’s no revenue. I think the problem is that there’s no way of generating revenue through alignment.
Siméon: But DeepMind wanted to become a nonprofit. DeepMind is not optimizing for revenue. You can have funders and there are funders who pay a lot of money.
Michaël: But it’ll be less than the AGI companies that show that they can build products that would have a huge impact. So the funding situation if you do a nonprofit would be much worse than if you’re just creating AGI.
Siméon: I agree that it’s not as good but DeepMind is the biggest lab by far and its expenses are 1 billion a year. If you only take people who self-identify as EA, we could already have 1 billion a year. And if you take additional people who may be interested, you can probably reach more.
Michaël: Maybe the idea is that they’ve proven that they could be useful for Google and they’re part of Google and they’ve done this work on AlphaFold and other things where they’ve proven to be profitable in some way. If they were just doing weird AGI research that had no impact in the world whatsoever, they would not be able to spend a billion.
Siméon: Not profitable, but they have still 700 million net loss or something like that. But they are expected to become useful in the long run. But what I mean is that if you care about what the foundations and big funds are optimizing for getting as much impact as possible. And so if you optimize for that, why couldn’t you spend something of the size of the magnitude of what Google is spending on DeepMind if you had the talent.
Michaël: So do you imagine SBF investing one billion in these new alignment companies that would just recruit some high-end deep learning engineers?
Siméon: I would love that. He does that pretty quickly. You can do that overnight, but if you give a hundred million to a company and they do well, then you can give a bit.
Michaël: Are you creating this company?
Siméon: I’m not, but I’m trying to meet people and potentially to introduce people who could potentially build a huge AI safety organization which would be relevant. ⬆
Students Are More Likely To Change Their Minds About Things
Michaël: You expect that we need something like 10,000 people working on alignment so that we can be productive and have a decent chance of solving it. And so that’s one of the things you’re trying to achieve. But maybe, I guess my take would be that you’re too optimistic about our ability to reach something like 10,000 people and that we’ve only scaled from a hundred people working on alignment to 300 in let’s say three or five years. And this is much lower than the rate of growth in AI in general. So aren’t you just biased in what you do and you just think it’s possible because you’re doing it and maybe just not tractable?
Siméon: Look, my question is how hard have you tried to get people in the field? Currently most people arrive by literally having to do a marathon: they first dropped out of school, that’s the first step. And then they self-studied for a few years and then they were reading some obscure blog post on LessWrong. And then if they’re good enough they find some random grants somewhere and start working on it on their own. That’s the typical path. If you create pipelines where people are happy to go in that direction because they feel it’s meaningful and they feel like there’s not too much risk going in that direction, you can just be as competitive as DeepMind and even potentially more because DeepMind having enough resources to spend a lot on upskilling people.
Michaël: But that would require people to care about saving the world or be altruistic or just simply buy the arguments. So there are many probabilities that you multiply when you consider people going in that direction. And from my experience talking to ML researchers, even if you send them the correct resources, there’s some chance that they will not agree with the arguments or some premises and just have a prior on optimism that is different from yours; maybe their prior on things going well by default is just much higher. I would say that maybe this prior on optimism is something you cannot really change.
Siméon: So I think here we’re talking about two different things. One is students and students are very open minded because they don’t have their identity which is tied to a job or to certain things they have defended for years.
Michaël: So trying to brainwash students because they’re the most easy to influence?
Siméon: Not brainwash, just they will be exposed by default to some arguments, you wish they were exposed to the best arguments if you think something is the most important and so you just expose them to that. If they think it’s important then you help them entering the field and then there are more senior researchers and there it’s much harder because imagine I was coming and telling you Michael, what you’ve been doing for the last five years was extremely negative and meaningless. Now we need to work on that. That’s difficult to buy. So you need to have an unusually open mind to update and say, “Yes, oh yes, I’m a very bad person, let’s change what I’m doing”.
Michaël: I would argue that those people are the ones that are most likely to have an impact. You want to have people to work on AI alignment theory or at least algorithms, and if you want to make progress, I don’t think a couple of fresh grad students will be able to solve anything. And if you can’t convince the key players with the most experience, people capable of training large models, I don’t think you’re going to achieve anything. Conditional on short timelines, I don’t think doing any community building actually helps with anything.
Siméon: I think in five years you can definitely do useful things, even more in eight years because you can probably create pipelines that take two years between someone entering and someone being able to contribute meaningfully. The point I was making is just why it’s way easier to talk with young people and change their mind about something, especially when it involves potentially their future job.
Siméon: So on the senior researchers, my view is that there is a distribution of people who are more or less open to changing their mind. And I expect basically 20% of people or maybe a bit more to be able to change their mind meaningfully. And I think that for each additional person that you get, you also get some people who just are vibe following, are following what seems to be trendy or what others are thinking. And I think that each marginal percent of people you get increases more than linearly the number of people you might get interested who defer or just thinking improperly about these topics.
Michaël: So it’s the idea that basically if you start with the people who will change their minds quickly, who can reason from first principles, then you will have other people who are just following, and those people will come if you have a critical mass of people. Similar to what happened with climate change: some theories predict that when most people had heard of these arguments and the elite or the people who actually think carefully about things had the same belief about climate change, then many people started following the movement because it was trendy.
Michaël: So you mentioned that half of the people in some of your bootcamps were thinking about changing their career towards more alignment-ish jobs.
Siméon: More were thinking about this but I expect just half to actually end up doing it.
Michaël: How did you measure it? Did you just give a feedback and be like, Hey, how likely are you to just change your career?
Siméon: I guess the best proxy seems to be after the bootcamp how motivated they were in keep doing projects on alignment and keep upskilling. And currently there are about 15 of them who are upskilling, 16 probably. And I think that among these most will end up doing something in the field.
Michaël: You think there’s some sampling bias problem where, as you said, it was only 25 people and this doesn’t scale because it just looks like very altruistic people?
Siméon: Definitely. And that’s why I think it’s way more promising to do one such bootcamp in most countries than 10 such bootcamps in one country, because there are strongly decreasing marginal returns. But I think that basically in every new place you do this bootcamp, you’ll probably find some people who are both very motivated, very good and just willing to update on some arguments.
Michaël: Do we just copy a new Siméon Campos in different country with perfect voice transcription for the new language or do we just train new people to do it?
Siméon: The plan is to train new people to do it and to share the recipe. ⬆
Bootcamps Are Better Medium Of Outreach
Michaël: I would say that it’s not as easy as you think to convince people of AI alignment. Robert Miles on YouTube has explained all the core concepts to millions of people. Yet we’ve only reached something like 300 people working on the problem at best. Don’t you think it would be way harder for you to explain the arguments than Robert Miles? It should be hard to do a better job than Robert Miles on explaining alignment concepts?
Siméon: Okay, so two things. First, there are a bunch of confounders. Basically, Robert Miles is one very little piece in the pipeline. So by default you can’t expect someone who does outreach to naturally create new people. So measuring the number of alignment researchers is not a good metric to measure the impact of Robert Miles. You could be more pessimistic if Robert Miles had tried hard to create bootcamps and upskilling programs after people viewed his videos in order to become alignment researchers.
Michaël: So I guess my claim is that if people were able to change their minds through talks or conferences or bootcamps, then if they had the best five-minute explanation of a concept or 10-minute explanation of a concept, they would update. But because his videos have so many views and they’re the best explanation of some concepts of alignment and we have so few people working on this, I would expect that it’s hard for people to update, and if you give some talks or conferences, it’s not going to be the best explanation because you will have to do it live, to prepare your talk and give it in front of people. So it would not be the best explanation of a topic.
Siméon: I think there are a few things here. First I think that these bootcamps provide people with 24 hours of content, which you can’t even reach if you watch every single video from Rob Miles. They also provide you with someone who answers your questions. So for instance, there were some people who were thinking from first principles in the bootcamp and had a lot of concerns and I was able to take their concerns seriously, show why it was a good intuition, and then answer why I thought differently. And last, I think that I’m generally fairly good at explaining things to specific people because I know how to adapt myself very well to context and I expect that this plays a bit in the returns of this bootcamp because for instance, there was someone who was very skeptical and the default when a teacher explains to someone who is very skeptical is to be very dismissive of his claims. I think that’s a very bad strategy. You need to show that you’re fairly good, that you understand his claims, that potentially they are good, but still there are reasons to think differently, and there are a few things that people are on average not very good at and I feel like I’m slightly better than average.
Michaël: So there’s a way of saying things to people that require being in person and saying that they’re correct to have certain beliefs, but there is some bias in their reasoning or there’s a way in which they could think better. And so there’s a way of talking to people that you think is better in person.
Siméon: Basically. And I think the rationale behind this is just that if a random guy comes to you and tells you you will die within 10 years from something you have never heard about, it’s normal that you have a very low prior. And so you have some objections that are more or less good objections, and explaining why they’re good intuitively given the level of knowledge of the person is a good way to build a trust relationship, because I first went through these steps before buying the alignment problem. And yeah, it’s just accepting that people may have these objections and not being dismissive with regard to them. And that’s mostly allowed I think by one-to-one interaction and by paying attention to these things.
Michaël: So you just come and say, “Oh yeah, I’ve been here before. I know this sounds weird”. And you show your higher status as a student from one of the most prestigious universities in France and you’re like, “I’m the same as you guys, but here’s a cool argument”.
Siméon: Let’s be more concrete because here it’s a bit too abstract. For instance, there was a person who was really strongly reasoning from first principles and she was very skeptical about inner alignment, and the default behavior of some other teachers of the bootcamp was to be very dismissive. ‘No, you don’t understand because you haven’t learned enough.’ Actually what I emphasized is my uncertainties about inner alignment and why it was very fuzzy. And then I still said that this asymptotic trend, that planning on the long term is optimal for a maximizer, led me to believe that at some point we would reach that point of mesa-optimization. But just first explaining why the person’s intuition is totally normal and is actually a pretty good intuition, and then saying why I’m convinced otherwise, is important in building trust with smart people. Because smart people don’t want to defer. They want to have a first-principles understanding, and so if you just tell them, “No, you’re wrong, you’ll just learn more after”, they can’t update their model based on what you say.
Michaël: But this doesn’t really scale because you’re able to talk to people in France and you spend a few years learning about alignment. If those people learn about it for a few months, would they be able to just teach it to other people? How long would the training take for them to be able to teach other people?
Siméon: I think motivated and smart people learn in a few months at least all the theoretical-
Michaël: Is there any other Siméon that was able to learn it in a few months?
Siméon: I didn’t learn it in a few months, but I know some very motivated people from the bootcamp actually who are very rapidly upskilling on everything. ⬆
Concluding Thoughts
Michaël: Do you have any other last thought to share to people about what you’re doing or why they should care about short timelines? Like a final word for YouTube, Twitter, the outside world, internet?
Siméon: What you should really strongly consider is that alignment is a very general problem where you need to solve both the technical and the governance problem. And basically whatever your discipline or subject is, you’re quite likely to be able to contribute if you’re good in your subject. So for instance, operations people are scarce, they’re important for most organizations, so they matter a lot. Communication: currently, everyone is bad at communicating alignment. Most posts are not really shareable, not very knowledgeable. So these are important skills. Even biology might matter. So for instance, some people such as Buck Shlegeris are interested in evolution as a case study for optimization algorithms. If you care about alignment or governance and you want to solve it, don’t think that by default you can’t. If you are in a not very intuitive area, I’d be happy to chat with you if you wanted to about potentially what could be a good fit for your skill set.
Michaël: And what’s the best way to reach you?
Siméon: Probably an email, email@example.com
Michaël: Send a lot of love to Simeon. He’s been here for many hours and he’s probably one of the most impactful people taking action on short timelines. It was a pleasure to have another French person on the podcast. I hope you do many good things.
Siméon: Thank you Michael for having me.