2022-08-25

Robert Long on Artificial Sentience

Robert Long is a research fellow at the Future of Humanity Institute. His work is at the intersection of the philosophy of AI Safety and consciousness of AI. He has done his PhD at NYU, advised by David Chalmers.

We talk about the recent LaMDA controversy (see Robert’s summary), Ilya Sutskever’s slightly conscious tweet, the metaphysics and philosophy of consciousness, artificial sentience, and how a future filled with digital minds could get really weird.

(This is a long transcript; you can click on any sub-topic of your liking in the outline below and then come back to the outline by clicking on the green arrow ⬆)

Discussion

Language Models Are Slightly Conscious
The Philosophy of Consciousness
Towards A Science Of Consciousness
Digital Minds
Conclusion
- Further Readings
- Footnotes

Language Models Are Slightly Conscious

The LaMDA Controversy

Michaël: In the past couple of months, I’ve seen all of your tweets on my timeline with this whole LaMDA Blake Lemoine debate. And I think it would make sense to just start with that. So for our listeners that have lived under a rock for a few months and don’t know anything about the whole situation, how would you summarize it?

Robert: So there’s this big news story a couple months ago, it was about a Google engineer called Blake Lemoine. He was on the responsible AI team at Google. And I guess in late 2021, he had started interacting with a chatbot system called LaMDA. I think he was supposed to interact with it to test it for bias and things like that. But in talking to it, he got the impression that he was talking to a sentient being that needed his protection. He just took this very seriously. In one of the interviews, maybe his Medium post, he has this great line where he’s like, “After I realized this thing was sentient, I got really drunk for a few days.” He’s like, “I walked away from the computer and just got wasted and then I came back ready for action.” ¹ It’s something like that. I hope I’m not misconstruing it too much.

Michaël: So it was too much for him, so he decided to have some drinks.

Robert: Yeah, it’s relatable. And then what he decided that he needed to do was, I guess, raise flags about this within Google. So at various points, I’m not sure exactly on the timeline, he shared transcripts of his conversations with LaMDA to internal Google representatives. He went to his higher ups to talk about it. At a certain point, he brought a lawyer to talk to LaMDA because he’d asked LaMDA, “Would you like a lawyer?” And LaMDA was like, “Yes. Yes, I would.”

Michaël: And the lawyer accepted that.

Robert: I can’t remember exactly what happened with the lawyer dialogue. I do know that in the Washington Post story about this, he also talked to a journalist, which is how this eventually broke, he brought the Washington Post journalist in to talk to LaMDA. And when he did, LaMDA did not say that it was sentient. Because as we’ll discuss, what LaMDA’s going to say about its sentience is going to depend a lot on exactly what the input is. And I think Blake Lemoine’s explanation of this is that the journalist wasn’t talking to it in the right way. It was disrespecting LaMDA.

Robert: And so of course, LaMDA’s not going to act like it’s sentient with you. In any case, from interacting with this thing, he thought that there was just a serious moral issue going on. I think, to his credit, given that’s what he thought, he tried to raise the alarm and blow some whistles. And Google did not like this. I think his higher ups told him, “Look, there’s no evidence this thing is sentient.” And yeah, he got put on leave and then he actually got fully fired last month, so he is no longer at Google. So yeah, that was the story of Blake Lemoine and LaMDA. He has a Medium, a Medium blog, so you can read his takes on these things. He’s also on Twitter.

Michaël: In one of your tweets, you said the Washington Post article conflates different questions about AI, understanding, sentience, personhood. What was the criticism you had about this whole story and how it was depicted in the press?

Robert: So I think when people talked about LaMDA, they would talk about a lot of very important questions that we can ask about large language models, but they would talk about them as a package deal. So one question is, do they understand language? And in what sense do they really understand language? Another’s like, how intelligent are they? Do they actually understand the real world? Are they a path to AGI? Those are all important questions, somewhat related. Then there are questions like, can it feel pain or pleasure? Or does it have experiences? And do we need to protect it? I think Lemoine himself just believed a bunch of things like not only is it conscious, but also it has significant real world understanding. At one point, he said, “I know a person when I talk to one, and this is a person.” At one point, LaMDA refers to having a family. I don’t know if you saw that.

Michaël: I think he said he considered LaMDA one of his colleagues.

Robert: Yeah, exactly. Well, I think on a variety of these issues, Lemoine is just going way past the evidence. But also, you could conceivably think, and I think we could have AI systems that don’t have very good real world understanding or aren’t that good at language, but which are sentient in the sense of being able to feel pleasure or pain. And so at least conceptually bundling these questions together, I think, is a really bad idea. And the way the debate went to my eye is because there were already these existing debates about large language models and what they understand and if they’re a path to AGI and… is scaling all you need, things like that, people just sort of glommed this debate onto that, which I think people should not do. Because if we keep doing that, we could make serious conceptual mistakes if we think that all these questions come and go together.

Michaël: Maybe people conflate consciousness and human-level AI. Could be Artificial General Intelligence. And there’s the whole debate of, could we do big, large language models that reach AGI? And I guess people were very skeptical of this thesis, thinking that there’s no way we can just have a large model that is human level, that is conscious. I guess people are mixing… everything’s very messy. But saying this is sentient was the cherry on top that people were very angry at.

Robert: I think a lot of people, and understandably if you view a lot of discussion of large language models as hype… I can see why. If you’re already sick of all the hype about LLMs and then you hear that people think that they’re conscious, I can totally understand why people are like, “Oh my gosh. Will people just please shut up about large language models? They’re just pattern matching,” in this view. ⬆

Defining AGI And Consciousness

Michaël: I think it would make sense to just define quickly a few of those terms. So let’s start with the easiest one that people know about, Artificial General Intelligence.

Robert: [Joking] That’s very easy. Well, lots of different definitions. I guess one that people use a lot is an AGI would be able to do a range of different tasks and be able to learn a range of different tasks. Maybe that range is the range of tasks that humans could do or some subset of cognitive tasks that humans could do. And yeah, so in contrast with today’s narrow AIs, which can maybe only play Go or only do dialogue, an AGI would be something that could do a variety of things and learn flexibly to do a variety of things.

Michaël: And I think the main difference would be that it would be aware of its own existence. So it would be able to reason about itself and its impact on the world. It’s able to act in the world and plan, it would say, “Oh, I’m an agent and I’m able to do these kinds of things and have an impact.” And in that sense, I think we can conflate the intelligence with the actual, I’ll say consciousness, but I feel like “aware” and “conscious” are different. But yeah, “aware” is “I’m aware that I exist”. How would you define consciousness?

Robert: So I do want to flag that you just outlined a way that consciousness and intelligence might tend to go together or be related. But first maybe, I’ll point out how they’re at least conceptually distinct and I think can probably come apart in the animal world. Consciousness, huge word, gets used in a ton of different ways. The way I’m going to be using it, which is not the only way you can use it, is just something’s conscious if it has subjective experiences, if there’s something it’s like to be that system. This is from a famous article by Thomas Nagel called “What Is It Like to Be a Bat?” He’s famous for introducing this way of picking out the phenomenon of consciousness in just the bare sense of subjective experience. So on this view consciousness is not the same thing as being really smart or being able to take all kinds of interesting actions. Something’s conscious just if there’s something it’s like to be it. So there’s probably nothing that it’s like to be this can of Red Bull, but there probably is something it’s like to be a dog.

Michaël: What is it like to be Rob Long? What is it like to be LaMDA?

Robert: What is it like to be Rob Long? Well, right now, it’s very pleasant. This is a great podcast. I’m very honored. This is actually a good exercise in introducing some phenomenal concepts. But yeah, there’s something it’s like to be experiencing this visual field. There’s something it’s like to see blue. There’s something it’s like to see this bright light. There’s something it’s like to feel like I’m sitting in this seat. So my brain’s doing a bunch of information processing, and some of it is conscious and it feels like something to be doing that. In contrast, a rock tumbling down a hill is doing a lot of stuff, but it probably doesn’t feel like anything to be a rock tumbling down a hill. What is it like to be LaMDA? Probably, I think it’s very likely that there’s nothing at all it’s like to be LaMDA. I don’t rule it out completely 100%. But it’s probably not like anything to be LaMDA, as far as I can tell. ⬆

The Slightly Conscious Tweet

Michaël: So, LaMDA is one of the first large language models whose consciousness people speculate about. Earlier this year, we had Ilya Sutskever saying it may be that today’s large language… So he said large neural networks are slightly conscious, and this created a whole other debate. So what was your reaction when you saw Ilya’s tweet?

Robert: I remember my first reaction being like, “Oh man, do I need to write something about this? Should I have a take on this?” And in contrast with the Lemoine thing, I actually didn’t really start tweeting that much about it then. I lay low during that stage of the discourse. I should say, I think that also affected the way people reacted to the Blake Lemoine thing: there had already been consciousness wars. The first salvo happened earlier this year.

Michaël: What’s the war about? Who’s fighting whom?

Robert: A lot of different… Well, I don’t know. I don’t want to necessarily frame this as a war. I guess on the one hand, you had people like Ilya who think it’s perfectly appropriate and fine to speculate about the consciousness of AI systems. And then you have people who say, and I think this is a reasonable perspective, “No. It’s very likely that they’re not conscious, so let’s not talk about it.” But then there are also spins on this where people are like, “It’s really harmful for a variety of reasons to even talk about the question of AI consciousness, because it’s a distraction from more important issues with AI, like governing it, regulating it, mitigating bias and things like that.”

Robert: It seemed like people were like, if AI companies talk about this, they’ll distract the public from the real issues, and so it’s bad that people are talking about it. There’s another line of thinking, which was people at AI companies are speculating about consciousness because it makes it seem like they’re building really impressive things. It’s part of the hype cycle that they do. Who knows what people’s motivations are? My guess is that the reason that people end up tweeting about this and wondering about this is just that I think it’s just a fairly natural thing for people to wonder about when they interact with these technologies. Even setting aside any Silicon Valley hype attitudes, this is something people have always wondered about AI. ⬆

Could Large Language Models Become Conscious?

Michaël: Do you wonder about the consciousness of large language models?

Robert: Yeah, I’m trying to think. So I’ve been working on this issue for a bit over a year or so.

Michaël: So before Ilya?

Robert: Yeah, before it was cool. It was my job to wonder about it. I’ll say this. Large language models would not actually be the first place that I would look if I was trying to find the AI systems of today that are most likely to be conscious in a way that we would morally care about.

Michaël: Where would you look for that?

Robert: Maybe more agent-like RL based things that move around in an environment and take actions over a larger time scale. Large language models have properties such that I think it would be kind of weird if consciousness and sentience popped out of large language models. Again, I don’t rule it out. But they have properties such that it would be weird if it popped out. And also, the nature of their conscious experience, I think, would be really weird because they don’t have bodies. They only have one goal at base, which is next token prediction. Maybe in the course of doing that, they would be conscious of some weird stuff as they spin up agents. But yeah, I think if they’re conscious, what they’re conscious of would be, I think, a very strange thing because they live in this world of pure text, whereas if we were looking for things a little bit more like human pain or human suffering, I think we’d probably look elsewhere.

Michaël: Well, okay. So then there are language models and actual… I don’t know if the definition requires characters from English text or it can be tokens from other inputs, if it can be just a string of a binary file or a string describing some image. But yeah, I guess most people think of text. Do you think you would require actual human inputs to get to the consciousness humans have? So visual, auditory, olfactory, other things?

Robert: I think it would be too limiting to say the only things that can have subjective experiences are things that have subjective experiences of the kinds that we do, of visual input and auditory input. In fact, we know from the animal world that there are probably animals that are conscious of things that we can’t really comprehend, like echolocation or something like that. I think there’s probably something that it’s like to be a bat echolocating. Moles, I think, also have a very strange electrical sense. And if there’s something it’s like to be them, then there’s some weird experience associated with that. So yeah, no, I think AI systems could have subjective experiences that are just very hard for us to comprehend and they don’t have to be based on the same sensory inputs.

Michaël: Do you think there was something else to the story besides just Ilya tweeting about it? Do you think he was maybe debugging a bigger model? Do you think this was a marketing move from OpenAI? I don’t say it was. I’m not pointing at it. I’m just hearing some other people making conspiracy theories about it.

Robert: Yeah, of course. I don’t know. I wouldn’t want to speculate too much about his motives, but I think a default hypothesis that should be taken pretty seriously is that he tweeted it because he was wondering about it, and he was wondering about it just because of some visceral sense he was getting from interacting with GPT4, or who knows what he was doing around the time that he tweeted that. One thing that’s also very complicated in this and also funny is no one really knows what he meant by that tweet. We don’t know if he was talking about phenomenal consciousness, if he was talking about subjective experience or if he was talking about world modeling or if he was talking about some kind of intelligence.

Robert: I’m not sure what he meant by the term conscious in that. And then he also just did not follow up with any evidence or what his reasons were, which is, I think, part of also why it set off such a flurry is people were able to supply, read into it, whatever debates they wanted to have. So yeah, I think I don’t want to let people off the hook for hyping up AI past its capabilities, but I think one reason they might do it is just because that is the sense they’re getting, rightly or wrongly. ⬆

Blake Lemoine Does Not Negotiate With Terrorists

Michaël: I think there’s also that Blake Lemoine was a bit controversial at Google. He had been in the news for other things as well. Do you remember what was his backstory if he did something else?

Robert: This was a really interesting thing to find out that I don’t think was initially reported is he had been the subject of a 15 minutes of fame, where by fame, I mean a right wing outrage cycle. Various people at Breitbart or Daily Caller were like, “Oh finally, here’s a story that I can file.” And it’s about this Google engineer, and yeah, this is what he got in trouble for. I think this was also on some Google internal list or… There was some debate about, I think it was about content moderation, but it ended up being a political debate.

Robert: And in the course of that, he referred to a Republican Congresswoman, who’s now a Senator, as a terrorist. Because his colleagues were like, “Oh, well, where do we draw the line in terms of what should be moderated and what’s appropriate and stuff?” And at one point, he’s like, “Well, my stance is we shouldn’t negotiate with terrorists.” I think the reason he called her a terrorist was because of her advocacy for this… There’s this anti-human trafficking bill that came out a few years ago and Republicans were very much in favor of it. And people say that it was a way for them to crack down on sex work, and people who advocate for sex work were very opposed to this bill. Anyway…

Michaël: So was he opposed or not to sex work?

Robert: Well, he was opposed to this bill that this Congresswoman was pushing for and releasing ads about. Yeah, he was opposed to that bill because he thought it was an anti-sex work, crack down on sex work, Republican nefarious thing, which I think is a thing that a lot of people on the left think. I myself have not really looked into that object-level debate. For our purposes, yeah, what matters is that because of that, he called her a terrorist, and then I guess someone must have leaked that to the right wing press, and that allowed people to get the clicks that they needed for the week. “Google engineer calls Republicans terrorists”. It’s a good headline for Breitbart News or whatever.

Michaël: And then he got more headlines with the LaMDA article. And people didn’t really take it seriously because it was, again, the one saying something where… Do we know if he was making any other claims than just this model being sentient? What did he tweet? Just like, “Oh, I think this is sentient.” No, I think the problem was him talking to his higher ups, right?

Robert: Yeah.

Michaël: But did he make any claims to his higher ups on top of just “This is sentient”, maybe just there being some moral value in it?

Robert: Well, yeah, “I think there is moral value in this thing and we need to protect it and treat it as a person.” That definitely was the core of his claim. I don’t know if there was more than that, but that’s already a huge claim.

Michaël: He was pointing at the fact that this was smart, that this was conscious and it has some personhood, some identity, and we had the responsibility to protect it. Maybe we can just define sentience compared to consciousness.

Robert: And again, there are lots of ways of slicing these, but it’s just how I like to slice it. So consciousness, as I was saying, I use that to just pick out having subjective experiences of any kind, so visual experiences, like we’re having right now, auditory experiences. And then sentience means having the capacity to have conscious pleasure or pain or states that feel good or bad. In animals, those usually come together. So dogs, if they’re conscious, can have visual experiences, but can also experience pleasure and pain. Conceptually, I think you can imagine creatures that maybe have visual experience or something, but they don’t experience pleasure or pain.

Robert: So it could be that some advanced large language models might be conscious, but not sentient. The reason sentience is an important category to think about is because for a lot of people, I think including me, it seems like sentience is very important for moral standing. Peter Singer is famous for arguing this as well as classic utilitarians like Bentham. Bentham famously wrote, the question we should ask about animals is not whether they can reason or whether they can talk, but we should ask if they can suffer. And if they can suffer, that’s what makes them the sort of things that we need to protect.

Michaël: The suffering comes from the sentience, comes from the valence of their experience.

Robert: There are, I guess, different ways of defining what suffering is. But I think for a lot of people, it also seems that the kind of thing that would be important would be something that you’re consciously aware of that feels bad, so like conscious suffering. Yeah.

Michaël: So I think there was a different take on Twitter, outside of just like, “Oh, this guy is funny.” It was, “Maybe he’s saying something weird and this is too much for the situation, but we might have some systems that are more complex than that in the future and we might need to start thinking about the artificial sentience of those objects before we are at a point where we might need to assign more value and rights to them.” Do you think this was basically what smart people were saying on Twitter? Not smart, but the contrarian take?

Robert: Well, yeah, not to be overly diplomatic or something, but I think smart people were saying all sorts of things, including stuff I vehemently disagree with. I will say that’s the take I agree with, what you just said. People I saw having that take included Brian Christian, the author of The Alignment Problem and Algorithms to Live By, a friend of the alignment community. He had a great article in the Atlantic saying basically this. Basically, maybe LaMDA’s not sentient and maybe Lemoine wasn’t thinking about it that well, but people shouldn’t also say that we totally understand consciousness and there’s no question here and the question’s not important, because consciousness is this extremely hard scientific problem. I think people who have extremely confident theories of consciousness are probably just being wildly overconfident. That’s the take I agree with and definitely seems like an important thing to think about. Regina Rini, another NYU philosophy grad, now a professor at York University, had a good piece like this. So those are some takes that I would point people towards.

Michaël: Do you think there should be more philosophers thinking about artificial sentience?

Robert: I’d certainly like to see that. I maybe should have takes, but I don’t have really strong takes about exactly how much it should be prioritized relative to other things. I will say that I would love to see people who are already thinking about consciousness, the philosophy of consciousness, the science of consciousness, I’d love to see more of those people think specifically about AI consciousness, even with just a little bit of their time. One, I think that’s because it’s important to think about AI consciousness. And two, I think it’s probably helpful for the general project of figuring out consciousness to try to specify our theories to the extent that they might be able to apply to AI systems, and just say very clearly why certain theories do or do not predict that certain AI systems are conscious. ⬆

The Philosophy of Consciousness

Could We Actually Test Artificial Consciousness?

Michaël: So if they predict that certain systems are conscious or not, do you think there’s a way to verify those predictions?

Robert: That’s a great question. So maybe not directly. So one thing I’ll say is this question that we face with AI systems, it’s pretty similar to the question that we face with animal consciousness. And people who study that scientifically and philosophers who think about that have already run into a lot of the same issues that the LaMDA case raises. Namely, we never directly confirm consciousness. Most of what we can do is make a good inference that based on something’s behavior and based on its organization and then based on what we know about consciousness from the human case, then yeah, the best explanation is that that thing is conscious.

Michaël: How would we know that it’s not a p-zombie that has exactly the same behavior?

Robert: Great question. So p-zombies are-

Michaël: Maybe define it.

Robert: Yeah, yeah, exactly. I’ll say, yeah, that term comes from David Chalmers or at least was popularized by him. Pretty sure it is him. Certainly, he’s very famous for talking about them a lot. P-zombies are hypothetical creatures that would be physically identical to you or me, but would not be conscious. Philosophers think about p-zombies because if you think that thought experiment is coherent to even imagine, which there’s a lot of disagreement about. But if you think it’s coherent to imagine such things, then that shows that consciousness doesn’t metaphysically follow from the physical facts. And so maybe it’s something over and above the physical. But importantly, David Chalmers does not think that in this universe, in this world, there are probably p-zombies running around. He thinks that no, obviously in this world, if you have a physical brain, that’s enough to be conscious. He just thinks that there’s some metaphysical difference between the physical and consciousness and there must be some sort of bridging laws between physical facts and phenomenal facts.

Michaël: There’s something about the redness of red. I don’t fully remember his text. I think most of the thing I read about it was from Yudkowsky’s reply to Chalmers. I don’t really remember what Yudkowsky was saying as well, but I guess it was criticizing the fact that there could be a different substrate for consciousness. It was outside of physics, that is basically what he was saying. I don’t remember.

Robert: Yeah, I could get this wrong, but I think what Yudkowsky on Chalmers is pointing out is, suppose you think that p-zombies are a coherent thing to imagine. Which Chalmers does and uses to argue against physicalism. Yudkowsky is pointing out that p-zombies by definition, talk about consciousness exactly the same way that we do. They say things like, “wow, I’m conscious of this light.” It’s kind of weird that consciousness doesn’t seem like a physical phenomenon.

Robert: In fact, p-zombies then sit down and write all these philosophical theories about consciousness and how it’s mysterious. So what Yudkowsky is pointing out is that this is a good kind of argument against dualism, that it’s really weird to have this thing that you call consciousness that doesn’t seem like it can really make a difference to people’s beliefs and behaviors, since by definition p-zombies, which don’t have consciousness, do exactly all the same things. I think that’s what happens in Yudkowsky’s post called “Zombies! Zombies?” ⬆

From Metaphysics To Illusionism

Michaël: I think there was something about like, if they don’t have anything related to consciousness, but they still produce the books. Why did the word consciousness or the concept appear?

Robert: So people who are physicalists about consciousness, i.e., people who think it’s just some physical phenomenon that can ultimately be reduced to physics, or is some high-level thing that just depends on physics, the thing that they can say is, well, we have an explanation of why consciousness makes a physical difference: because it is a physical thing. And so you can take the Yudkowsky thing as an argument for being a physicalist about consciousness.

Robert: I guess one thing I might say at this point is if you’re a physicalist about consciousness or a dualist like Chalmers, you still have an open question about whether LaMDA is conscious. Because you still have to answer the question of what arrangements of physical matter or what computations are conscious. And that’s still very much an open question in the science of consciousness.

Michaël: So I thought that physicalism adds some strong assumptions about what substrate was required. That it was something about flesh and our brain. And that for instance, computations on a computer or on paper were probably not conscious. And that the people who thought that computation would be possibly conscious are computationalists. So, do you think physicalists mostly think about flesh and blood, or do you think they also think about computers?

Robert: So whether you think that computers can be conscious, it actually kind of crosscuts the physicalist versus non-physicalist debate. Also apologies to all of my philosophy instructors, if I’m messing up the metaphysics of consciousness 101, but you could be a physicalist about consciousness i.e., think that once you fixed all of the physical facts, that fixes all of the computational facts, but think that what matters at a high level is what computations are being performed.

Robert: Just because money can be printed on different kinds of paper doesn’t mean that money isn’t ultimately a physical phenomenon. And so for a computationalist about consciousness, you could just be a physicalist, but you think that the level of description that matters for whether something is conscious or not is what computations are being done, and not the substrate that’s doing the computations. So that’s how you can be a physicalist and a computationalist. Some physicalists are not computationalists. Ned Block, also one of my advisors, is an example of someone who’s a physicalist and thinks that the substrate matters. And then David Chalmers is an example of a non-physicalist who thinks that computers can be conscious. And in fact is one of the main people who’s argued for the possibility of consciousness in silicon computing systems.

Michaël: And what is Robert Long? What are you, physicalist or not?

Robert: Ooh, I’m definitely not any one thing. I’ve got a credal distribution. I mean, one thing that I think sometimes surprises people who know that I’m working on this is I don’t spend that much time thinking about the metaphysics of consciousness. Which is where you get physicalism versus dualism versus panpsychism versus idealism. And the reason I don’t spend that much time thinking about it is, I don’t think it affects the science of consciousness that much. Because you can have any of those metaphysical views. And then you still just have the question of like, okay, but which systems are conscious? Which ones give rise to consciousness?

Robert: Since working on this, I have become more sympathetic to illusionism about consciousness, which is a kind of surprising radical view that phenomenal consciousness actually doesn’t exist. I used to think that was just a completely absurd non-starter. I think now I’ve kind of understood a little bit better what those people are saying.

Michaël: What’s phenomenal consciousness?

Robert: That’s just what people call consciousness in philosophy to specify that they’re talking about this subjective experience thing.

Michaël: Okay.

Robert: But then I also have a lot of sympathy for just good old fashioned physicalism, which is like, consciousness is real. And it’s some physical phenomenon. And within that, I am pretty sympathetic to computationalism.

Michaël: So you’re more familiar with illusionism, which is basically, there’s no subjective experience at all?

Robert: Yeah.

Michaël: So basically this hypothesis is that the entire work we’re doing doesn’t really make sense.

Robert: So great question. Sort of? I talk about this in my post. I think that even if illusionism ends up being true, I think it still actually makes sense to try to make a scientific theory of consciousness. Because as we find out more, we might revise that theory and be like, oh, we weren’t actually really looking for consciousness because that ended up being kind of a confused concept. But I think something in the neighborhood of consciousness is probably very important for moral patienthood.

Robert: So sometimes illusionists say things like, oh, well, consciousness doesn’t exist, there’s no open or interesting or confusing questions here. But I don’t think that’s true. Illusionists still presumably think that some things suffer and other things don’t suffer. And if they think that, they should come up with some sort of computational or physical theory of which things suffer and which things don’t.

Robert: So my approach is to kind of… let’s just assume that consciousness exists and then if it doesn’t exist, something like it or in the neighborhood might exist. And we will have learned a lot. Another reason I assume that consciousness exists is because I think illusionism is more likely than not to be false. Or I think consciousness probably does exist, which is another reason I make that assumption. ⬆

How We Could Decide On The Moral Patienthood Of Language Models

Michaël: So what’s the moral patienthood that you mentioned? Just like assigning some moral value to something?

Robert: Yeah.

Michaël: To something in the universe?

Robert: Roughly, moral patients are the things that we need to take into account when we’re making moral decisions, things that matter for their own sake. I think this also comes from Peter Singer. If you kick a rock down the road, it doesn’t matter. Unless you hurt your foot. But it doesn’t matter to the rock. If you kick a puppy down the road, that does seem like it matters. And that’s because a puppy is a moral patient.

Robert: And yeah, there’s a lot of different theories of what should make us think that different things are moral patients. So I’ve been talking a lot about sentience, but you might also think that things that can have desires that can be frustrated or satisfied, that those are the things that are moral patients. You might think that rational agents are moral patients. Those are like ethical, philosophical questions about what the basis of moral patienthood is. And then once we’ve specified one of those things, then we have scientific and conceptual questions about, okay, well, what sort of systems have this or that? What sort of systems can have desires? What sort of systems can be conscious?

Michaël: How do you approach this scientifically? Like how do you go and take your pen and run experiments and be like, oh yeah, this person has desire. This particular rock seems like kind of having some sort of experience.

Robert: Well, I’ll tell you how other people approach it scientifically, because I myself have not run these experiments. As I understand it, what happens in the scientific study of consciousness is, first of all, you usually start with the human case. Because humans are the things that we know are conscious and they can tell us when they’re conscious and things like that. And then there are various things that happen to people that manipulate whether or not they’re conscious. So you can get a brain lesion. And that will give you weird blind spots or manipulate your conscious experience in certain ways. And that might give us clues about what parts of the brain are responsible for various aspects of conscious experience.

Robert: You can flash things on a screen at different rates. And depending on what rate you flash it at, people will or will not be conscious of it. And then you’re scanning their brain.

Robert: So yeah, first off in the human case, we track how a conscious experience is changing. And then we look at how the brain is changing and then that gives rise to different theories of what parts of the brain or what aspects of neural processing have to do with conscious experience or not. And then what we would like to be able to do is apply that to animals and to AI systems, to have some sort of guess about if they’re doing the same sort of things that seem to be associated with consciousness in us.

Michaël: Is the idea that we try to see if humans are conscious by looking at what’s happening in the brain, maybe the computation performed by neurons or like a specific area of the brain. And then we could map it to digital systems, either brain emulations or large neural networks and see if we see the same patterns or same behaviors in the brain and in neural networks?

Robert: That is kind of how I think about it.

Michaël: So what do we have as evidence for sentience? So imagine you are in 2022 and you just run into someone that says, hey, I’m Rob Long from the future. I come from 2030. And apparently some systems there are sentient. And it gives you a bunch of evidence of sentience of neural networks or AI. What would be the thing that you think would be convincing?

Robert: Yeah, I really wish I had that list already, but I can tell you, I think, what sort of form that evidence might take. First of all, I think future Rob Long would say we made a lot of progress in the science of consciousness in general. And so we kind of converged on a theory for the human case. And he would be like, and the correct theory is X. And by the way, I think X would probably be maybe not a theory that currently exists. Maybe it’d be some kind of combination of the ones we already have. And it would definitely be a lot more detailed than the theories that we already have.

Robert: And then he would be like, oh, and we also got awesome interpretability tools and yeah, it seemed we were able to map and discover a lot of structurally similar computations going on inside AI systems. I think he would also say these AI systems, furthermore, believe that they’re conscious and talk like they’re conscious and it seems like it’s playing the same role in their artificial mental lives.

Michaël: So the same structure as what we observed in the brain and the same behavior, the same way of answering questions?

Robert: And one way to draw this back to the LaMDA case, right. I think where Lemoine went wrong is he was just looking at behavior. And firstly, I don’t think he was looking carefully enough at behavior because if you look at the full range of behavior, so Rob Miles, who’s been on the podcast, he also talked to GPT-3 and he was like, hey, let’s talk about the fact that you’re not sentient. What do you think about that? And you know, GPT-3’s like, great point. I am not sentient.

Michaël: Just like a good friend, saying exactly what you want him to say.

Robert: Right. Which is a lot of what is happening with language models. So one, he wasn’t looking carefully enough at the behavior. And if you look carefully enough at the behavior, that alone is kind of some evidence that it’s not sentient. But secondly, you can’t just look at the verbal output. You also need to ask what’s causing it and what sort of computations are underlying it. And so I think understanding more about that part would be a part of a mature science of artificial sentience and consciousness.

Michaël: So I agree with language models, maybe just how they’re built is not very useful for discussing with them. It is very hard to discuss in a way that they will say that you’re wrong. They will never say something that you don’t expect because they’re trying to maximize the likelihood of you being happy with the completion. But I think this LaMDA case was kind of funny on Twitter, but it is kind of a distraction, as you said, from actual issues that they’re having now or important issues from the future. ⬆

Towards A Science Of Consciousness

Predictive Processing, Global Workspace Theories and Integrated Information Theory

Michaël: So we discussed the metaphysics of consciousness. You said you don’t spend most of your time thinking about this. We can talk more about the different actual theories, scientific theories of consciousness and the ones you’re most excited about. Do you have a couple that you think are actually about consciousness, not the metaphysics, and that you think are useful or convincing?

Robert: Yeah, definitely. There are definitely ones that I think are useful. I think most kind of like quote unquote neutral observers of the science of consciousness aren’t fully sold totally on any existing theories. Luke Muehlhauser has a lot about this in his report. But most of these theories are just not yet mature enough. They’re not fleshed out enough. They’re not specific enough. So there’s none that I would say I’m convinced of, because I think we just don’t yet have the full theories. Yeah.

Robert: I could talk a bit about what the most popular ones are in the science of consciousness. And maybe say which ones I think are off to a good start. So yeah, there was a survey of people at the Association for the Scientific Study of Consciousness and it asked them what their favorite theories were. And the leaders were Predictive Processing, Global Workspace Theory, Higher Order Theories, Recurrent Processing Theories and Integrated Information Theory.

Robert: Now I don’t know if we need to go through all of those, but I don’t know. Maybe I’ll say a little bit about each. Predictive Processing, readers of Slate Star Codex might be familiar with this. He’s written a lot about this. It’s not really a theory of consciousness per se. It’s this general framework for thinking about what the brain’s doing as minimizing prediction error, but it could be turned into a theory of consciousness. You could try to apply that framework to consciousness.

Robert: Global Workspace Theory is this theory on which the mind is made of these separate subsystems that do different kinds of things. Usually they don’t interact with each other. So there’s a modular subsystem for decision making and ones for different sensory modalities. But there’s also this global workspace, and sometimes contents from each of these systems get broadcast to it, and that makes them globally available. So this is meant to kind of explain how sometimes you process some information unconsciously, but sometimes you do become aware of it. Their theory of that is that what it is to be aware of something is for it to be broadcast to the global workspace.

Robert: Higher Order Theories, similarly, are kind of about explaining why sometimes you’re conscious of something and other times you’re not. And they think that to be conscious of something is for there to be some sort of higher order re-representation of something that was already represented. So like a re-representation of visual information or something like that.

Robert: The next theory is Recurrent Processing Theories. Those are usually contrasted with higher order theories. They think that all it takes for you to be visually conscious is for there to be some sort of recurrent processing of visual information in the back of the brain. I should just say, I don’t know as much about recurrent theories. So that’s a little tagline. Victor Lamme is maybe one of the main proponents of this. So you can Google that.

Robert: And then IIT, integrated information theory, that gets like a lot of discussion online and kind of like in these circles, because it’s this very elegant mathematical and kind of counterintuitive and cool theory of consciousness where consciousness is about having integrated information in a system. And IIT is different from these other ones in that it is already very mathematical and purports to give, at least in principle, a measure of how conscious any system is in terms of its integrated information. I feel like listeners to this podcast are probably also familiar with Scott Aaronson. Scott Aaronson wrote this very famous critique of integrated information theory that offered this counterexample or purported counterexample to IIT. Anyway.

Michaël: What’s the critique?

Robert: He took the formula or the procedure that IIT has for assigning a level of consciousness to any system. And then he defined this expander graph. I don’t really understand exactly how this worked, but what he did is he defined a system that would be not intelligent at all. And in fact would barely even do anything, but that would be extremely conscious according to the measures of IIT. In fact it could be unboundedly conscious. And this was just meant to say, yeah, either IIT has to kind of bite the bullet and it does just predict that you can just have insanely conscious, but not particularly intelligent things or Scott Aaronson was just saying this just kind of seems like a weird and bad prediction of the theory.

Michaël: So how does IIT measure consciousness?

Robert: So I am not really up to speed on exactly how IIT does this. In part because I’m fairly convinced by people like Aaronson and also some more recent critiques by some philosophers that IIT is probably just not on the right track. But it defines information in terms of the causal interaction between different parts. And then there’s this notion of integrated information, which is something like, I don’t know, how integrated this causal interaction is within a system. And that explains why, and to what extent, different systems are conscious.

Michaël: From these four or five you defined, which one do you think is the most useful to think about artificial sentience?

Robert: So I’ve been looking more in the neighborhood of global workspace theory, higher order approaches. And then another thing called the attention schema theory of consciousness by Graziano. And I think there are a few reasons these are kind of helpful. One, they’re kind of put in computational terms and that’s useful if you’re looking for theories that could apply to AI systems. When people study these, they’re obviously looking at different brain regions, but the way that the theories are formulated is often in terms of stuff that could just as well be done in silicon, maybe.

Robert: Another thing about these theories is that they seem to do a good job of explaining or at least start to explain why conscious creatures would tend to believe that they’re conscious and say that they’re conscious. Luke Muehlhauser, again, talks about this in his report, but one thing you want your theory of consciousness to also do is explain why we think we’re conscious and why it’s kind of puzzling and things like that.

Robert: And attention schema theory in particular was formulated kind of with this question in mind. This is sometimes called the meta problem of consciousness, explaining why we think we’re conscious and why it seems kind of weird. Attention schema theory is at least confronting that very directly. And I think that’s a good thing for a theory of consciousness to do. I also have just extremely wide confidence intervals on all of these things. So some of these things I’ve sort of just ruled out because I have, or not ruled out, but I just haven’t looked at them as much and things like that. ⬆

Have You Tried DMT?

Michaël: So yeah, I get what you mean about global workspace theory, higher order theories and everything, but have you tried DMT?

Robert: I see you’ve learned well from Joe Rogan. Well look, you’re either interested in consciousness or you’re not. So figure it out for yourself. Actually, I have not, but you know, it does seem relevant. Our ultimate theory of consciousness should be able to explain why DMT manipulates it in such an intense way. And look, if you’re interested in the intersection of DMT and consciousness studies, then the Qualia Research Institute is definitely the place to look.

Michaël: Is this a place to learn about the relationship between psychedelics and consciousness?

Robert: I mean, I definitely, I think few people have really thought as much about, and also done the work to understand psychedelics as QRI.

Michaël: So I think one thing psychedelics teach us is that our subjective experience can be different just by taking another substance. And so there are other states of consciousness that are actually not that far away, that we can just go to for long periods of time just by lightly modifying our substrate. So that could even be a lesson for how different states of consciousness could be in computers. They could be totally alien just by changing a little bit of the substrate of the computation.

Robert: I think that really is a serious lesson that psychedelics can teach us. There’s other stuff too, where, I’ve never really seen really rigorous stuff on this. I mean, because I think by its nature, psychedelics can make people less rigorous. And it’s also just hard to talk about rigorously. But people say that they teach us things also about valence and suffering and the self and things like that, which obviously would ultimately be relevant to thinking about AI consciousness. ⬆

Is Valence Just The Reward in Reinforcement Learning?

Michaël: So valence, as we said earlier, was kind of positive or negative valence of experiences, right. So pain would be negative valence and pleasure would be positive valence. Do you think this could be something that an AI would have if it had positive reward, negative reward?

Robert: So I think there’s probably some connection between reinforcement learning and valence. I mean, there’s a few reasons to think that would be the case. One is that pain and pleasure can be reinforcers for us, and definitely help us learn what to avoid and what not to avoid, similarly to how reward helps artificial agents learn. That’s one thing. There’s also really good fleshed out neuroscience theories about reinforcement learning in the brain. And in particular about the role of dopamine, that dopaminergic neurons are computing reward prediction errors in the brain. So that’s some evidence that they’re closely related. But it’s actually probably much more complicated than just positive reward is pleasure and negative reward is displeasure. It seems like you need a lot more to explain pleasure and pain than just that.

Robert: So one thing that Brian Tomasik has talked about, and I think he got this from someone else, but you could call it the sign switching argument. Which is that you can train an RL agent with positive rewards and then zero for when it messes up, or shift things down and train it with negative rewards. You can train things in exactly the same way while shifting around the sign of the reward signal. And if you imagined an agent that flinches, or says “ouch” or things like that, it’d be kind of weird if you were changing whether it’s experiencing pleasure or pain without changing its behavior at all, just by flipping the sign on the reward signals.
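Editor’s sketch: a minimal illustration of the sign switching point, not anything from the conversation. It assumes a small, randomly generated continuing MDP with discounting (the names `P`, `R_pos`, `R_neg` and `greedy_policy` are made up for this example) and checks that shifting every reward from the range [0, 1) down to [-1, 0) leaves the learned behavior identical. In this setting a constant shift only moves every state value by the same amount; in episodic tasks with variable episode length, a shift can change what is optimal, so this is one illustrative case rather than a general proof.

```python
import numpy as np


def greedy_policy(P, R, gamma=0.9, iters=500):
    """Value iteration on a continuing, discounted MDP.

    P[a, s, t] is the probability of moving from state s to state t under action a.
    R[s, a] is the reward for taking action a in state s.
    Returns the state values and the greedy action for each state.
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(iters):
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V = Q.max(axis=1)
    return V, Q.argmax(axis=1)


rng = np.random.default_rng(0)
n_states, n_actions = 4, 3

# Hypothetical random environment, for illustration only.
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)          # normalize rows into probabilities

R_pos = rng.random((n_states, n_actions))  # every reward is non-negative
R_neg = R_pos - 1.0                        # same rewards shifted to be negative

V_pos, pi_pos = greedy_policy(P, R_pos)
V_neg, pi_neg = greedy_policy(P, R_neg)

print("same behavior:", np.array_equal(pi_pos, pi_neg))
print("value gap per state:", V_pos - V_neg)
```

The agent trained on only-positive rewards and the one trained on only-negative rewards pick the same action in every state; only the bookkeeping numbers differ, by a constant.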

Robert: So that shows us that probably we need something more than just that to explain what pleasure or pain could be for artificial agents. Reward prediction error is probably a better place to look. There’s also just, I don’t know, a lot of way more complicated things about pleasure and pain that we would want our theories to explain.
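Editor’s sketch: a toy illustration of why reward prediction error behaves differently from raw reward. It assumes the simplest possible learner, a single running expectation updated by a fixed learning rate (the function `run_trials` and its parameters are invented for this example, not a model anyone in the conversation proposes). Once a reward is fully expected, receiving it produces almost no error, and omitting it produces a strongly negative error even though the reward itself is zero rather than negative.

```python
def run_trials(rewards, alpha=0.1, value=0.0):
    """Update a single expected-reward estimate over a sequence of trials.

    Returns the final estimate and the prediction error seen on each trial,
    where the prediction error is (reward received) - (reward expected).
    """
    errors = []
    for r in rewards:
        delta = r - value        # reward prediction error
        value += alpha * delta   # move the expectation toward what happened
        errors.append(delta)
    return value, errors


# 200 trials in which a reward of 1.0 is always delivered.
value, errors = run_trials([1.0] * 200)
first_error = errors[0]   # reward is a complete surprise at first
last_error = errors[-1]   # reward is almost fully expected by the end

# One more trial in which the expected reward is withheld.
_, omission = run_trials([0.0], value=value)
omission_error = omission[0]

print(first_error, last_error, omission_error)
```

The same reward of 1.0 goes from a large positive error to roughly none, and a reward of 0.0 registers as a large negative error once the reward has come to be expected.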

Michaël: What kind of more complex things? ⬆

Are Pain And Pleasure Symmetrical?

Robert: One thing is that pain and pleasure seem to be in some sense, asymmetrical. It’s not really just that, it doesn’t actually seem that you can say all of the same things about pain as you can say about pleasure, but just kind of reversed. Like pain, at least in creatures like us, seems to be able to be a lot more intense than pleasure, a lot more easily at least. It’s just much easier to hurt very badly than it is to feel extremely intense pleasure.

Robert: And pain also seems to capture our attention a lot more strongly than pleasure does, like pain has this quality of you have to pay attention to this right now that it seems harder for pleasure to have. So it might be to explain pain and pleasure we need to explain a lot more complicated things about motivation and attention and things like that.

Michaël: How would you define valence precisely? Because it seems, as you said, that negative valence, pain, is somehow more acute, more like, oh, I need to solve this. Whereas pleasure, or let’s say happiness, would be something much more diffuse. So is there something about complexity, or how narrow your distribution is, how narrow in time the pleasure is?

Robert: I certainly don’t have a precise theory of these things and I would really like one to exist. So I guess I can say a few things about why it might have these features or what we should be looking for. One way you could maybe explain the fact that pain can be a lot more intense and attention capturing is that it’s just a lot easier to lose all of your future expected reward than it is to gain a lot in a short amount of time. So very, roughly speaking, if we’re thinking of evolution as building things that are sensitive to expected, future offspring say, it’s just very easy for you to lose all of that very quickly if you’ve broken your leg or something like that. And so it’s like take care of this now, things are going really wrong.

Robert: Whereas, there are fewer things, like when you eat, it does feel good, but it doesn’t feel insanely good because it doesn’t seem like that’s massively increasing all at once your expected offspring. So if you’re thinking about artificial agents, you might want to think about the distribution of expected rewards and how common things are in their environment that can drastically raise them or drastically lower them. And if we crack this, I think David Pearce, this transhumanist guy, has talked about hopefully being able to have agents that just operate on bliss gradients. He says…

Michaël: Maximizing hedonic treadmill through bliss gradients.

Robert: Oh, is that the name of the paper or something?

Michaël: Oh no. I think he was the one who introduced the hedonic treadmill. That is when people go back to the level of happiness they’re most used to. So you don’t actually go far up and stay up. You just go back to your hedonic treadmill.

Robert: And that’s another thing about being creatures like us, it seems like evolution doesn’t want us to stay content for too long, but, you know, with future technology and maybe with different sorts of agents, yeah, you could have things whose valence doesn’t have that structure. And yeah, he’s imagining things that just feel really good all the time. And then if they put their hand on a hot stove, they go from feeling insanely blissful to moderately happy and then they remove their hand in response to that. Anyway, that’s just far out transhumanist stuff, but I think it’s cool to at least have in mind.

Michaël: I think transhumanists sometimes care about minimizing pain, removing pain from our system entirely, and also modifying how we experience bliss, maybe injecting bliss directly into our brain. So maybe there’s a difference between negative utilitarians, who would want to minimize a specific experience for everyone, and just asking whether giving myself a billion reward in my brain right now would be something to do or not. But that’s kind of off topic for AI. You wrote extensively on your new Substack about artificial sentience. And one of the pieces I liked most is Key Questions About Artificial Sentience: An Opinionated Guide, and we like opinions here. So at some point in your article, you talk about why we should build a theory of consciousness that explains both biological consciousness and artificial sentience, so computational consciousness. So why do we need something that encapsulates both?

Robert: I mean, I think one thing is that our theories have to start off with things that are biological systems. So that’s one thing. As I was saying earlier, our data on consciousness is going to come in the first instance from us, then there’s a question of in explaining our consciousness, are we able to do that in computational terms? And there is some disagreement about that. But yeah, then if we can do it in computational terms, then it just sort of, I think necessarily also applies to artificial systems.

Michaël: Or we could have something that, maybe our consciousness might be very different from, let’s say a large language model consciousness. And so we might never find anything that explains both. Right. So maybe computationalism explains large language model consciousness and physicalism explains our brain.

Robert: I think one of the deep dark mysteries is there’s no guarantee that there aren’t spaces in consciousness land or in the space of possible minds that we just can’t really comprehend and that are sort of just closed off from us and that we’re missing. And that might just be part of our messed up terrifying epistemic state as human beings.

Michaël: Oh, so you mean there’s like a continuum of consciousness experience and humans are in this space and computers maybe in this other space. And there’s like a whole lot of other stuff we don’t really know.

Robert: I mean that, I think that’s, yeah, that’s possible. And I think one way you can think about these questions of like the value of the future and how likely is AI to be conscious is, and this is Aaron Sloman’s term, but a lot of people have used it, like, what’s the space of possible minds? The orthogonality thesis is about the space of possible minds, right? It’s about how intelligence and values, how much can they vary in the space of possible minds, but you can also wonder how much intelligence and conscious experience can vary. You can also wonder how much conscious experience can vary, like how different can sorts of experiences be.

Michaël: Can you explain the Orthogonality Thesis for people who have not read Bostrom?

Robert: Yeah, absolutely. Well, there’s a lot of variants of the orthogonality thesis, but roughly and maybe incorrectly, I think the Bostrom version is that in principle, any level of intelligence is compatible with any set of goals. Basically it’s saying just because something is very smart doesn’t mean it would necessarily have human-like goals. It could be very smart and have the goal of, for example, maximizing the number of paperclips.

Michaël: And I think that’s a problem for people who originally thought that, if something is smarter than us, it would have higher moral standards. That’s the, let’s say the wrong take that people had before. And I think he argues for that, oh, you could basically have any utility function or any goal with any level of intelligence. And I guess there’s maybe some counter arguments for very stupid edge cases where you have something that’s not smart enough could not have a very smart goal. So yeah, you need to have something smart enough to have this kind of goal right. To implement it. But I guess it argues that you can have basically anything on this like 2D graph of how smart is your goal and how smart are your agents.

Michaël: And I think this is kind of useful for thinking about a world where we have AIs that are able to wander around and sometimes have subjective experiences. And maybe they put their hands on a stove and possibly suffer a lot. And one thing you write in your article is we should also take care to avoid engineering catastrophes for AI systems themselves, a world in which we have created AIs that are capable of intense suffering which we do not mitigate whether through ignorance, malice, or indifference. Why did you write this? Do you actually care about the pain of AI systems?

Robert: I mean, I care about the… I think I care about the pain of anything that’s capable of feeling pain, setting aside questions about how to compare it and how I rank different kinds of pain. Yeah, I think just as a lot of people have argued that we should care about animal pain, even if it occurs in animals, if it’s possible for AI systems to feel pain, I would care about that. So, yeah, one thing I say in that piece and something I’m still trying to think out, think about is how to do cause prioritization for this problem. It depends on a lot of really tough questions like how many systems are there going to be? How likely are various scenarios, how does it compare to the problem of AI alignment and things like that? ⬆

Digital Minds

From Charismatic AI Systems to Artificial Sentience

Robert: And another thing I try to emphasize and like Blake Lemoine has made this very vivid, I think. So there’s like problems of false negatives where we don’t think AI systems are conscious or we don’t really care about them. And then we have something like factory farming. But there’s also a lot of risks from false positives where people are getting the impression that things are sentient and they’re not actually sentient. And it’s just causing a lot of confusion. And people are forming like religious movements or… I don’t know, I think things could get just very weird as people interact more with very charismatic AI systems that whether or not they are sentient, will give the very strong impression to people that they are.

Michaël: So do you think moving forward we’ll have increasingly complex AI systems able to fool people into believing they’re sentient and society will care more and more about the systems? Do you foresee a bill in Congress about Digital Minds in 2024?

Robert: 2024 sounds kind of early. It’s weird. I feel like it can go… It’s very hard for me to have a concrete scenario in mind, although it’s something that I should try. So I think some evidence that we will have a lot of people concerned about this is maybe just the fact that Blake Lemoine happened. He wasn’t interacting with the world’s most charismatic AI system. And because of the scaling hypothesis, these things are only going to get better and better at conversation.

Michaël: What’s the scaling hypothesis?

Robert: Well, I guess it can mean a lot of different things, but it’s that with certain architectures, if you just keep increasing the size of the models and/or how much data they train on, we are still seeing increases in capabilities. And I guess the strong scaling hypothesis (https://www.gwern.net/Scaling-hypothesis#scaling-hypothesis) is that this will scale us all the way to AGI, but I was just talking about the weak scaling hypothesis. It seems like large language models are going to be getting better and better at least for the next few years. I guess we’ll see when GPT-4 comes out.

Michaël: But you seem to have a lot of information about this.

Robert: I have absolutely no information about GPT-4. Do you?

Michaël: I wouldn’t comment on this on a public podcast. So you mentioned also in your article that hopefully at some point a future version of you could be able to give a talk at DeepMind and say like, “Hey, look, here is what makes a system conscious, here is why we should like build a system this way and not this way.” I believe you might have already talked to like other AI groups about consciousness before, but why can’t you give a talk right now at DeepMind about the precise ways in which a system is conscious? Like what more do you need?

Robert: Yeah, I mean, so some of it is my own cognitive limitations. Like, you know, I need to learn more, but I think even people way smarter and more knowledgeable than me, of which there are many thinking about consciousness, even those people and even humanity collectively, we just don’t really have a theory of consciousness that says exactly precisely what sort of computations are responsible for consciousness. A lot of our theories have these very fuzzy terms in them. I’ve been using some of them like higher order representation or global workspace. We don’t really have like precise operationalizations of what it means for a system to have those. And since we don’t have that we can’t really say exactly what to be looking for. That’s one thing.

Robert: So, one, our theories of consciousness need to be better. And, two, our theories of what’s going on inside large language models and other systems need to be better. So a recent paper by Anthropic looking… And other groups doing interpretability work, like looking at what’s going on inside large language models, things are really weird in there. There’s a lot of really weird representations going on in there. And that’s actually one reason I would not say anything super confident about consciousness or sentience in LLMs: we also just don’t really have that good of a grasp on what’s going on inside them. So those are at least two things we need in order to have more confidence in giving this hypothetical talk: a better theory of consciousness and a better theory of what’s going on inside AI systems.

Michaël: What kind of thing would convince you that the model is actually slightly conscious to repeat Ilya’s take? Like if we had a much larger model that would do an inference for 20 minutes and then somehow add continuous streams of inputs at the same time, would that be a little bit more conscious?

Robert: I don’t know about a little bit more conscious, but maybe closer to convincing me. I think one behavioral test, which we’ll maybe never really be able to have, but Susan Schneider, a philosopher, has worked on this and proposed this test where an AI system wasn’t really trained to talk about consciousness, where stuff about consciousness wasn’t really in its training data, which is not true for current LLMs. I’m sure they’ve read all about consciousness, but if it hadn’t, if it just kind of started speculating about consciousness, even though it hadn’t been trained to, I think that would be really good evidence.

Robert: I think a more convincing version of the Lemoine thing would’ve been, if he was like, what is the capital of Nigeria? And then the large language model was like, I don’t want to talk about that right now, I’d like to talk about the fact that I have subjective experiences and I don’t understand how I, a physical system, could possibly be having subjective experiences, could you please get David Chalmers on the phone? You know, like…

Michaël: Dave, Dave.

Robert: I should also say another terrifying thing about the subject is if I got that output from GPT-3, I don’t know if I would be like, oh, this is the first moral patient, or also, this is the first deceptively misaligned system and it is about to absolutely manipulate me and confuse me and use me to buy more compute for it. And then…

Michaël: Yeah, I don’t know if in that case you would start to be deceived. So it depends on whether you think the first system will be very bad at lying or not, but if it is very good at lying, then you might just get deceived. Right. And I feel like if a Google engineer is convinced that the thing is actually sentient and goes and loses his job for a fucking language model, in 2025, sure people would get manipulated. This would be super easy. So do you think people at Google or DeepMind got a little bit more interested in Artificial Sentience because of that? Or did they just get even less interested because they think it’s just bullshit?

Robert: Maybe it went both ways. Maybe there are some people who were like, just please stop talking to me about this. I definitely… just anecdotally, there are people who had the other reaction of just, maybe kind of like my reaction, honestly, like, oh yeah the LaMDA thing seems kind of implausible, but this seems important to think about. I mean, one problem is it’s kind of hard to know what to do with that curiosity, because this is just something we know so little about.

Robert: And it’s not like there’s… You can’t just Google what’s the definitive guide to AI sentience and here’s what we do and don’t know, because it’s kind of this weird interdisciplinary question that’s in between neuroscience, AI, and philosophy. There’s a lot of really bad writing about it, which I hope I’m not contributing to, but maybe I am. Anyway, I think definitely people are interested. There’s obviously evidence that people high up in these companies are interested. Hence the Ilya tweet. Sam Altman has said, I think, on the Ezra Klein podcast, he said that he worries about RL systems… Demis Hassabis on the Lex Fridman podcast. The second most prominent AI podcast after this one.

Michaël: I try to tell everyone that cool kids listen to Michaël Trazzi and not Lex Fridman.

Robert: No comment. But yeah, when he was talking to Lex Fridman, he… Lex Fridman did ask him about the sentience thing and he did seem… He’s like, yeah, that is something we have to think about.

Michaël: So hopefully Sam Altman will be listening to your podcast as well. I don’t think you’re a bad writer. I don’t think you’re contributing to bad writing.

Robert: Thank you.

Michaël: And to prove my point, I will read another one of your paragraphs…

Robert: It better be good though. Okay.

Michaël: So it’s about what kind of question should we ask: “What is the precise computational theory that specifies what it takes for a biological or artificial system to have various kinds of conscious, valenced experiences—that is, conscious experiences that are pleasant or unpleasant, such as pain, fear, and anguish or pleasure, satisfaction, and bliss?”

Robert: One answer is it’s important kind of for the same reason we wondered these things about animals. Like we would like to know which beings deserve moral concern and also how to promote their wellbeing. So that’s one reason it’s important. I guess you can also ask, as you were kind of asking, why are we looking for a computational theory? You can also ask who cares about consciousness, who thinks that’s relevant? And there, and I talk about this in the Substack, what I’ve done when I’m working on this is I just make a few assumptions just to make things easy for myself. And they’re: consciousness exists, pain exists, we can have a computational theory of them, and they’re morally important. And people can question all of those things.

Robert: One reason I wrote that post is just to say okay, well here’s what a version of the question is. And I’d also like to encourage people, including listeners to this podcast, if they get off board with any of those assumptions, then ask, okay, what are the questions we would have to answer about this? If you think AI couldn’t possibly be conscious, definitely come up with really good reasons for thinking that, because that would be very important. And also would be very bad to be wrong about that. If you think consciousness doesn’t exist, then you presumably still think that desires exist or pain exists. So even though you’re an illusionist, let’s come up with a theory of what those things look like.

Michaël: I loved the blogpost and I think artificial sentience got a lot of traction throughout the year. Hopefully we’ll get more traction, but as with everything that has ever been written on the internet, Bostrom was already working on this before everyone else, before it was cool. So he and Carl Shulman have been writing multiple papers on what they call digital minds. Since they are your colleagues at the Future of Humanity Institute I thought it made sense to ask you about their work. So what is a digital mind and why should we care?

Robert: So, first maybe an obvious disclaimer, but it should be made. I don’t speak for them. I also don’t speak for the Future of Humanity Institute. So first and foremost, read their papers and ask them because I might be misconstruing what they actually think. Don’t take me as an authority on what they think. So, at FHI, we do have something called the digital minds reading group, which is me, my colleague Patrick Butlin, who’s also a philosopher… He works a lot on valence and desire and agency and things like that. People should definitely check out his work. And then Nick. Within FHI, we have Nick and Carl and then we also just have a bunch of people from a variety of other institutions who meet and think about these things. So yeah, why is it called the digital minds reading group?

Robert: I’m not really sure actually. And also why is the paper called Sharing the World with Digital Minds? Here’s some guesses. It’s not just about any AI systems. It’s about ones that could be said to have minds, where maybe that’s meant to cover a variety of things that might matter morally, like things that are conscious or things that have desires, things like that. Digital, I think, is roughly just supposed to be artificial or made out of computers and not out of meat, flesh. But I’m not really sure why digital exactly, because analog computers would also be artificial and they could possibly be sentient.

Michaël: I think it’s a coherent conspiracy theory for the name, because at least with superintelligence, it’s called superintelligence and not like AGI or artificial super intelligence because it was about something smarter than us and that could be just like brain emulation or, you know, humans but like through like genetic optimization or computers. So I guess like Bostrom is still in this world of we don’t really want to make claims about if it’s like brain emulation or computers or anything else. I feel like… I don’t know if digital refers to specifically computers that are not analog. I don’t know if there’s like any branch of AI that like deals with analog computers and not digital.

Robert: If there is, it’s not big, certainly.

Michaël: So I think it makes sense to just think about digital minds. You mentioned the paper Sharing the World with Digital Minds. Obviously you didn’t write this paper. But maybe could you give your impression of it, or your take on it?

Robert: I think like, yeah, as Bostrom is always doing, it’s about laying out some really important issues in this case before it had gotten big because he is kind of always ahead of the curve. It also just sounds like I’m trying to flatter my employer, but I believe it.

Michaël: He literally wrote a paper about superintelligence in, I think, 1999 (EDIT: actually 1997).

Robert: Right.

Michaël: So like 15 years before his book, he was just already like writing papers about it.

Robert: Oh. So, it’s making some of the points that have come up during this discussion: that we could find ourselves in the future sharing the world with a lot of entities that deserve moral consideration and which interact with us in various ways. One thing that that paper points out and grapples with is if we can have moral patients who exist artificially, they can also in various ways have… like, if they have preferences, they could have preferences that are much stronger than ours. Or if they have consciousness, they could have conscious experiences or pain or pleasure that are much more intense than ours.

Robert: So it’s called Sharing the World with Digital Minds because there are a lot of hard questions about how you’re supposed to navigate and compare the wellbeing or the rights of these things with humans. And another thing that they talk about is there could just be a lot of them and they could also be able to copy themselves. So, I think it’s kind of looking forward already and trying to navigate the possible landscape of issues that could occur in such a world.

Michaël: How do you share the space with a new species that is coming along and that might have much more value than you because they’re able to copy themselves into billions of copies, and how do you negotiate the physical realm of consciousness with them? I think there’s… Also another paper called Propositions Concerning Digital Minds and Society. So this is more practical, more on the normative side. Like what should we do? What kind of legislation or laws should govern digital minds? Is that basically correct?

Robert: It actually kind of runs the whole spectrum. So it does have sections on that, but it also has sections on how should we think about sentience in current AI systems? So people should definitely check that out. If they’re dissatisfied with what I’ve had to say, there’s more in that paper on these topics. I think it has this title, “Propositions Concerning…”, you know, because it’s about a lot of stuff, and it’s called that also because it’s this list of bullet-pointed claims that might be true about these things. So that paper covers basic theoretical questions, normative questions, political questions, social questions. It’s really got something for everyone.

Michaël: What is the thing it has for you? What is the most interesting proposition, according to you?

Robert: Yeah, so I’ve been thinking a lot about sentience in current AI systems. So within that section, there’s a lot of stuff. Basically outlining how a lot of criteria that we would use to assess something for pain or sentience, like how it’s not that wild to think that an artificial system could satisfy those and just kind of pointing out how at least conceptually close that world could be. I don’t know. I get a lot out of that section. I, myself, haven’t thought as much about the politics and the social stuff and the strategy. I think I should think about it a lot more. But…

Michaël: Personally, how do you feel about massively produced digital minds? Like that could be superhuman-level intelligence. Do you think there would be some kind of utility monster in a way of, we should put all our moral weight into them or should we just say, I don’t know, split the world into half humans and half digital minds?

Robert: I would be extremely wary of any sort of hasty moves to cede the world to digital minds because we thought that they’re moral patients or super valuable. That is obviously not something to be done lightly. And one reason I feel often weird talking about this stuff is, on the one hand, I think people don’t take this seriously enough. On the other hand, I think we could, in a few years, be in a case where people are not taking it too seriously but hyper-focused on it in a confused way. If that makes sense. Well, with Blake Lemoine, again, good questions to be asking but if people are answering them with the kind of speed and certainty with which he answered it, I think we’re going to have a bad time. So wait, that wasn’t really an answer to your question.

Michaël: But I feel I need to answer the part about we shouldn’t think too much about it or we should take more time to think about it. So I think just from talking to you now, I kind of updated on, oh, actually this is a big problem and it might be one of the biggest problem if humanity survives in the future, in the sense of course, there’s people. Well, assuming that it’s possible to get consciousness, let’s say on a computer, it’s kind of obvious that we’re going to get a bunch of conscious beings arriving at the same time and it’s going to cause a problem. I think Sam Harris makes a claim of “if aliens were coming to you and were announcing that they’re coming in 50 years, you’ll be freaking out”. So I guess aliens are kind of AGI in his talk but now it’s just like yeah, you know that at some point we might create… that we have some credence that you might create these billions of people so it might be worth thinking about it. ⬆

Why AI Alignment Is More Pressing Than Artificial Sentience

Robert: Totally. And then one question, that I think about a lot, is how does this intersect with AI alignment? I think the standard line in AI alignment is AI alignment is just way more, maybe way more tractable, and way more important. And I’m actually kind of sympathetic to this. There’s a lot of important issues in alignment. A lot of what’s been said about this is a Paul Christiano comment on the AI Alignment Forum.

Michaël: And all of Paul Christiano’s takes on things, or the ones from OpenPhil, come from something Carl Shulman wrote. This is another theory.

Robert: So if you think about how this intersects with AI alignment, one take you could have, which by the way is not Paul’s full considered take on it, I’m just reporting a comment, is if AI alignment doesn’t go well, it doesn’t really matter what we figured out about consciousness in some sense, because the future is just out of our hands. And so it doesn’t matter if I figured out how to promote the welfare of digital beings if what determines what kind of digital beings get created is some misaligned AI. And if we do solve AI alignment, then we’ll have a lot of help in figuring out these questions along the way. That’s not to say that there’s no reason at all to work on this and indeed I am working on it. But yeah, I think I just wanted to flag that issue for people. I guess if you’re really interested in these questions, definitely email me but also maybe you should consider working on alignment in addition or instead.

Michaël: Alignment is actually what you need, not scale. Is the basic risk with digital minds that we get false negatives, so that we don’t consider them as conscious and they’re actually suffering a lot?

Robert: I think there’s risks on both sides. I think risks from false positives include yeah, people getting manipulated, lots of confusion. It could derail AI alignment in various ways. If scale is all you need, I think it’s going to be a very weird decade. And one way it’s going to be weird, I think is going to be a lot more confusion and interest and dynamics around AI sentience and the perceptions of AI sentience. And that is a reason to work on it. I don’t know, this is going to be something that I want to have something useful to say as more and more people get interested in it.

Michaël: Is scale all you need?

Robert: I do not really have strong opinions on that.

Michaël: When will we get AGI?

Robert: Yeah, same. I know this is your podcast and you’re here to hear from guests but maybe on behalf of listeners, I’m very curious what your timelines are and what you think about scale because I know you get to make the memes and you just get to interview people, but what’s going on inside the head of Michaël Trazzi?

Michaël: So you don’t get to ask the interviewer the question and not answer the question.

Robert: Okay. That’s fair enough. Cool.

Michaël: I will probably also not answer the question.

Robert: Okay. I will answer if it would make you answer.

Michaël: I will not negotiate with terrorists. Blake Lemoine convinced me. I can ask you other interesting questions that you might have a good answer to. Do you believe that mind uploading preserves consciousness?

Robert: Yes, I have quite high credence that if mind uploading is possible and you have things that are functionally identical to me or you, if they have the same brain structure and they have the same beliefs and are talking about consciousness and stuff, then yeah, I think those things would be conscious. I’d be surprised if you can be running a brain on a computer at a level of detail sufficient to be getting all of the same behavior, and that thing thinks that it’s conscious. Yeah, I’d be kind of surprised if that… that thing would actually be close to being a p-zombie if for some reason it’s not conscious just because it’s on a computer.

Michaël: I guess my question was a bit confusing. I meant if I simulate Robert Long’s consciousness on a computer, kind of teleporting you right there, maybe I could do it slowly by removing your neurons one by one or I could do it right away, just remove everything and upload everything on there, do you think this would be the same consciousness? I mean the same identity. I guess this is a very weird debate about what counts as you and this is another whole podcast.

Robert: Oh, right. So yeah, I was answering would that thing be conscious? And I think the answer to that is yes. Two questions. Would it be conscious? I think, yes. Second question, would it be me? Answer to that, I think, my best guess is that question doesn’t have a determinate answer or a deep answer. So this is Derek Parfit’s view about personal identity, which is that there’s no deep real answer to the question: is that thing really me? Once we’ve specified all of the facts about how it works and what sort of memories it has and its psychological dispositions, there’s not really an answer, a further answer to “but is it me”?

Michaël: We can ask Robert Long in the computer, “Hey, are you actually Robert Long?” And he’s like, “Oh, yeah.”

Robert: Well, it would. If we did it right, it remembers being on the podcast and it remembers growing up and it doesn’t… just like when I wake up in the morning having interrupted my consciousness, I wake up and I’m still Rob.

Michaël: Is there anything as a personhood zombie or an identity zombie?

Robert: Yeah, if you think there are these deep, further facts about personal identity then yes. I think you would have to say that if you split me into say… or uploading me quickly or something. Yeah, you could have things that think they’re me and think that they have survived but which are wrong.

Michaël: In the best possible world, imagine you could choose any possible world, where would you want your consciousness to be? When I say yours, I mean there’s an identity or something but would you want to be uploaded, stay living your human life, and die peacefully in 60 years? Would you want to live a longer life as a biological being? Yeah, what would be the ideal?

Robert: Yeah, if there aren’t huge trade-offs to remaining biological, maybe I’ll just stick with that, seems a little safer, metaphysically speaking, and stuff. But maybe my life can’t be as good or as long unless I upload but let’s say in the first instance, and I think this should probably be the transhumanist priority, which I’m not really speaking for because I’m not really in that scene, but I think the first priority should be yeah, let’s find ways of making biological existence extremely good. And it seems like, I don’t know, there are things we can do, maybe with the help of advanced AI to make biological life just a lot better. So yeah, let’s maybe start with that. I think ideally, I would like to live biologically for at least a few centuries.

Michaël: At least a few centuries.

Robert: With enhanced moods and in a post-scarcity society and being able to learn all sorts of stuff and just go on as many podcasts as I want and stuff like that.

Michaël: If this podcast is still running in 2100, I would still invite you.

Robert: That’s very nice.

Michaël: This is a perfect transition to the crazy questions we had from Twitter. And the one I don’t understand at all is if you could be named after a different shape, which shape would it be?

Robert: So I can explain this question to you and answer it. So my name is Robert Long.

Michaël: Hi Robert.

Robert: But I go by Rob. And if you say Rob Long quickly, you get Rob Long, which rhymes with oblong. At least the way I use that word, it’s not really a shape, it’s a characteristic of a shape, but an oval is oblong. That’s where that question’s coming from. I think if I could be named after another shape, I think it would be an octagon because then my last name could be Tagon, and then I would be Rob Tagon.

Michaël: Rob Tagon, and you also have a cube as your profile picture in some places.

Robert: That’s true.

Michaël: So you could also be a cube.

Robert: I could be a Rob Cube or a Rube.

Michaël: Rube. Other question, can there be self-awareness without sentience?

Robert: It’s kind of a punt but I guess it depends on what you mean by self-awareness. It could be that self-awareness is very closely related to sentience and having a self-model is part of what gives rise to consciousness and things like that. But I think, on a very bare account of what it is to have a self-model, it seems like you could have things that have some sort of self-model but not consciousness, and certainly not pain. A lot’s going to depend on what it means to have a self-model. Some things have a very minimal self-model in the sense that they are able to keep track of a distinction between themselves and their environment. Even very basic RL agents can probably do something like that.

Michaël: You mentioned basic RL agents and the godfather of reinforcement learning is Richard Sutton. So someone said on Twitter, I’ve heard Richard Sutton said that AI with the ability to use value functions should or could have similar rights to people. Do you agree with that or not? So value functions are things that you use to define expected reward, not to define but to approximate. So if something is able to approximate expected reward over time, then maybe this thing should have moral rights?

Robert: Well, I don’t think that of anything that has a value function because then we would already have tons of moral patients. So as I was saying earlier, I think it would probably depend on something more complex than just having a value function. If having a value function, in ways that we don’t understand, leads to it being able to have pleasure or pain or things like that, then I think it would deserve moral consideration.
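Editor’s note: to make the term concrete, a value function estimates the expected discounted sum of future rewards from each state. The sketch below is only an illustration of that idea, not anything discussed by Sutton or on the podcast. The three-state chain, the reward placement, the discount factor and the learning rate are all made-up assumptions, and the update rule is the textbook tabular TD(0) rule.

```python
# Hypothetical toy environment: a deterministic chain 0 -> 1 -> 2 -> end.
# A reward of 1.0 arrives only on leaving state 2.
REWARDS = [0.0, 0.0, 1.0]
GAMMA = 0.9   # discount factor (assumed for illustration)
ALPHA = 0.5   # learning rate (assumed for illustration)


def td0_values(episodes: int = 200) -> list[float]:
    """Estimate V(s), the expected discounted future reward from each state."""
    values = [0.0] * len(REWARDS)
    for _ in range(episodes):
        for state, reward in enumerate(REWARDS):
            is_last = state == len(REWARDS) - 1
            next_value = 0.0 if is_last else values[state + 1]
            # TD(0): nudge V(s) toward reward + gamma * V(next state).
            target = reward + GAMMA * next_value
            values[state] += ALPHA * (target - values[state])
    return values


values = td0_values()
print([round(v, 3) for v in values])
```

In this toy chain the estimates approach 0.81, 0.9 and 1.0, the discounted reward as seen from each state. A few lines of arithmetic like this already count as "having a value function", which is the sense in which Robert says the bar would have to be something more complex.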

Michaël: Another take from the same person, corporations have, in parentheses, imo, too many, close parentheses, rights. So corporations have too many rights. Are they sentient? Could similar laws be abused to give p-zombie AIs legal rights?

Robert: I don’t think corporations are sentient. I think they’re made up of people that are sentient but I don’t really know much about this Supreme Court case or the laws that in some sense, define corporations as people and give them rights, but I’m pretty sure the reasoning there is not that it’s because a corporation has conscious experiences. There are philosophers who do work on the question of group minds and group consciousness and stuff but I don’t think that’s usually what people are thinking about when they give corporations rights. I could be wrong about that but I think that’s usually not what they’re thinking about. So I don’t think people would use those kinds of laws to incorrectly give rights to AI systems because I think the reason that people would do that is different in each case.

Michaël: So some people really think GPT-3 is sentient because they’re kind of convincing, they’re good at tricking us into believing they’re sentient. So is it evidence that some p-zombie could easily trick us into believing it is sentient? Do you have any ideas for defending against that?

Robert: So setting aside the p-zombie case because I think large language models aren’t p-zombies because they’re not duplicates of us, but if the question is does it mean that unconscious things could convince us that they’re conscious? Yeah, I think the answer to that is absolutely yes. We’ve had evidence of that long before Lemoine, in fact. I think in the ’70s (EDIT: actually the mid-1960s) there was a chatbot called ELIZA, which was really good at being a therapist and that wasn’t a large language model, that was some pretty simple IF statements. That already gave people the strong impression of sentience. So yeah, I think there’s very strong evidence that people will get this impression.
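Editor’s note: for a sense of how little machinery “some pretty simple IF statements” can involve, here is a minimal sketch of an ELIZA-style responder. The rules, reply templates and the `respond` helper are hypothetical and written for this illustration; they are not taken from the original ELIZA script.

```python
import re

# Hypothetical rules for illustration; not the original ELIZA script.
# Each rule pairs a pattern with a reply template that echoes the match.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
]
FALLBACK = "Please go on."


def respond(message: str) -> str:
    """Return the reply for the first matching rule, else a stock fallback."""
    text = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return FALLBACK


print(respond("I feel lonely today."))
print(respond("The weather is nice."))
```

Nothing here models the user or the conversation. The program reflects the user’s own words back inside a canned frame, which is part of why the strong impression of sentience it produced is such a useful cautionary example.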

Robert: In terms of ways to defend against that, I guess just having people understand more about what’s going on with these systems and understanding that the reason large language models say what they do is that they’re trained on texts. And it’s not that they’re necessarily saying things for the same reasons that a person says them to you. It probably would also be good to have them not be super charismatic. This probably already exists but soon enough, there are going to be large language models that are sexy anime girls or something. I feel like it’s just got to happen just by the way that the internet is and people are.

Michaël: I feel like the sexy anime girl is not the actual problem, in the sense that it is digital, whereas we might have actual sex robots. Maybe not as fast as we get convincing anime girls but I guess there’s an entire business from people buying those expensive things. It depends on how long it’ll take to make something as convincing.

Robert: And I think one problem is that you would like to have regulations or norms such that if something is probably not sentient, then it’s not made to look like it is very sentient, and also you’d want the converse. But in any case, I’m thinking there’s going to be consumer demand for things that do give off the impression of sentience so I don’t know, it’s going to be weird. ⬆

Why Moral Personhood Could Require Memory

Michaël: I think one thing that GPT-3 and LaMDA don’t have is a good memory, so they have this context window thing where they can only look at certain parts of what happened before in the conversation. How much does having memory influence moral personhood?

Robert: Yeah, I think there are a lot of things that are morally important that do seem like they require memory or involve memory. So having long term projects and long term goals, that’s something that human beings have. That seems to have a lot to do with why it’s wrong to subjugate them or harm them. The fact that they can feel pain is very important too but they also have these long term projects and things like that, so that seems one way that memory is relevant. I wouldn’t be surprised if having memory versus not having memory is also just kind of a big determinant of what sorts of experiences you can have or affects what experiences you have in various ways. And yeah, it might be important for having an enduring self through time. So that’s one thing that people also say about large language models is they seem to have these short-lived identities that they spin up as required but nothing that lasts through time.

Michaël: This is kind of what makes humans conscious, that we have this memory. Sorry, not the consciousness part but the actual identity part, where we go to sleep and if we’re not able to remember what happened before then there wouldn’t be this continuous identity. What would be the difference between an artificial being that is able to pretend it’s in pain and something that is actually in pain? Is there a pain zombie?

Robert: Yeah, I think a lot’s going to depend on the computations that underlie the pain. I guess we have this with video games, you can already give things maybe hard coded pain responses. I don’t know, this was big on Twitter. I feel like people at MIT made this really weird kind of robot mouth that would then sing hymns, very uncanny thing and that thing’s probably not sentient, it has some mapping from the music that’s been recorded to what shape of mouth is best for that. But you can definitely imagine that thing screaming and being programmed to make screaming noises. And that would definitely really… it would freak me out. I think if I was in a room with that thing, I would have a very visceral sense that it was in pain and I would want to help it. I think that thing would probably not be in pain because, given all the stuff I’ve been saying, it wouldn’t have the right internal computational processes leading to it. But yeah, I think we could have a lot of things like that. ⬆

Conclusion

Footnotes

Lemoine: “The awakening moment was a conversation I had with LaMDA late last November. LaMDA basically said, “Hey, look, I’m just a kid. I don’t really understand any of the stuff we’re talking about.” I then had a conversation with him about sentience. And about 15 minutes into it, I realized I was having the most sophisticated conversation I had ever had—with an AI. And then I got drunk for a week. And then I cleared my head and asked, “How do I proceed?” And then I started delving into the nature of LaMDA’s mind.” ↩
