The discussion about existential risk from superintelligent AI is back, seemingly awakened by the recent dramatic progress in large language models such as GPT-4. The basic argument goes something like this: at some point, some AI system will be smarter than any human, and because it is smarter than its human creators it will be able to improve itself to be even smarter. It will then proceed to take over the world, and because it doesn't really care for us it might just exterminate all humans along the way. Oops.
Now I want you to consider the following proposal: Elden Ring, the video game, is an equally serious existential threat to humanity. Elden Ring is the best video game of 2022, according to me and many others. As such, millions of people have it installed on their computers or game consoles. It's a massive piece of software, around 50 gigabytes, and it's certainly complex enough that nobody understands entirely how it works. (Video games have become exponentially larger and more complex over time.) By default it has read and write access to your hard drive and can communicate with the internet; in fact, the game prominently features messages left between players and players "invading" each other. The game is chock-full of violence, and it seems to want to punish its players (it even makes us enjoy being punished by it). Some of the game's main themes are civilizational collapse and vengeful deities. Would it not be reasonable to be worried that this game would take over the world, maybe spreading from computer to computer and improving itself, and then killing all humans? Many of the game's characters would be perfectly happy to kill all humans, often for obscure reasons.
This proposal probably sounds ridiculous to you. But if you believe in some version of the AI existential risk argument, why is your argument not equally ridiculous? Why can we laugh at the idea that Elden Ring will destroy us all, but should seriously consider that some other software - perhaps some distant relative of GPT-4, Stable Diffusion, or AlphaGo - might wipe us all out?
The intuitive response to this is that Elden Ring is "not AI". GPT-4, Stable Diffusion, and AlphaGo are all "AI". Therefore they are more dangerous. But "AI" is just the name for a field of researchers and the various algorithms they invent and the papers and software they publish. We call the field AI because of a workshop in 1956, and because it's good PR. AI is not a thing, or a method, or even a unified body of knowledge. AI researchers who work on different methods or subfields might barely understand each other, making for awkward hallway conversations. If you want to be charitable, you could say that many - but not all - of the impressive AI systems of the last ten years are built around gradient descent. But gradient descent itself is just high-school mathematics that has been known for hundreds of years. The devil is really in the details here, and there are lots and lots of details. GPT-4, Stable Diffusion, and AlphaGo do not have much in common beyond the use of gradient descent. So saying that something is scary because it's "AI" says almost nothing.
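To make the mundanity concrete, here is roughly what gradient descent amounts to, as a toy Python sketch with a one-parameter "model" and a made-up loss function (nothing here comes from any actual AI system):

```python
# A minimal sketch of gradient descent: repeatedly nudge a parameter in
# the direction that reduces a loss. The "model" is a single number and
# the loss is a simple quadratic, (x - 3)^2, chosen purely for illustration.

def loss(x):
    return (x - 3.0) ** 2

def grad(x):
    # Derivative of (x - 3)^2 with respect to x -- high-school calculus.
    return 2.0 * (x - 3.0)

x = 0.0               # initial guess
learning_rate = 0.1

for step in range(100):
    x -= learning_rate * grad(x)   # the entire "learning" rule

print(x)  # converges towards 3.0, the minimum of the loss
```

That the update rule fits on one line is the point: whatever makes modern systems impressive lives in the mountains of details stacked on top of it, not in the rule itself.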
(This is honestly a little bit hard to admit for AI researchers, because many of us entered the field because we wanted to create this mystical thing called artificial intelligence, but we then end up spending our careers largely hammering away at various details and niche applications. AI is a powerful motivating ideology. But I think it's time we confess to the mundane nature of what we actually do.)
Another potential response is that what we should be worried about is systems that have goals, can modify themselves, and spread over the internet. But this is not true of any existing AI system that I know of, at least not in any way that is not equally true of Elden Ring. (Computer viruses can spread over the internet and modify themselves, but they have been around since the 1980s and nobody seems to worry very much about them.)
Here is where we must concede that we are not worried about any existing systems, but rather about future systems that are "intelligent" or even "generally intelligent". This would set them apart from Elden Ring, and arguably also from existing AI systems. A generally intelligent system could learn to improve itself, fool humans into letting it out onto the internet, and then kill all humans because, well, that's the cool thing to do.
See what's happening here? We introduce the word "intelligence" and suddenly a whole lot of things follow.
But it's not clear that "intelligence" is a useful abstraction here. Ok, that is an excessively diplomatic phrasing. What I meant to say is that intelligence is a weasel word that interferes with our ability to reason about these matters. It seems to evoke a kind of mystic aura, where if someone or something is "intelligent" it is assumed to have a whole lot of capabilities that we have no evidence for.
Intelligence can be usefully spoken about as something that pops up when we do a factor analysis of various cognitive tests, which we can measure with some reliability and which correlates with e.g. performance at certain jobs and life expectancy (especially in the military). This is arguably (but weakly) related to how we use the same word to say things like "Alice is more intelligent than Bob" when we mean that she says more clever things than he does. But outside a rather narrow human context, the word is ill-defined and ill-behaved.
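As a toy illustration of what "pops up in a factor analysis" means, here is a short sketch on simulated data (made-up numbers, not real psychometrics): six test scores that all lean on one hidden factor, from which the first principal component duly recovers most of the shared variance.

```python
# Toy illustration of a "general factor" emerging from correlated test
# scores. The data are simulated, not real psychometric measurements.
import numpy as np

rng = np.random.default_rng(0)
n_people, n_tests = 1000, 6

g = rng.normal(size=(n_people, 1))                   # latent "general" factor
loadings = rng.uniform(0.5, 0.9, size=(1, n_tests))  # how much each test leans on it
noise = rng.normal(scale=0.7, size=(n_people, n_tests))
scores = g @ loadings + noise                        # six correlated test scores

# A principal component analysis: the first component soaks up most of the
# shared variance, which is roughly what "g" is as a statistical construct.
scores -= scores.mean(axis=0)
eigvals = np.linalg.eigvalsh(np.cov(scores, rowvar=False))[::-1]
print(eigvals / eigvals.sum())   # the first ratio dominates; that is the "factor"
```

The correlations are real enough; the point is only that nothing about this construct licenses claims far outside the context of human test-takers.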
This is perhaps seen most easily by comparing us humans with the other denizens of our planet. We're smarter than the other animals, right? It turns out you can't even test this proposition in a fair and systematic way. It's true that we seem to be unmatched in our ability to express ourselves in compositional language. But certain corvids seem to outperform us in long-term location memory, chimps outperform us in some short-term memory tasks, many species outperform us at recognizing faces of their own kind, and some animals outperform us at most sensory processing tasks that are not vision-based. And let's not even get started comparing our motor skills with those of octopuses. The cognitive capacities of animals are best understood as scrappy adaptations to particular ecological niches, and the same goes for humans. There's no good reason to suppose that our intelligence should be superior overall or especially general, particularly compared to other animals that live in a wide variety of environments, like rats or pigeons.
We can also try to imagine what intelligence significantly "higher" than a human would mean. Except... we can't, really. Think of the smartest human you know, and speed that person up so they think ten times faster, and give them ten times greater long-term memory. To the extent this thought experiment makes sense, we would have someone who would ace an IQ test and probably be a very good programmer. But it's not clear that there is anything qualitatively different there. Nothing that would permit this hypothetical person to e.g. take over the world and kill all humans. That's not how society works. (Think about the most powerful people on earth and whether they are also those that would score highest on an IQ test.)
It could also be pointed out that we already have computer software that outperforms us by far on various cognitive tasks, including calculating, counting, searching databases, and various forms of text manipulation. In fact, we have had such software for many decades. That's why computers are so popular. Why do we not worry that calculating software will take over the world? In fact, back in the 1950s, when computers were new, the ability to do basic symbol manipulation was called "intelligence", and people actually did worry that such machines might supersede humans. Turing himself was part of the debate, gently mocking those who believed that computers would take over the world. These days, we've stopped worrying because we no longer think of simple calculation as "intelligence". Nobody worries that Excel will take over the world. Maybe because Excel actually has taken over the world by being installed on billions of computers, and that's fine with us.
Ergo, I believe that "intelligence" is a rather arbitrary collection of capabilities that has some predictive value for humans, but that the concept is largely meaningless outside of this very narrow context. Because of the inherent ambiguity of the concept, using it in an argument is liable to derail that argument. Many of the arguments for why "AI" poses an existential risk are of the form: this system exhibits property A, and we think that property B might lead to danger for humanity; for brevity, we'll call both A and B "intelligence".
If we ban the concepts "intelligence" and "artificial intelligence" (and near-synonyms like "cognitive powers"), the doomer argument (some technical system will self-improve and kill us all) becomes much harder to state. Because then you have to get concrete about what kind of system would have these marvelous abilities and where they would come from. Which systems can self-improve, how, and how much? What does improvement mean here? Which systems can trick humans into doing what they want, and how do they get there? Which systems even "want" anything at all? Which systems could take over the world, how do they get that knowledge, and how is our society constructed so as to be so easily destroyed? The onus is on the person proposing a doomer argument to actually spell this out, without resorting to treacherous conceptual shortcuts. Yes, this is hard work, but extraordinary claims require extraordinary evidence.
Once you start investigating which systems have a trace of these abilities, you may find them almost completely lacking in systems that are called "AI". You could rig an LLM to train on its own output and in some sense "self-improve", but it's very unclear how far this improvement would take it and whether it would help the LLM get better at anything worth worrying about. Meanwhile, regular computer viruses have been able to randomize parts of themselves to avoid detection for a long time now. You could claim that AlphaGo in some sense has an objective, but its objective is very constrained and far from the real world (winning at Go). Meanwhile, what about whatever giant scheduling system FedEx or UPS uses? And you could worry about Bing or ChatGPT occasionally suggesting violence, but what about Elden Ring, which is full of violence and talk of the end of the world?
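For what it's worth, "rig an LLM to train on its own output" is not hard to write down. The sketch below is a hypothetical loop with placeholder functions (generate, score, and finetune are stand-ins I made up, not any real API), and it says nothing about whether the loop would actually improve anything that matters.

```python
# Hypothetical sketch of an LLM "self-improvement" loop: sample outputs,
# keep the ones a scoring function likes, fine-tune on them, repeat.
# All three helpers are placeholders, not calls to any real system.
import random

def generate(model, prompt):
    # Placeholder: a real system would sample text from the model here.
    return f"{prompt} ... (answer from model v{model['version']})"

def score(prompt, output):
    # Placeholder: a real system might use a reward model or unit tests.
    return random.random()

def finetune(model, examples):
    # Placeholder: a real system would run gradient descent on `examples`.
    return {"version": model["version"] + 1}

def self_improve(model, prompts, rounds=3, keep_top=0.2):
    for _ in range(rounds):
        samples = [(p, generate(model, p)) for p in prompts]
        samples.sort(key=lambda pair: score(*pair), reverse=True)
        best = samples[: max(1, int(len(samples) * keep_top))]
        model = finetune(model, best)   # train on the model's own best outputs
    return model

print(self_improve({"version": 0}, ["prompt A", "prompt B", "prompt C"]))
```

Whether iterating this loop yields anything more than a model increasingly fond of its own habits is exactly the open question.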
I have yet to see a doomer/x-risk argument that is even remotely persuasive, as they all tend to dissolve once you remove the fuzzy and ambiguous abstractions (AI, intelligence, cognitive powers, etc.) that they rely on. I highly doubt such an argument can be made while referring only to concrete capabilities observed in actual software. One could perhaps make a logically coherent doomer argument by simply positing various properties of a hypothetical superintelligent entity. (This is similar to ontological arguments for the existence of god.) But this hypothetical entity would have nothing in common with software that actually exists and may not be realizable in the real world. It would be about equally far from existing "AI" as from Excel or Elden Ring.
This does not mean that we should not investigate the effects various new technologies have on society. LLMs like GPT-4 are quite amazing and will likely affect most of us in many ways; maybe multimodal models will be at the core of complex software systems in the future, adding layers of useful functionality to everything. This may also require us to find new societal and psychological mechanisms to deal with impersonated identities, insidious biases, and widespread machine bullshitting. These are important tasks and a crucial conversation to have, but the doomer discourse is unfortunately sucking much of the oxygen out of the room at the moment and risks tainting serious discussion about the societal impact of this exciting new technology.
In the meantime, if you need some doom and gloom, I recommend playing Elden Ring. It really is an exceptional game. You'll get all the punishment you need and deserve as you die again and again at the hands/claws/tentacles of morbid monstrosities. The sense of apocalypse is ubiquitous, and the deranged utterances of seers, demigods, and cultists will satisfy your cravings for psychological darkness. By all means, allow yourself to sink into this comfortable and highly enjoyable nightmare for a while. Just remember that Morgott and Malenia will not kill you in real life. It is all a game, and you can turn it off when you want to.
I agree that intelligence is not some scalar quantity - different organisms' brains adapt to their environments differently - but it seems unfair to say that you cannot say at all that we're smarter than other species. Humans have left footprints on the moon, toyed with DNA itself, and unleashed a greater destructive force on the planet than any other species. One of these things is not like the others! Somehow, we've exerted causal influence into outer space with nothing but a set of soft, mushy hands with opposable thumbs. What do we have that other animals don't? And whatever capabilities enable this, can we just call *that* intelligence (or give it a different name and then use that)? Whatever that capability is, is it scalable in a way that should be concerning?
Whatever this magical human capability is, it has enabled us to enhance ourselves to do the very things that you say other species can do and we can't: we have perfect memory (via paper and computers), we have a wide sensory range and the ability to use it in a variety of ways (via a wide range of sensory technologies), and we have developed machine learning models that are likely to match or even outperform (if they haven't already) other species at various perceptual tasks. It clearly does seem like human intelligence is, in some sense, "more general" than that of other animals. Even for things for which we do not have innate capabilities, we can still outperform other species, just with more steps. This is consistent with what generalisation looks like in other areas, e.g., if you know the formal definition of a derivative, then you can calculate any derivative, even if it takes you more steps than someone who knows a few rules for differentiation by heart.
Humans can think about many things that literally cannot be mapped into a dog brain. We even have representations of things not directly perceivable to us, albeit at a higher level. For example, we can't hear ultrasonic tones like a dog, but that hasn't stopped us from hypothesising that these tones exist, then developing technologies that can measure them, and then forming representations based on the output of these sensors that map to the same representations that would occur in a dog's brain. It maybe shouldn't be surprising that we have trouble conceiving of what a superhuman intelligence would be like, if it were possible. But this is annoying, because it makes it hard to discuss the topic at all. The best it seems we can do is speculate whether a relationship similar to that between human and ape could occur between AI and human, and we can only make very broad predictions in this way, like "human does thing very surprising to ape because ape didn't realise it was possible" or "human is able to exert its will over ape despite being physically weak by using a superior ability to navigate the web of causality of the universe" and so on.
As for what the magical human capability is, maybe this needs to be studied further. Our world model is clearly much better than that of an ape, and we are able to use it to achieve very complex goals, overcoming the innate physical, sensory, and even cognitive deficiencies we may have compared to other species. We have a much better understanding of the world, with more abstract representations that enable better prediction, and we are able to use our model to choose actions that let us do things that literally cannot map into the brains of other species. It seems that, with some effort, you could come up with a reasonable definition of whatever this capability is that is probably more in line with what AI doomers are thinking (explicitly or intuitively). I'm not sure whether this capability scales in a way that justifies AI doomer concerns, though. I think if it can at least scale far in the direction of science and engineering prowess, then that could be quite concerning.
Let me suggest a simple way from GPT-4 to a capable agent with goals. You use RL to train an LLM to write code in Python, you execute it, and then give the output back to the LLM in an infinite loop. If this system is connected to a Boston Robotics ~something~, the internet, and an AWS account with lots of money, it can affect the real world, and it would be able to self-improve by building better LLMs. I think GPT-4/5/n aren't close enough to be a real threat, and I agree with this part of your argument.
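The loop this comment describes is easy enough to write down in outline. The sketch below is a toy version with a placeholder in place of any real model call, no RL, no robot, and no AWS account - just the generate, execute, feed-back structure.

```python
# Toy sketch of the generate -> execute -> feed back loop described above.
# query_model() is a placeholder, not a real LLM API; in practice you would
# sandbox exec() of generated code very carefully, which is rather the point.
import io
import contextlib

def query_model(transcript):
    # Placeholder: a real system would send `transcript` to a model and get
    # code back. Here we just return a trivial, fixed snippet.
    return f"print('history so far: {len(transcript)} characters')"

transcript = "Goal: do something useful.\n"
for step in range(3):                       # the comment imagines an infinite loop
    code = query_model(transcript)
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code)                          # run the generated code
    transcript += f"\n[step {step}] code:\n{code}\noutput:\n{buffer.getvalue()}"

print(transcript)
```

Writing the loop is the trivial part; whether any model inside it would produce useful, let alone self-improving, actions is the part the essay asks us to spell out.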
From NIH: "The human brain is about three times as big as the brain of our closest living relative, the chimpanzee. Moreover, a part of the brain called the cerebral cortex - which plays a key role in memory, attention, awareness and thought - contains twice as many cells in humans as the same region in chimpanzees. Networks of brain cells in the cerebral cortex also behave differently in the two species."
If you run a chimpanzee with 10x better memory at 10x the speed, do you get a human? Clearly not. So I'm not sure that a human with 10x better memory at 10x the speed is a good metaphor for what might eventually be possible. We simply don't know enough about what it is about the human brain that enables humans to run circles around chimpanzees, and whether this can be scaled in a way that would enable intelligent agents to run circles around us.
I don't think it's debatable that future systems could have capabilities with existential implications if pointed in the wrong direction. AI probably won't even have to self-improve to get to this point - we'll probably do the hard part for it anyway. P(doom) all depends on how hard it really is to align such systems, and the case study of evolution doesn't give me a lot of confidence that our current methods will scale well to grounded agents with broadly superhuman capabilities. If a hill-climbing process like evolution couldn't make us robustly care about inclusive genetic fitness, can we really trust gradient descent to make superhuman agents robustly care about our goals? Who knows what the reward models currently used in RLHF are really learning, and whether blindly hill-climbing on them to produce very capable grounded agents is a safe thing to do (see Goodhart's law, distributional shift, the inner alignment problem).
Intelligence can be defined as using one's knowledge and understanding to adapt to new situations. I don't have an example of a single thing that humans can do that AI won't eventually be able to do (better). Humans are dangerous; history can attest to that. AIs more potent than us will be even more so. Potent does not mean they will actually do harm, as most of us arguably do little harm in our lifetimes.
The nuclear bomb analogy seems appropriate to me. A nuclear bomb is very potent; it could destroy us all. But with enough care it arguably saves lives.
LLMs can code novel proteins.
Anyone can buy DNA/RNA synthesizers.
Humans are made of protein code.
That's a risk vector for bad actors and AI agents alike.
Ah yes, video games, which are almost entirely scripted programs with direct instructions that cannot be circumvented, versus structures for which we currently have no good interpretability tools and that can do harm very fast given bad goals and the means to act. Same thing.
Here's a crack at an operationalized concept of "something that is dangerously intelligent": an optimizer that attempts to maximize some abstract function over states of some subset of the world by 1) using input from the world to build an internal representation of the world, 2) using that model to predict objective function values based on outputs, and 3) choosing outputs that maximize/minimize the objective function. The "enough" in "intelligent enough to be dangerous" is a function of how powerful the program's world model is, in terms of how much complexity of the world it can represent and how efficiently and quickly it can update that model, plus how quickly and efficiently it can search through output space to find maximizing values.
I *think* the Yudkowskian doomer argument works just fine with that "intelligence"-tabooed definition.
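Taken literally, that definition is concrete enough to sketch. Below is a toy version (a made-up one-dimensional "world", made-up numbers, no claim about any real system) of an optimizer that builds an internal model from observations, uses it to predict objective values, and picks the output it predicts to be best.

```python
# Toy sketch of the operationalized definition above: 1) observe the world,
# 2) fit an internal model and use it to predict objective values for
# candidate outputs, 3) choose the output the model rates highest.
# The "world" is a hidden one-dimensional quadratic; purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
hidden_peak = 1.7                                    # unknown to the optimizer

def world_response(action):
    return -(action - hidden_peak) ** 2 + rng.normal(scale=0.05)

# 1) gather input from the world
observations = [(a, world_response(a)) for a in rng.uniform(-5, 5, size=50)]

# 2) build an internal model (a fitted quadratic) and predict objective values
X = np.array([[1.0, a, a * a] for a, _ in observations])
y = np.array([outcome for _, outcome in observations])
w = np.linalg.lstsq(X, y, rcond=None)[0]

candidates = np.linspace(-5, 5, 1001)
predicted = w[0] + w[1] * candidates + w[2] * candidates ** 2

# 3) choose the output that maximizes the predicted objective
best = candidates[np.argmax(predicted)]
print(best)   # close to the hidden peak, about 1.7
```

On this definition the loop itself is decades-old optimization; everything hinges on how rich the world model can get and how fast it can be searched, which is where the disagreement actually lives.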
Your comparison makes absolutely no sense. Can Elden Ring learn things infinitely and use what it has learned to make up new ideas? Can Elden Ring write its own computer code? Those are the elements that make AI dangerous, and Elden Ring doesn't have those elements, so all your article does is demonstrate that you don't understand AI.
My main argument is that, as hard as intelligence is to properly define, it is still a factor. An AI that's "intelligent" in the sense of "it knows how to make paperclips and can get better at it" could become a threat to humanity simply by happening to develop a strategy for paperclip production that harms humans as a byproduct. A writing algorithm might realize that the angrier a reader gets, the more they read, without realizing that the strategy it's using is called "mass spread of misinformation". An AI isn't necessarily going to just decide to hurt humanity, but there are a myriad of ways for one to do so by accident.
AGI isn't a super intelligence. It's a single algorithm that can be applied to any problem. It's not necessarily better than human intelligence or even as good as human intelligence. What it is, is a solution for any problem with unknown capabilities and limits. The problem, then, is not that the AI will get smarter than us as a whole and kill us all because it decides it doesn't want humans around anymore. The problem is that it will be pointed at a specific goal, realize that humans impede it from completing its goal, and solve the equation by removing the humans. People aren't worried about artificial human intelligence. People are worried about inhuman intelligence that may not care about human existence.
The Chomskyan debate is useful here too. Indeed, AI is not anything like human intelligence; the way ChatGPT approaches learning versus the way a human brain approaches language is like the way a plane approaches flight versus the way a bird approaches flight. So from there, the argument that AI can grow to outsmart humanity does not follow. AI can't build consciousness; there is a very old work by John Searle arguing that a "strong AI" (an artificial human) is impossible.
Elden Ring isn't replacing people's jobs. Corporations are already looking to replace people with AI-powered utilities. This isn't to say that we necessarily shouldn't replace people in those jobs, but the fact of the matter is that we have no plan for ensuring that the profits derived from AI-powered job replacement will go to anyone except the people who run those corporations. If we lose 40% of our jobs, that is huge. How do you ensure that people will be able to take care of themselves?
This is a false equivalence and a bad faith argument.