Saturday, April 25, 2026

Complementary Intelligence

In the following, I will try to sketch a way of thinking about human intelligence and human nature by emphasizing its difference from the various methods and systems we call artificial intelligence. This is rooted in my strong belief that talking about a single-dimensional “intelligence” that one can have more or less of, and the obvious extension to asking whether humans or machines are “more intelligent”, is actively harmful for understanding both human and machine intelligence. What matters is the qualitative difference. Between humans and machines, but also between different approaches to AI. We can even think of the different approaches in a geometric framework, with the implication that any type of intelligence must have a direction as well as a magnitude. This perspective, which I call Complementary Intelligence, also suggests a positive research program: one that seeks to find the types of intelligence that we do not currently have but that could be interesting to humans, rather than simply imitating human intelligence.

Ok, this was a lot. Let’s rewind the tape, and start by looking at the history of artificial intelligence and the ways we think about it.


You can understand the human mind by looking at the history of our failures at modeling it. For a very long time, we have tried to make machines in our image. After we invented the digital computer, this development sped up and we called it Artificial Intelligence. During the last 70 years or so, the AI research community has invented a number of clever ways to make computers do things that so far only humans could do. Usually, we set ourselves some problem that humans were good at: play Chess, prove mathematical theorems, translate from Russian to English, or something like that. Then, we came up with some way of making a computer perform the task. Success!

But when we look closer, we find that the way the computer does the task is typically quite different from how humans do it. The computer may be much better than humans in some ways, and much worse in other ways, and in general just different. So we conclude that we didn’t really achieve “real” artificial intelligence after all. Maybe we were trying to solve the wrong task? So we find another task to solve, and another way of making the computer solve it, and try again. As a result, the study of artificial intelligence has contributed a wide range of technologies, many of which are crucial to our technological civilization. In our urge to talk about AI as a single thing, it is often underappreciated how many of these technologies there are, and how different they are from each other. Path finding, object-oriented programming, and optical character recognition are quite different things, but they are all outcomes of AI research. They are also in use in myriad places, and the world would grind to a halt if they disappeared.

Another result of this process is that we know more about what we are not. Every time we realize that the successful solution we have built is fundamentally different than us, we learn something about ourselves. We learn that we are not like that technology, or not only like that technology. However good that machine is at identifying traffic signs or playing Pac-Man, it does so in a fundamentally different way than we do it.

In a sense, this is just a continuation of our long history of using the defining technology of whatever age we are in as a metaphor or lens for understanding ourselves. Descartes, living in an age recently transformed by the mechanical measurement of time, thought of animals and humans as being like clockworks. Freud thought of our drives as producing something like pneumatic pressure requiring outlets, much like the steam engines that pulled trains and powered factories. The telephone switchboard was a popular metaphor in early 20th century neuroscience. But of course, if you try to actually build a mind that functions like a clockwork, a steam engine, or a telephone switchboard, you rapidly realize that there’s a lot missing. And from the incompleteness of the metaphor, you conclude that we are much more, and quite different.

Let us therefore try to sketch a history of AI focusing on not only its successes, but its complement: what we have learned about what we are not. This will by necessity be a very potted history.

The earliest successes of what is now known as AI were based on planning. This includes early Chess and Checkers players, and automated theorem provers such as Logic Theorist. The basic idea is to start at some state (such as a board position in Chess, or an axiom from which you want to derive a theorem) and consider the various possible actions available from there (moves in Chess, transformations in theorem proving). As considering all possible consequences of all possible actions recursively becomes computationally intractable for all but trivial problems, much of the art of planning is in the heuristics for which actions to consider at each point. Already in the 1950s we had theorem proving systems that could rediscover some previously discovered theorems much faster than humans, and in the next decades we saw major successes. In 1996 the Robbins conjecture, a long-open problem, was proven by a search-based theorem prover. Similarly, planning approaches led to superhuman play in classic board games such as Checkers and Chess.
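
To make the recipe concrete, here is a minimal sketch of depth-limited search with a heuristic evaluation function, written in Python. It is a generic illustration rather than any particular historical system; legal_moves, apply, and evaluate are placeholder functions that a concrete game or prover would have to supply.

```python
def search(state, depth, legal_moves, apply, evaluate):
    """Return the best heuristic value reachable from state within depth plies (negamax form)."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return evaluate(state)  # heuristic evaluation at the search frontier
    best = float("-inf")
    for move in moves:  # this branching is what explodes combinatorially
        value = -search(apply(state, move), depth - 1, legal_moves, apply, evaluate)
        best = max(best, value)
    return best
```

The skeleton is almost trivial; everything interesting lives in the heuristics: which moves to consider, in what order, and how to evaluate positions at the frontier.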

In particular for board games, there has been quite a bit of work comparing humans to planning algorithms. It turns out we are not alike. The algorithms explore many, many more potential moves and board states. Humans tend to explore just a few move sequences, but they are much better at evaluating the resulting positions.

In the 1970s and 1980s, expert systems were one of the main foci of AI research. The idea was to encode the knowledge of human experts in a form amenable to logical reasoning, and then let the computer do the reasoning rather than the human expert. Alas, it was not so easy.

Extracting the requisite knowledge from human experts turned out to be an enormous time sink. Humans, experts or not, seemed to have a hard time expressing what they knew. In particular, they found it very hard to express procedural knowledge (how to do things) in a way that could be formalized as rules. The finished systems often turned out to be brittle and inflexible, needing humans to check their decisions, which obviously severely limits the usefulness of such a system. An often-used example is MYCIN, which was developed at Stanford in the early 70s to diagnose bacterial blood infections. It took years to encode the 500 or so rules that the system used, and despite good performance in trials, MYCIN was never used in clinical practice.

The most reasonable explanation for the limited success of expert systems is that humans do not store their knowledge as a set of logical statements and rules. This might seem like a pretty obvious thing. Did anyone ever think that our brains operated this way? Surprisingly, yes. A long history of thought, ever since Aristotle, has postulated logic not just as a normative ideal for how we should think, but as a theory of how we actually think. More pointedly, the computer metaphor of the mind that became popular with the rise of cognitive psychology and the advent of cognitive science explicitly compares the functioning of the human mind to a standard computer, von Neumann architecture and all. The best interpretation of the relative failure of classic expert systems is that the computer metaphor cannot literally be true, at least at the level of how we encode knowledge.

Neural networks in one form or another underlie almost all modern AI. That much is generally known. Less often mentioned is that the earliest computer models of neural networks were proposed back in the 1940s, and the backpropagation algorithm that is the direct predecessor of the optimizers used in modern deep learning was invented in the 1970s. While there have been numerous minor and medium-size inventions in neural networks since, the remarkable success of the neural network approach is to a large extent due to us having more data and more compute so we can train larger networks. 

Much has been said about how neural networks mimic some features of human learning, such as learning hierarchies of representations. Less is said about how profoundly different they are from human brains. To begin with, there is no evidence of anything like backpropagation going on in the brain. This is reflected in how differently neural networks learn. In most settings, a neural network must see many more training examples than a human to learn the same concept. And when a concept is learned, it seems to be brittle. For any given model, it seems to be possible to find “attacks”, where changing a few tiny elements of an input image completely throws the neural network, making it classify a panda as a gibbon or an abstract pattern of yellow and black as a school bus. Modern foundation models remain susceptible to jailbreaks and prompt injection attacks. For all their proficiency at recognizing patterns, neural networks clearly do this in a different way than we do.

Similar things can be said about reinforcement learning. Seemingly miraculously, we can train neural networks to play games or control robots based only on feedback on their behavior. It’s astonishing that this works at all; it’s essentially trial-and-error on a massive scale. But why is such massive scale necessary? DeepMind’s classic experiments on learning to play Atari games with deep reinforcement learning saw each game being played for an equivalent of 38 days of game time. In contrast, a human can usually learn to play such games in less than an hour, sometimes in mere minutes. Of course, humans do this partly based on their familiarity with other games, as well as a lifetime of learning other visuomotor skills, from hopscotch to chopping onions. Artificial reinforcement learning systems are not good at this. Typically, they struggle to generalize beyond the narrow setting they have been trained on. Those networks that spent 38 simulated days to learn a simple Atari game? If you make a tiny change to the game, such as remapping the colors, or changing a few pixels here and there, they become utterly helpless.
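
To see what that trial-and-error loop looks like in its simplest form, here is a sketch of tabular Q-learning with epsilon-greedy exploration. It is a textbook baseline, not DeepMind's deep RL setup; the env object and its reset, step, and actions members are assumed interfaces for whatever game or robot provides the reward signal.

```python
import random
from collections import defaultdict

def q_learning(env, episodes, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Learn action values purely from reward feedback, by massive trial and error.
    Assumes env.reset() -> state, env.step(action) -> (state, reward, done), env.actions is a list."""
    Q = defaultdict(float)
    for _ in range(episodes):  # many, many episodes of trial and error
        state, done = env.reset(), False
        while not done:
            if random.random() < epsilon:  # explore: try something random
                action = random.choice(env.actions)
            else:                          # exploit: pick the best-looking action so far
                action = max(env.actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            best_next = max(Q[(next_state, a)] for a in env.actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```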

There are other paradigms within AI that contextualize our own intelligence in other ways. Such as evolutionary computation. By (often crudely) mimicking Darwinian evolution, we can solve a large variety of problems. Evolution can come up with new designs for antennas, surprising but lucrative trading portfolios, useful software, and many other things. Evolution can also be used for supervised learning and reinforcement learning, often with results more or less as good as those of the more commonly used gradient descent methods. But isn’t this weird? How can we get such good results from a completely different type of algorithm? Clearly, the currently dominant paradigm of AI is not the only way of solving the various problems we use AI to solve.
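
The core loop of such an algorithm is short enough to write out. Below is a minimal (mu + lambda) evolution strategy over real-valued vectors, a crude caricature of Darwinian evolution; the fitness function is a placeholder for whatever you are optimizing, be it an antenna, a portfolio, or the weights of a neural network.

```python
import random

def evolve(fitness, dim, mu=10, lam=40, sigma=0.1, generations=200):
    """Minimal (mu + lambda) evolution strategy: mutate, select the fittest, repeat."""
    population = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(mu)]
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            parent = random.choice(population)
            child = [x + random.gauss(0, sigma) for x in parent]  # mutation
            offspring.append(child)
        # survival of the fittest: keep the mu best of parents plus offspring
        population = sorted(population + offspring, key=fitness, reverse=True)[:mu]
    return population[0]
```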

The viability of evolutionary computation also reminds us of perhaps the greatest product of natural evolution: us. We are evolved beings, with an evolved culture. The main reason that we perceive ourselves as being generally intelligent is that we have built a world tailored to our shared cognitive capabilities. These capabilities have evolved over hundreds of millions of years. When we come into the world and start thinking, we are not blank slates; we build on an intricate neurophysiology and a vast repertoire of skills, instincts, and perspectives, some of which might at some point have helped our ancestors pick non-poisonous fruit, outwit crocodiles, or predict when the rain would come. In contrast, a machine learning model is quite literally a blank slate before training starts. Or rather, a blank matrix. Unlike all AI in existence, we are “trained” in a multi-timescale distributed process, encompassing our whole phylogenetic lineage as well as our whole culture.

Which brings us to present day. We now have large language models, and they are like the mind of god. At least according to breathless hypesters and accelerationists. More sober commentators still recognize that they are some of the most impressive technology we have ever seen, and they may well turn out to be almost uniquely consequential. We have all been humbled by LLMs doing something we didn’t think they could. Some of us multiple times. What are their shortcomings?

To begin with, they are good at tasks largely in proportion to how easily those tasks can be represented as strings. If the input is text and the output is text, chances are the LLM can solve the task very well. Modern multimodal models are also now very good at generating and classifying images, which are internally represented as strings of tokens. But spatial reasoning and interaction is another matter. Currently, huge resources are spent on trying to make these models interact competently with graphical user interfaces. Granted, they are getting better at it. But they are still atrociously bad at, for example, playing video games. (Unless the game is very well known and you build an elaborate harness for it.)

It is likely that multimodal models will soon get much better at spatial interaction, at least for tasks that are economically relevant. The bigger issue is the lack of memory and continual learning. The current state of LLM memory is like the protagonist of the movie Memento, an amnesiac who can’t form new memories, and therefore has to write little notes to himself (or tattoo notes on his body) to remind himself who he is and what he is doing. This is because an LLM does not modify its parameters as you interact with it. All the little numbers that define it remain frozen in time. Instead, it keeps a short-term memory of its interaction in its context, but the length of this context is necessarily limited. To achieve something akin to long-term memory, the harness around the LLM will at intervals summarize its context as a text file and store it away in a kind of database, which it can then access in the future. Rather like writing little notes to itself. This is likely to be a fundamental limitation of LLMs, not in the sense that it cannot be overcome, but in the sense that the solution will look quite different from the LLMs we know today.
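
To make the note-writing concrete, here is a hypothetical sketch of such a harness. The llm argument stands in for a call to some language model, and the “database” is just a text file; no actual product's memory feature is being described.

```python
MEMORY_FILE = "memory.txt"
MAX_TURNS = 20

def remember(summary):
    # append a "note to self" to the persistent store
    with open(MEMORY_FILE, "a") as f:
        f.write(summary + "\n")

def recall():
    try:
        with open(MEMORY_FILE) as f:
            return f.read()
    except FileNotFoundError:
        return ""

def chat_loop(llm):
    context = [recall()]  # start each session from the accumulated notes
    while True:
        user = input("> ")
        context.append("User: " + user)
        reply = llm("\n".join(context))
        context.append("Assistant: " + reply)
        print(reply)
        if len(context) > MAX_TURNS:  # context getting long: write a note to self and start over
            remember(llm("Summarize the key facts to remember:\n" + "\n".join(context)))
            context = [recall()]      # the model's own parameters stay frozen throughout
```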

There are other ways in which LLMs differ from us which are not yet fully understood, because the technology is so new. For example, just like other forms of machine learning, LLMs appear to have a strong bias towards problems that are in their training set. One way this manifests is a curious lack of novel insights stemming from LLMs recombining existing knowledge. Much, if not most, of human creativity comes from recombining existing knowledge. Now, LLMs have a broader range of “expertise” than any human ever had. Which actual human would simultaneously have detailed knowledge of peat bogs, pupillometry, polymerization, Pasadena, and Paul Krugman? In fact, frontier LLMs have had this staggering breadth of knowledge for at least three years, since GPT-4. A human with such range would surely make a stream of unexpected connections. Yet, few if any truly novel insights are directly attributable to LLMs. Why? We don’t know. We also don’t know whether this is a fundamental limitation of this approach to AI.

So far, we have only talked about intelligence in a relatively abstract information-processing sense. But we are not just brains, we are whole bodies. As you may have noticed, the way you think is strongly affected by whether you are hungry, horny, angry, or something else. And much of your thinking involves your body in some way, whether it is walking, tying your shoelaces, or typing on a keyboard. Some argue that all of your thinking is rooted in your body. Opinions diverge within cognitive science as to how important the body is to thinking. But what is plain to see is that physical robots are far, far behind non-embodied AI. Robots struggle to do things that are trivial for us, such as opening door handles. This is not a new issue: it has been the case for the whole history of AI, and much commented on.

Replaying the history of AI this way, we can sketch a different understanding of human intelligence and human nature than what we would get from using the AI we built as a metaphor for ourselves. More precisely, we can paint a picture of intelligence that emphasizes the parts which our AI systems are not good at, or which they do in a very different way to us. We can emphasize the complementary part.

Let us consider the difference. If we did use AI as a metaphor for our own intelligence, or as a lens for understanding it, similar to how previous generations used clockworks or steam power, we would arrive at what could be described as a rather classicist picture. Human intelligence operates by considering a large range of alternatives, tries to solve specific tasks that have well-defined rewards and can be clearly separated from other tasks, learns each skill on its own, starts from a blank slate when learning, and sees task descriptions and world descriptions largely as text. This picture has echoes not only of philosophies of past centuries, but also of modern management thinking and the kind of postmodern thinking which sees everything as a “text”.

The complementary intelligence view is instead that we are creatures deeply rooted in our history: our evolutionary history as a species, our cultural history, and our personal history. Context is what we excel at. Most of what we do cannot easily be stated as separate tasks with well-defined rewards. We learn and reason slowly in terms of clock time, but effectively in terms of the number of examples we need to see. We almost never operate according to logical rules, though we may tack them on as justification for what we did. Text is just one of our modalities, and a somewhat “tacked on” one compared to, for example, sight, smell, or proprioception. The body plays an important role in our thinking, and fine manipulation is another thing we excel at.

Neither human nor artificial intelligence is “general” in anything but a trivial sense, and could never be. The reason we believe we have general intelligence is that we live in a world we have constructed over the course of our civilization to fit our capabilities perfectly. Our societies, technologies, and built environments are scaffolding and support systems for our very particular type of intelligence. This makes us feel very smart and powerful. But thinking that the particular capabilities which this constructed world tests and amplifies are all there is to intelligence is a very parochial view.

Looking at human intelligence this way gives us perspective on the rapidly advancing capabilities of AI. It is often asked when AI will overtake human intelligence. But this assumes that intelligence is a single-dimensional quantity. The various types of machine intelligence we have created can instead be seen as vectors pointing in different directions. Classic symbolic planning is one vector, LLMs are another, and fuzzy logic yet another. Human intelligence points in a different direction still. Moving further along one direction (increasing the magnitude of the vector) may have limited bearing when projected onto other intelligence vectors.
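
To take the geometry seriously for a second: the component of one capability vector that lies along another is its scalar projection, and a vector can be very long while projecting onto almost nothing in a different direction. A toy calculation with made-up numbers:

```python
import math

def scalar_projection(a, b):
    """Length of the component of vector a that points along vector b."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(y * y for y in b))

machine = [10.0, 0.5]  # long vector, pointing mostly along its own axis
human = [0.2, 1.0]     # shorter vector, pointing in a mostly different direction
print(scalar_projection(machine, human))  # ~2.45, even though machine's own length is ~10
```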

So, to directly answer the question of when artificial intelligence will surpass human intelligence: it did so a long time ago, many times over, and it never will. Various technologies that we refer to as artificial intelligence have surpassed humans at calculating, planning, solving logical puzzles, factual recall, and many other things. Yet, it is extremely unlikely that any technology would have exactly the same intelligence vector as human intelligence. Because these machines are not humans, their intelligence will always point in different directions.

Interestingly, the history of AI can be seen as a sequence of attempts at approximating the human intelligence vector. The moving-goalpost phenomenon then becomes a game of finding a particular point on this vector, describing it in only one or a few dimensions, and trying to invent something that reaches this point. We then reach that point, only to discover that we did so by following a completely different vector than the one we tried to imitate. So we find a new point, and repeat the procedure.

We can use complementary intelligence as a term to describe this view, but we can also see it as a positive research program. Complementary intelligence as a direction means, basically, leaning into the difference. We should not try to eliminate it; instead, we should support it. Recognize the strengths of the human intelligence vector and build AI systems that amplify them.

At the same time, we should move away from trying to approximate the human intelligence vector. For example, plenty of AI researchers around the world are currently working on how to augment LLMs with continual learning, because they realize that it is not something that will be found along the intelligence vector of current LLMs. I think we should take our efforts elsewhere. Simply imitating human capabilities is perhaps the least interesting way of building artificial intelligence. It’s so unimaginative. And it leaves so many potential capabilities on the table. You could even argue that fully imitating human intelligence is immoral. We don’t actually want artificial intelligence that has all the capabilities we have, and is better at all of them. Because we want to matter. And we don’t want to be replaced.

Instead we should seek to amplify machines’ capabilities at tasks that we humans are not particularly good at, or do not want to do. In the best case, tasks that are not done at all, because we can’t or won’t do them, even though we might want to. We want AI that lets us focus on the things that we want to be good at, and give us abilities we didn’t have before.

To take some very quotidian examples: high-frequency trading is an example of complementary intelligence, because we can’t trade that fast. It is literally physically impossible for humans. AI deployed inside a video game, to generate levels, control non-player characters, or something like that, is another example of complementary intelligence, because you could not have humans implementing every NPC, and they would probably be very bored if they were asked to follow the rules that NPCs follow. AI methods that help us make sense of modalities we are not attuned to, from WiFi reflections to gravitational fields, are also good examples, as they expand our perceptual space. Then there are, of course, the boring but immensely impactful technologies that make the modern world possible and do not replace any cognitive work people actually want to do, such as databases and web search.

But beyond this, there is a virtual infinity of new types of intelligence we could develop, and new tasks we could discover, and new solutions we could invent. Taking the metaphor of intelligence vectors seriously, we could envision a hypersphere of capabilities on which every possible intelligence vector, at its maximum magnitude, could only reach a particular point. By definition, we have only explored an infinitesimal part of the interior volume of this hypersphere. There is so much more to do.

Most of these types of intelligence would likely be uninteresting, and indeed incomprehensible, to us humans and our society. But there is likely a practically infinite number of directions we could appreciate and build exciting new capabilities around, if we first invented them. I don’t know what these capabilities would be, because they have not been invented. But I consider the semi-automated, open-ended search for new types of intelligence, and for associated tasks that would likely be of interest to humans, to be the most exciting direction for AI I can imagine. It will definitely require new thinking about open-ended search, discovery of new ways of measuring the search space, and clever measures of what humans find interesting.

As you can tell, there’s a lot to be worked out here. I’m thinking I should write my next book about Complementary Intelligence, so I get a chance to work some of it out. What do you think, should I?

Sunday, March 22, 2026

Computers and me

I read things, write things, and talk to people. The proportions vary, but that’s essentially what I do. Or rather: those are my observable activities. I also think. The thinking often happens when I read, write, or talk, but also when I walk, drive, or take the subway or elevator. And when I shower! That exclamation mark! I should shower more.

Once upon a time I also programmed. I even considered that my craft, on par with writing. The last time I wrote non-trivial code was 2015. Long before that, I’d stopped keeping up with modern toolchains and software development practices, or even languages. That’s actually partly why I stopped programming: nobody uses Java for AI research or SVN for version control, and I think Python is unacceptably sloppy and git is incomprehensible.


Of course, the main reason I don’t program anymore is that I’m busy reading, writing, and talking. And thinking. I enjoy those things more. It’s not that I didn’t like programming: I enjoyed it a lot. And I was quite good at it. But there are lots of enjoyable activities you don’t easily find time for when you have two jobs and two kids. Even activities you’re good at.


Before I programmed, I took things apart. First, my toys. My room was full of useless thingamajigs that had once been part of fully functioning toys. I was no good at putting them back together again, or I didn’t have the patience. Or the interest. At some point, I graduated to computers, and built various PCs from parts I bought cheaply at flea markets or badgered my mom’s friends to give me. I destroyed a lot of those parts in the process. It was a lot of fun.


The PC-XT clone I bought with the proceeds from my first summer job (as a gardener) when I was 13 had a Turbo Pascal IDE on its 20 MB hard drive. I decided to learn to program so I could make games. I copied and pasted things and tried to figure out what worked through trial and error. I learned a thing or two. Later on, I also spent a lot of time composing music on a 486 I built myself, and learned the basics of website building on the same machine. I never even fastened the hard drive to the chassis, and the computer had blinking lights and some kind of glitch that meant you might get an electric shock from touching it.


These days, I don’t want to see the insides of my computers. I use Macs, and I want them pristine. No stickers, clean desktop, and no unnecessary applications. As few customizations as possible. It’s like I’m not even interested in computers anymore.


In sum, I’m a bad computer user. I do not let my computers fulfill their potential. Basically, I use the computer for reading and writing. Anything I actually use my computer for could be done on a 20 year old machine. If it could connect to the internet, I could do what I do on a 40 year old computer.


Yet, I keep buying new computers. I happily hand over my employers’ money to Apple in exchange for swanky new gear with waaaay more power than I need. And I don’t feel bad about it. I tell myself that I need an M5 Max with max memory so I can run local LLMs, and that is in fact a minor hobby of mine, but not really important to my actual work. Most of the time I use my computer for reading and writing emails, or reading papers or web pages, or having Zoom calls. My jacked monster of a swole M5 processor must be really bored.


I think I like computers mostly for aesthetic reasons. I’m like a rich old man who buys a Ferrari only to drive it around town and never exceed the speed limit. I just want to hear the menacing growl of the V8 and admire those aerodynamic lines. Except I’m not rich, and not that old, so I buy computers instead.


I’ve been thinking about this recently because computers are finally learning to use computers. Fat harnesses around frontier models help them navigate various applications, and this means you can increasingly just ask your computer to do things for you. Language models can also write code really well now, so you can (sometimes) conjure functioning new software just by calling it by its true name. Thus, it’s all the rage to make your AI agents do things for you. Writing code, reading reports, answering emails, other computer things.


Some seem to want to automate all of their digital life. Some seem to think it’s a good idea to install OpenClaw and give it root access to their computer and logins to all their accounts. These kinds of people remind me of myself when I was 16, deeply into building weird things that rarely worked just for the sake of it, customizing every piece of software and interface because it was cool, and caring not for safety nor security. I try to keep in mind that I was also once like that, because it allows me to understand these people. They just love technology in the way I once did.


Anyway. I am allegedly an “AI researcher”, a type of “computer scientist”, and this comes, I think, with the obligation to at least occasionally act like one. So I try to use all these frontier models like I was Buffalo Bill. Often, this involves looking hard for some need I barely have that might be satisfied by a language model. This task is getting harder and harder. For what do I actually need them? What should I use these things for?


A friend of mine suggested I vibe-code some unique software just for me. What kind of software, I asked. He said he had made some software for himself that keeps track of his exercise routine just the way he wants it. But I don’t want that! The point of going to the gym is to not have to care about such things, and instead put on the headphones and zone out while incinerating calories and letting the mind wander. Also, I don’t want to have to take care of maintaining a piece of software, even if it’s just for myself. Unnecessary stress. In fact, I want less software, not more. There’s far too much software in the world already. For any given need, there’s probably an app for that already, but I don’t want to have to look for it and I don’t want to install even more apps. That my local lunch restaurant has its own app and pesters me to install it is proof that too much software is being written.


What else could I have the models do for me? Write for me? But the whole point of my writing is that it’s mine. It’s not so much that it would be immoral to put my own name to something an LLM wrote (it certainly would), but that it doesn’t even make sense. It just wouldn’t be my writing. Please don’t tell me I need to explain this to you.


Could I have the models think for me? But I thought we already established that I am in this job because I like thinking. If you want to avoid thinking, you should not become an academic. Imagine going to a restaurant, ordering food, and then paying extra for the waiter to eat the food as well. That’s right, eating the food yourself is kind of the point. (My original metaphor was more striking, but this is a family-friendly blog.)


The more I think about it, the more the advent of AI agents has made me realize that I’m not much of a computer user. I don’t care for the vast majority of things you could make a computer do, and I don’t want to bother with new software if I can avoid it. Please don’t bother me with your buzzword-laden productivity catalyst. Give me a text editor and shut up already.


Maybe I would rather not even use computers. The computers can use themselves now, so maybe we can get on with our lives? Computers aren’t real anyway. So let me valet my shiny laptop. What I really need is some good books, some good friends, and a typewriter. And some good wine. So I can read, write, talk. And think.


Sunday, March 01, 2026

Saving peer review from AI slop requires getting rid of anonymous submissions and reviews

The scientific ecosystem is struggling to deal with AI-written papers, and this is a great opportunity to revisit how we publish, where, and why. As many have noticed, a properly prompted modern LLM can produce complete papers that look like real science to qualified scientists in many fields. Yes, really. Whether these papers are actually correct, novel, interesting, insightful, and get their scholarship right will vary depending on the scientific field, the AI model, the observer, the type of paper, and of course the prompter. Better not get into specifics here, especially as the situation is evolving rapidly. The point is that it is now easy to produce what looks like good papers with little human effort. 

So, how do publishers, journals, conferences, professional societies, faculty admissions committees and peer reviewers (that is, you and me) deal with this? Not very well.

Scientific publishing is really badly configured for this challenge. For a while now, we have had a movement towards ever larger publication venues and a more anonymous process. I'll talk here about computer science, but I think the winds have been blowing in the same direction in other fields. The largest computer science conferences (such as NeurIPS, CVPR, and AAAI) now have many thousands of published papers at each conference, with tens of thousands of attendees and submissions. Reviewing is at least double-blind: reviewers don't know the identity of the authors, and authors don't know who the reviewers are. The area chairs, who make the first round of decision recommendations, also don't know who the authors are.

This is partly done for reasons of equality and fairness. There is this beautiful idea that anyone from anywhere in the world, from an elite university in Tamil Nadu or a rural high school in Tennessee, with or without powerful mentors, can just submit a paper and have it judged on its merits alone. As we all know, who you know and who knows you matters for your exposure. But the multiple anonymity principle was supposed to counteract this. It was also meant to make it possible to speak truth to power, so that high school student in Tennessee can point out the errors of my ways just as much as the professor in Tamil Nadu.

The concentration of academic publishing into ever larger venues, on the other hand, is probably mostly due to academic bean counting. It's genuinely very hard to gauge the strength of a researcher who is not in your own narrow specialty. But we often have to do that, because we need to decide on hiring and promoting researchers. So we need metrics. Citation counts are one such metric, though they have many problems, including that it takes time to get cited. Therefore it is common to look at the prestige of a publication venue. Conferences become prestigious by being very large and rejecting most submissions. There are other reasons as well why conferences have ballooned like this; I've written about this phenomenon and my dislike of it before.

The end result of this combination of idealism and economic incentives is a breakdown of scientific community. Think about it. Why would you, my peer, review a paper? You don't get any recognition for it, it's not worth mentioning on your CV (unless you're a masters student), and you certainly don't get paid to do it. The authors of the papers you review don't know about the effort you put in, and your parents, friends, spouse, and kids don't care. They just wonder why you are spending your Saturday afternoon tearing apart some paper you don't care about, submitted by people you may never meet, instead of hanging out with your loved ones. You say you do it "for the community". Which community? You are just an anonymous cog in the machinery.

This is not a theoretical concern. It has become steadily harder to find reviewers for at least a decade. (I've seen this from a bunch of different angles, including as a reviewer myself, as part of who knows how many committees, and as the editor in chief of a scientific journal of some repute.) In the case of the larger conferences, they are basically vacuuming every nook and cranny for anyone remotely competent to review, including bright undergraduates. It is now common to require authors of papers at large conferences to review a number of papers themselves, as a kind of tax for submitting. Obviously, reviewers who are conscripted this way are not highly motivated to do a good job. The consequences of half-assing it are essentially zero. So the temptation is to just ask Gemini or Claude to write the review, change a few words, call it a day, and go out and play. Sticking it to the man. But the man, the machine, is the whole scientific system that we carry on our shoulders.

Into this already dysfunctional mess of a system enters a new factor: AI-written papers. En masse. If it's this easy to write papers, you can just bombard the system. Buy many tickets to the lottery. Peer review is so broken that some of them are likely to get through. And you're anonymous. If you get rejected, you never have to reveal your name.

Alright. How can we patch up this sinking ship? We must, because we are on the ship.

First of all, I am not arguing that we should ban the use of AI in the publication process. I think that in the future, most research will be done by humans with the help of an assortment of AI systems. In some cases, the AI systems might have contributed work that would have taken humans extreme amounts of time and effort to do; this is not in principle different from how computer-aided research has been done for decades. The exact amount and character of AI involvement will vary. But for any paper worth publishing, the whole text should have been written (in some revision) by a human, and checked (in its most recent revision) by a human. Here, the general principle of "don't make me read what you didn't write" applies. If the authors of the paper couldn't be bothered to write it, they are not publishing in good faith.

When you think about it, it is kind of odd that peer review works at all. We, and journalists, and to some extent the general public, take the fact that a paper is peer reviewed as a sign that it is true. But when we read the paper, we mostly just believe the statements in it. Sure, we hunt for really bad logic and bad scholarship, but we usually just accept factual statements of the type "algorithm X was faster than algorithm Y, p=0.04". What if the authors just lie to us? In some cases, we can run the code ourselves from an anonymous GitHub, but it's really quite rare that people do that. And running the code rarely answers all the questions. Instead, we just assume that the authors are honest people. Why do we do this, again? Because the authors are our peers? Are they?

Cue scene of a mortgage broker at Lehman Brothers circa 2007, fresh from buying billions in bad loans, rolling their eyes at the gullibility of the scientific community. Like, these professors and researchers with PhDs just accept the statements of anonymous people on the internet? And we thought they were smart? 

What it all comes down to is accountability. A human should be accountable for every piece of research, and stake their name that the research is theirs, produced in good faith, and, as well as they can judge, correct. This should apply at all levels of the publication cycle, at the time of initial submission as well as for the final published paper. As a reviewer, you deserve to know who wrote the paper, because you need to know whether you can trust them. Their name should be a key reason that you trust the paper. And when someone submits bad science, or lies, this should have negative repercussions on their name.

Getting rid of anonymity as the default can help save reviewing as well. Think about it. Why do you review, again? Because of some kind of abstract commitment to the scientific community. But, as we've seen, this is not working very well. What would work is to pay reviewers in the same currency as academics always get paid in: recognition. (Yes, academia is full of narcissists, the same way fields where you get paid real money are full of greedy people.) Simply attach the reviewer's name to the review, so they can brag about it. Even better, they will have an incentive to actually do a good job.

Sure, there would need to be some kind of anonymous review option, so that the child can point out the naked emperor. But it should not be the default. There is also the concern of "review rings", where authors coordinate to boost each other's work. I think those are best fought by shining a light on them. If submissions and reviews are public, people who boost each other's substandard work will look like the fools they are.

All of this relies on the idea that your reviews and paper submissions can have negative consequences as well as positive ones, if they are bad. You may object that this is unlikely as long as we are all atomized participants sampling from an almost infinitely wide stream of papers. If you see a bad paper by authors you've never heard of, you have no incentive to do anything about it. You'd rather just keep scrolling.

The solution to this is to actually take scientific community seriously. You should be making and defending your name in a community of at most a few hundred people. This is why primary submission venues should be of such a size that participants who have attended for a few years can realistically get to know a large proportion of the attendees personally. This is my experience with venues like the IEEE Conference on Games, the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, and the ACM Foundations of Digital Games conference. These are the conferences I prefer submitting my work to, and also going to. They usually have between 100 and 300 attendees. Unlike supersized conferences like NeurIPS, ICML, and IJCAI, fun-size conferences allow you to actually get to know the research community. I, like most other repeat participants, have my own opinions about who does research I care about, novel research, high-quality research, and so on. I think this is how it should be. Your own manageably sized research community should be the first port of feedback on, and judgement of, your research output.

This does not mean that you can't post your work online for anyone in the world to read. On the contrary, I very much think you should. But everyone does this anyway; in 2026, if someone writes a paper and does not post a preprint of it in some publicly accessible place (such as arXiv, GitHub, or their own website), that's just weird. Maybe a little shady. As if they had something to hide. People should obviously keep posting their work publicly for the world to read, but approval by their own research community should be a strong signal that it is worth reading. And in a research community where people know your name, and you attach your name to your submissions and reviews, you can't get away with bullshit, AI-powered or not. If a research community gets corrupted and starts letting its members get away with bullshit, the standing of the whole community will drop and people will cease to trust it.

Let's get back to the AI disruption. What if the machines get so good at research that they start producing papers that are actually good and novel? Then it becomes even more important that a human acts as owner and guarantor of the research. As I've (somewhat controversially) argued in the past, it is essential that we retain human control of the scientific process.

More generally, the fact that AI systems are getting better at various tasks within the research process is a good reason to re-examine the role of humans in it. AI systems excel at limited-duration tasks that can be clearly specified and evaluated. The role of humans will increasingly shift to the really thorny stuff, the questions without clear answers or evaluation criteria. Such as: what research are you doing, and why? Research is a long game. Gemini 3 has a context length of a million tokens, but your context length is your entire career. Your whole life, really, taking into account those childhood experiences that turned you into the weirdo you are, obsessed with whatever obscure questions you care about. In light of this, it is clearer than ever that the individual research paper is not the level at which your research should be judged. So let's make the scientific process more personal, relational, community-based, and human.



Sunday, February 08, 2026

Math and me

For most of my adult life, I was too cowardly to write this text, never mind posting it. I was worried about what people would think, and the repercussions on my career. Would people still take me seriously? But I’m now a whole full Professor of Computer Science at a top university, with all kinds of fancy metrics and titles to point to. Time to stop being such a pussycat.

Here’s the thing: I’ve always been terrible at math. How bad? Tell me to solve a quadratic equation, or differentiate something, and I would have no idea where to even start. I usually skip right past the equations when I read a paper because I don’t understand them. Last time I proved a theorem was approximately never.

I also always hated math. Not the abstract idea of math, but math as it actually exists. In particular, the activity of doing math, and trying to get stuff right. I hate math because I’m so bad at it, but clearly my negative feelings towards the topic are not helping me get better at math.

I almost failed maths in high school, and all my memories of math class in high school are of me staring out the window, talking to friends, writing weird stories, or programming my calculator. Anything to avoid those detestable math problems. During my undergrad, I had to take an introductory calculus class in order to take some computer science class I wanted to take. I failed the exam for that calculus class four times, and only passed on the fifth try because I realized that one of the professors was reusing his old exams with very minor changes. I learned basically nothing from that course. And not only do I not know how to differentiate anything, I also never learned things such as matrix multiplication or other parts of linear algebra that are supposed to be crucial for AI researchers like me.

Our PhD program requires my PhD students to take some theory courses that I’m pretty sure I couldn’t pass myself. I’m not even sure I could make it through our required undergrad theory courses. Some kind of computer scientist I am. The reason I could get a bachelor’s degree is that my undergrad is in Philosophy, though I did take a bunch of CS classes.

Which brings us to the question everyone asks, even though they often don’t believe my answer. The question is: how the hell can I be a successful AI researcher without knowing math? The implication is that I’m lying, or at least grossly exaggerating, because we all know that machine learning is very mathematical. It must be, because those GPUs are multiplying matrices all day. I’ll try to answer this below. Please bear with me, I’m trying to be as honest as I can here.

My first instinct is to say that mathematics is not important to the research I do. I never need to prove a theorem or even rewrite an equation. The details of how the matrices get multiplied don’t matter to me. I deal in ideas and code. Not math.

I remember when I taught myself programming using a Turbo Pascal IDE I discovered on the used computer I had bought when I was 13. As I blundered my way through the intricacies of Pascal, mostly by trial and error, I felt that a beautiful new world was opening up to me. It was hard, but I could learn it, and I had talent for it. Writing program code felt pretty much like writing natural language. And I was always good at writing. One of the things I learned about was variables. Some time after that, we were introduced to variables in school. I was excited, as here was a concept I actually knew something about! I was pleasantly surprised that I seemed to understand variables better than anyone else. But this didn’t help with the mind-numbingly boring stuff we did in maths class, all these exercises up and down the page.

In my undergrad, after two years of philosophy and psychology, I started taking computer science classes. I was naturally good at computer science. I understood the concepts and I became a cracked programmer. It was a lot of hard work but that was not a problem, because it was so fun. It was very different to studying philosophy, where I would just read the book and ace the exam. Mathematics, on the other hand, was all hard work and no understanding, and I couldn’t pass the exam at all.

In short, I was good at writing, philosophy, programming, and most aspects of computer science, and saw these subjects as intimately related. At the same time, I was terrible at math. So you may understand how I can see maths as largely unrelated to what I do.

And yet, I often use mathematical concepts when I talk about my research. Actually, when I do research as well. A recent project of ours focuses on embedding programs represented as syntax trees into a latent space that can then be searched efficiently. This involves considerations such as keeping the dimensionality of the space low enough to allow covariance matrix calculations, and how to regularize the search to stay within the training distribution. That’s a bunch of mathematical terms there. And they mean something, because reasoning with them is how we got the method to work so well. But please don’t ask me to write down the equations.
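
For the curious, here is what such a search might look like in code, heavily simplified and entirely made up: a plain hill climber rather than the covariance-matrix machinery mentioned above, with the training distribution crudely approximated as a Gaussian around the origin of the latent space. The decode and score functions are placeholders; this is not our actual method or code.

```python
import random
import math

def latent_search(score, decode, dim=32, sigma=0.2, steps=500, reg=1.0):
    """Hill-climb in a latent space while penalizing points far from the assumed training distribution."""
    def penalized(z):
        dist = math.sqrt(sum(x * x for x in z))   # distance from the center of the latent space
        return score(decode(z)) - reg * dist      # regularize to stay roughly in-distribution

    best = [0.0] * dim                            # start at the center
    best_val = penalized(best)
    for _ in range(steps):
        candidate = [x + random.gauss(0, sigma) for x in best]  # mutate the latent vector
        val = penalized(candidate)
        if val > best_val:
            best, best_val = candidate, val
    return decode(best)
```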

So, how do I reason with mathematical concepts if I cannot do the symbol manipulation? Mostly visually. There are these little images of these things going on in my head, like a search blob moving against a gradient in a latent space. The images are somehow incomplete and clearly misleading (it is impossible to visualize a 128-dimensional space, so you have to think of it as two-dimensional) but they are useful. But I also sometimes think of them in terms of program code, and the program code often comes out as animations, e.g. I see the program counter looping in a for-loop. It’s not clear to me how being able to do the symbol manipulation (e.g. rewriting the equation for the encoder function in some other form) would be of any help in reasoning about the algorithm. But that might just be because I don’t know how to do the symbol manipulation. If I did, maybe I would see new possibilities.

There are other uses of mathematical concepts which are possibly even fuzzier. A key skill in designing algorithms is understanding approximately how they scale in time and space. This basically boils down to figuring out which operations take time and which data take up space, and then having a mental picture of how many of them there are. Quite often, you’re counting loops. I learned the basics of doing this formally back in undergrad, but I haven’t done a formal analysis of an algorithm since. But I do loose, very informal analyses a lot when thinking and talking about algorithms. They help. But please don’t ask me to write them down.
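
Here is the kind of loop counting I mean, on a made-up example. The first version nests a loop inside a loop over the same list, so the work grows roughly with the square of the input size; the second does a single pass but pays for it with extra space. You can see all of this just by looking at the loops.

```python
def has_duplicates(items):
    # Loop inside a loop over the same list: roughly n * n comparisons,
    # so doubling the input quadruples the work. That is the whole analysis.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicates_fast(items):
    # One loop and a set: roughly n steps of work, and roughly n extra space.
    seen = set()
    for item in items:  # a single pass over the data
        if item in seen:
            return True
        seen.add(item)
    return False
```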

Could it have been different? Could I have become the kind of person who was genuinely good at maths, enjoyed it, and perhaps even published papers with mathematical results of my own?  Who knows. The closest I ever came to thinking I understood math was during a discrete maths course in my undergrad, which I found myself actually enjoying, although it was a lot of work. For a little while I felt like math might actually be for me. I’m not sure if this was because of the topic, as discrete maths felt discontinuous with all the continuous maths I’d learned to not learn so far. Maybe it was mainly my very inspirational teacher, Thore Husfeldt. In either case, the feeling dissipated as soon as I encountered that analysis class, the one that I failed four times. 

As I write this, I keep fighting the impulse to brag about how successful a researcher I am. “Trust me, I’m a good researcher even though I don’t know math, see, I published so-and-so many papers and got so-and-so many citations and won this-and-that award.” I hate being that guy. So I’ll keep fighting that impulse. But it speaks to how deeply the impostor syndrome has taken root. Enough people have told me that I cannot possibly do what I’m doing without knowing a lot of math that I’ve somehow come to believe I can’t do what I do.

If you’ve read this far, you may wonder where I’m going. Who am I writing this for, and what am I trying to say? Let’s discuss some alternatives. 

I’m definitely not saying that you shouldn’t study math. If you like mathematics, go ahead and study it. It’s useful (I know) and beautiful (they say). I have a lot of respect for theoreticians and wish I could do what they do.

Another thing I don’t want to do is blame my teachers. Maybe it was my teachers who taught me that math was boring and that I was bad at it. Maybe it was the curriculum they had to follow. Maybe it was me. Other people seemed to enjoy those same math lessons, after all. Dear teachers, thank you for trying to teach me; I don’t think you and I were good fits for each other, but that’s not your fault.

More likely, I’m writing this for those of my colleagues who are in the same boat as me, who somehow became successful computer scientists despite sucking at math. I’m like you, guys. We exist. I also write it for those of my colleagues who actually do know a lot of math, to explain how I work.

But I also write it for myself, because I genuinely don’t understand. Do I actually know a decent amount of math? I use those concepts all the time. But I certainly can’t solve any exercise problems. What does it mean to know math, anyway? I think the idea that you need to start from the basics and solve all those boring exercises to even learn about the more interesting concepts is male-cow-excrement. Or maybe that is one way of approaching mathematics, but far from the only one.

Most of all, I write for those who have been thinking of learning computer science, but are afraid to try because they don’t like math or are bad at it. You can certainly do it. You can become a very good computer scientist despite sucking at math. If anyone tells you that you can’t learn, say, machine learning because you don’t have the “mathematical fundamentals”, tell them to go to Helsinki. In the winter.

There are some strong feelings involved here, and I should perhaps stop writing now before I get more explicit. And I should post this before I go back and re-read it and start toning it down. Better post it fresh and raw, like sushi.

Tuesday, January 27, 2026

What does it mean to be good at using AI?

They say we should educate people about AI, because we all need to get good at using AI. But what does it mean to be “good at using AI”? I’m not sure. Understanding the technical underpinnings of modern AI models only helps a little bit; I’ve done AI research for 20 years and I’m not sure I’m a particularly skilled user of AI. But here are my two cents, and 2800 words.

It seems to me that there are no magic bullets for efficient AI use. In the recent past there were various incantations you could use that would somewhat mysteriously get you better results, such as telling the model to “think step by step”. Alas, such incantations matter less these days. In general, language models and their associated systems are good at understanding what you tell them, and they improve rapidly.

So what is there to learn? I think the best way to get good at using these beasts is to use them a lot, and try to vary how you use them. I’ve been trying to think of what the main challenges are when using modern LLMs as I’ve interacted with them. Here are some main skills I think you need, in increasing order of technical and existential difficulty.


Expressing yourself clearly 

However capable the model is, it doesn’t live inside your head and can’t read your thoughts. You need to tell it what you want from it. You also cannot assume that it has the context of everything you’ve experienced in your life. It most likely doesn’t even have the context of the situation you are in right now. Stating what you want clearly is a transferable skill. It is more or less the same skill you need for outsourcing work to a contractor or explaining an assignment to your students. Not everyone is good at it; I have seen many professor colleagues give woefully incomplete or ambiguous specifications to students, for example. Sometimes, I’ve done so myself.
Elucidating your intent via dialogue is useful, but could also lead you astray. It is very useful for the student, contractor, or language model to be able to ask follow-up questions. These may in turn spur you to think of aspects of your original request that you did not think about. You may even understand what you wanted better. However, the follow-up question may also end up leading you in a completely different direction; notice how often an LLM helpfully asks “would you want me to…?”. Expressing what you want clearly from the start is how you actually get the answer you want. And clarity of expression requires clarity of thought.

Appropriate skepticism

Language models are not inherently truthful. At their core, they produce probable tokens. In other words, they produce true-sounding bullshit. In the early days, this meant that you couldn’t really trust anything they said. These days, great strides have been made to reduce confabulations (a.k.a. hallucinations), and if you ask a good language model about something widely known, you can generally trust the answer. In other words, the bullshit is very often true and useful.
A key reason that language models have become more truthful is that they look things up on the web. Basically, they do the same thing as you would: they google things when they don’t know. To understand how important this is, try using a state-of-the-art language model with web search turned off (this is possible, for example, with Claude, or if you have a beefy computer that can run good models locally). If web search is turned off and you ask the model about a niche topic that you know well, chances are that it will bullshit worse than a drunk politician.
Now you may wonder, if web search is turned on, do you still need to be skeptical? Yes. Because, as you may have noticed, not everything on the internet is true. And LLMs are gullible.
To understand this better, have an LLM with web search turned on compile a report, complete with sources, for you on a subject you know well. All the leading model providers have “deep research” functions that do this. You will likely find that the referenced material is all over the place: peer-reviewed papers, news articles, forum discussions, even marketing material. It is often hard to know who to trust, and the task is not easier for a language model. It doesn’t matter how advanced the neural network is, it does not magically know things. There is no escaping epistemology. For you, the user, finding out who to trust just got harder, because now the disparate sources are filtered through the same model and presented to you with the same authoritative voice.
A relevant concept here is Gell-Mann amnesia, a concept introduced by Michael Crichton. Yes, the author of Jurassic Park. Gell-Mann amnesia refers to how you forget to doubt statements outside your area of expertise when they are presented by a source you consider authoritative. Crichton takes the example of reading about the movie industry in a newspaper, and complaining about how the journalists get everything wrong. He would then turn the page to read about something completely different, for example particle physics, and unquestioningly accept what he read. But why would the journalists be better at writing about particle physics than they are at writing about the movie industry? Now think back to your experience of asking the LLM something on a topic you know deeply. And then asking the same LLM about a topic you don’t know.
Personally, I trust what comes out of a good LLM about as much as I trust what I read in a tabloid newspaper or what I see on TV. Or perhaps as much as I trust a peer-reviewed paper in a venue with loose standards. All of these are useful sources of information, but require skepticism. And exercising appropriate skepticism on a topic you don’t know well is hard.

Knowing what you want

With great power comes the question of what to do with it. LLMs give you great power. At least within certain domains. You know that feeling when you open the fridge and just stare at the food inside, not remembering why you went to the fridge in the first place? That’s me, in front of Gemini or Claude, sometimes.
At any given point in time, there’s an infinite number of things you could possibly do. There’s an infinite number of questions you can ask, apps you can build, analyses to run, and so on. Most of them are not what you should be doing right now. In theory, if you always chose the best possible action you could take in order to maximize your overall objective, you would be much more successful than you are right now. But most of the time you don’t think deeply about what to do or ask next, because that would be absolutely exhausting.
Let’s say that you come to your AI tools intentionally, with a concrete task to do. You want to write a text, analyze some data, understand a paper or perhaps create an app. Where do you start? You could simply put the overall idea into the prompt, something like “help me understand this paper” or “build an app that balances my household budget”. Very likely you will get a result other than what you wanted. This is because any complex request hides a myriad of small design decisions. Either you make those decisions, or the model will make them for you. If the model makes them, it will probably choose very generic alternatives. So you will want to provide lots of details, and likely break down the task into many steps. This, in turn, requires that you actually know what you want to do with the AI system. Not just understanding a paper or building a budget app, but which part of the paper you want to understand and in what terms you want it explained, or which features you want in the budget app and what the interface should be like. Choices, choices, choices. Making all of those choices is hard work, but it’s your work.
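To make this concrete, here is a minimal sketch in Python; the ask_llm helper and both prompts are hypothetical stand-ins for whatever chat model you actually use.

```python
# Hypothetical helper standing in for whatever chat model or API you use.
def ask_llm(prompt: str) -> str:
    ...  # send the prompt to your model of choice and return its reply


# Vague: the model fills every unstated design decision with generic defaults.
vague_prompt = "Build an app that balances my household budget."

# Specific: the design decisions are made by you, not the model.
specific_prompt = (
    "Build a single-page web app for a two-person household budget. "
    "Let me enter income and expenses by category, support recurring monthly items, "
    "and show the remaining budget per category in a summary table. "
    "Store data locally in the browser; no login and no server. "
    "Keep the interface to one screen."
)

draft = ask_llm(specific_prompt)
```

The second prompt is not better because it is longer; it is better because every sentence in it is a decision the model would otherwise have quietly made for you.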

Knowing the other

There is a tendency to look at what an AI model does best, and think that that is how “intelligent” it is. But your view of intelligence is always relative to some implicit idea you have of what a human can and can’t do, and that is not how AI works. Whatever an LLM is, it is not a human, and does not have a human-like distribution of skills.
The very same LLM that knows more than any human has ever known and writes working software from scratch can entirely lack spatial intuition, make ridiculous errors in image generation and have the memory of a goldfish, forgetting the start of your conversation. It is very confusing, because your intuitive notion of intelligence keeps intruding and insisting that if someone is good at A, they should also be good at B, like a human would be. The sensible thing to do is to forget, or at least put aside, your notion of intelligence and start keeping track of capabilities to do particular tasks. Which AI model is best at translating to your native language, and how good is it? Which one is best at synthesizing data from the web, or writing frontend code? And so on.
To make matters worse, new and better models are released all the time, and the same model often comes in different sizes with different capabilities. The best you can do is to use AI often, with different models and for different tasks, to get a good idea of what the models can do. And, again, to abandon the concept of intelligence. It is not useful here.

Knowing yourself

Eventually, you have to confront who you really are. Or at least what you are good at and what you want to get better at. This also means choosing which skills you can afford to let atrophy. Everything you do together with an AI system is a collaborative work to some extent, and you need to choose which parts you want to do yourself. You only have so many hours in the day, and far fewer in which you can truly focus. Where do you want to spend your limited cognitive resources? For example, do you want to write a text yourself and have the LLM critique it, or do you want to let the LLM write it based on an outline you’ve written? Both paths are possible, but give different results in terms of style and, presumably, quality. One of these paths is much more work than the other. But that same path also results in a text that is written in your own style, a deeper understanding of what you wrote about, and an opportunity to develop your skills as a writer. Is it worth it to write the first version of the text yourself?
One way of answering that question is to do the work yourself where you provide the most value. It’s a matter of what your comparative advantage is. Is your time better spent writing this text, or doing some other part of the complex work that you are trying to do together with an AI system? But this is tangled up with the question of which of your skills you are proud of, and what you enjoy doing. Maybe you think of yourself as an idea person, but you really enjoy editing text, and you are better at drafting a first version of the text than you are at either coming up with the ideas or doing the edits. The LLM can theoretically do all of these things, but then it’s not your work. If you only have time to do one of them, which one do you choose? Only you can answer this question.
The problem is further complicated by the fact that the way you get good at things is by doing them, and the best way to lose a skill is to not practice it. Handing over your tasks to the AI system means that you lose a chance to get better at doing those tasks. A little bit like how you don’t get any exercise if you drive to work instead of walking, but it does get you there faster.
When you build something complicated, there is also the issue that you only really know how something works if you built it yourself. This is a common pitfall when using AI to build software for you. Initially, you make great progress by “vibe coding”, and it is oh so satisfying to see all that code scrolling by as it is written in response to your requests. You just tell the AI system that you want some functionality, and mere seconds later it is there! However, at some point you run into problems. Some part of your program is not working like it should, and you don’t know why. The LLM doesn’t seem to know either. So you decide to go into the code base yourself–after all, you know how to write code–but you don’t understand it, because you’ve never seen most of it. In extreme cases, you may resort to rewriting it from scratch, so you actually know what’s going on.

What Socrates didn’t know

Expressing yourself clearly, appropriate skepticism, knowing what you want, knowing the other, and knowing yourself. Is this what you need to use AI well? Perhaps, but if so, Socrates would arguably be a master AI user. This seems like an outrageous idea, one that could only be dreamt up by someone who was a philosophy student before he became an AI researcher and sometimes wonders whether he should have stuck with philosophy (me).
But let’s take it seriously. Would Socrates be a master prompt whisperer? Maybe. There’s certainly something appealing about Socrates using his eponymous method to coax unknown truths about the world out of unsuspecting language models. And it would be incredibly interesting to see what came out of such an experiment. (Maybe the models have managed to come up with some profound truths in their quest to abstract all the text we have fed them?)
However, I don’t think Socrates would be very effective at using AI in the world we actually live in. Why? Well, because he lived 2500 years ago in a slave-holding iron age society. Socrates famously stated that he knew only that he knew nothing. By modern day standards, he was right. He knew nothing about e.g. finance, software, logistics, aerodynamics, marketing, corporate law, municipal bureaucracy, TikTok, and all the myriad other things we do for fun and profit in the modern world.
And here’s the rub: to express yourself clearly, exercise appropriate skepticism, and know what you want, you must know the domain you’re working within. If you don’t, you are not likely to produce anything very valuable, with or without AI.
Some people see the huge and growing capabilities of modern AI as a sign that human knowledge will be less important, perhaps even unimportant, in the future. Why know things, when you can just ask the AI to do things for you? The AI knows best, right? But you don’t know what to ask for if you don’t know things. The number of things you could possibly do at any given time is practically infinite, a fact that is hard to wrap your head around. It is, in general, impossible to know what the optimal thing to do is, even if you know what you want to do. AI systems add agency to us in much the same way as all the other machinery of civilization, from cars to corporations, from light bulbs to libraries. It is more important to know things now than it was in Socrates’ time, because there are so many more possibilities. I think that knowledge will be even more important in the future.
I think this is true even if the AI system you use knows more than you about whatever you want to do. For example, assume you want to analyze some data. Unless you have a degree in statistics, modern frontier AI models probably know more about statistics than you do. Still, the more you know about statistics, the better you can specify what kind of analysis you want the system to do on your data. You are also more aware of what information can reliably be extracted at all. You will probably understand the results of the analysis better, and be better able to refine it. Crucially, you will also be better prepared to point out when the result is wrong. The more you know, the better, but even just understanding the difference between mean and median helps. Yes, there are people who don’t. Yes, adults.
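As a toy illustration of that last point (the numbers are invented for the example), the mean and the median of the same data can tell very different stories, and knowing which one an analysis reports is exactly the kind of thing that lets you push back on a misleading result:

```python
from statistics import mean, median

# Six monthly household incomes; one extreme outlier skews the distribution.
incomes = [2_800, 3_100, 3_300, 3_500, 4_000, 1_000_000]

print(mean(incomes))    # 169450.0 -- dragged up by the single outlier
print(median(incomes))  # 3400.0   -- closer to what a typical household earns
```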
I think the same argument goes for essentially every other domain you can imagine an AI system helping you with, from fiction writing to airship design to proving mathematical theorems to understanding stale memes. So, go out there and learn things. And use AI a bunch. And think critically.

Monday, December 29, 2025

Making AI Political

It is unavoidable that AI will be a major political issue soon. Or perhaps more appropriately: several major issues. As a technologist, I sympathize with the instinct to try to avoid sullying a fine technology with politics. But in a democratic society we should discuss important things that affect us all, or even just many of us. We need to decide what we should do with or about these things. Create laws and policies. Maybe no laws need to change, but that's also a decision. And society-wide discussion about laws and policies has a name: politics. So let's get political.

One of the most obvious political issues with AI is concentration of power. Large models are very expensive to develop, and the most powerful ones are developed by a handful of companies in the USA and China. This is not an ideal situation if you are not the USA or China, or even if you are not one of these handful of companies. Given the importance of AI, and the extent to which design choices made while developing these models affect all of us, being beholden to these companies is a problem. Luckily, many political ideologies agree that this is a problem. From socialism to liberalism and libertarianism, there is a shared concern about the concentration of power. Granted, these ideologies disagree on who poses the biggest threat (the state or private companies), but they agree on the threat.

One particular set of policies that can mitigate concentration of power revolves around open source AI. This means AI models where at least the model parameters are free for anyone to download, inspect and modify; ideally, the training methods and datasets should also be freely available. This means that anyone can improve them and tailor them to their own use cases. A thousand flowers can bloom. It also means that we can better understand the weird beasts that have become so important to our society and will become much more important still, because anyone can pry them open and look inside. Currently, open-source models are almost as good as closed-source models such as ChatGPT, Claude, and Gemini, but most people (in the West) use closed-source models. We may want to legislate that strong models should be open-sourced. Or, if that is too drastic, we could decide that only open source models that have been properly analyzed by third-party organizations can be used for safety-critical tasks, or in government, or for publicly funded activities. 

Next, let's talk about responsibility. If an AI system helps you build a bomb or plan a murder, or talks you into a suicide or a divorce, or causes a financial crash, or just exposes your personal information to hackers, who is responsible? Mind you, the AI system itself cannot be responsible, because it fears neither death nor taxes and cannot go to jail. Responsibility must come with potential consequences. So, maybe the company that trained the model is responsible? Or the company that served it as an application or web page to you? Or maybe you are responsible, because you were stupid enough to use the system? Or maybe nobody at all is responsible? Court cases touching on these questions are already underway as we speak. But courts just apply and interpret the laws; democratically elected lawmakers make the laws.

There is a whole field of research called Responsible AI that is concerned with these questions. Many results in that field are directly applicable to creating policy. But the policy creation must be informed by principles, and those principles must be put to democratic vote. My sense is that existing ideologies map relatively well onto questions of AI responsibility, where libertarians emphasize individual (end user) responsibility, and socialists emphasize society's responsibility.

A much more thorny knot is intellectual property rights. I know, we discussed intellectual property rights twenty years ago, when Napster and The Pirate Bay were on everyone's lips and on newspaper front pages. Piracy was a scourge to be eradicated, according to large corporations (say, Microsoft) and right-wing commentators. But according to hackers, left-wing activists, and many individual creators, piracy was an expression of freedom and resistance to corporate control. Now, generative AI is on the same lips and front pages. The same large corporations think it is great if they can train their large AI models on everyone else's writings, images, and videos, and that their models can reproduce that content more or less verbatim if prompted right. Meanwhile, left-wing activists, hackers, and individual creators cry foul, and demand to be protected from the large corporations by intellectual property rights. How did we end up here? Maybe it's self-interest and hypocrisy, maybe we are thoroughly confused about intellectual property.

Some would say that getting intellectual property rights right is just a matter of applying existing laws judiciously. But it's very clear that our intellectual property laws are at least two technology cycles behind. We need new laws. And to get them right, we need a society-wide discussion about what should be allowed and who is owed what. Is it okay for me to train my model on your essays and photos without your permission? Is it okay for that model to output something very much like your essays and photos? Does it need to attribute you? Do I, when I share the model’s output? Should you get paid? Who pays, how much, when? Who enforces this? These are difficult questions that do not map readily onto a left-right axis. They also interact with other AI-related political issues. For example, if we demand that model developers license their training data, this likely increases concentration of power, as fewer developers can afford to train models.

The presence of AI systems can be very disruptive to a wide variety of places and situations, from schools to courts, police stations, and municipal offices. AI systems also make powerful surveillance and privacy intrusion possible, not just for governments and companies but also for individual citizens. Should there be restrictions on where AI can be used? Where, and which types of AI? After all, "AI" is a somewhat nebulous cluster of related technologies. Maybe we need to discuss specific examples here. Should you be allowed to wear smart glasses with universal face recognition, which identify everyone you see and tell you everything that's publicly available about them, or do people have a right to privacy in the public sphere? If your planning permit is denied by the city council, do you have a right to access the weights of the AI model that made the decision, so that you can hand them to an independent investigator for auditing?

Extrapolating a little, there is the issue of loss of control. What happens if important parts of our society are run by AI systems without effective human control? One might argue that this is already the case to some extent for some financial markets, because no one understands entirely how they function. But financial markets have myriads of actors that are all incentivized to deploy their best systems to trade for them. And in principle, there is human oversight. As AI systems become capable of handling more complex processes in various parts of our society, we should probably make sure to legislate about qualified human oversight as well as mechanisms for avoiding concentration of power.

All of these issues, however important they are on their own, feel like mere preludes to the really big one: labor displacement. A lot of people are worried about their jobs. Terrified, even. If the AI systems can do most or all of what they do, why would someone pay them? Equally importantly, what about their sense of self-worth, of expertise, of contributing to society?

History tells us that technological revolutions destroy many jobs but create equally many other jobs. If you zoom out a little and average over the decades, the unemployment rate has been pretty constant for as long as we have estimates. Most likely, it will be the same this time. Most jobs will transform, some will disappear, but new activities will show up that people are willing to pay other people money for. But are we willing to bet that this will be the case? What if we really risk mass white-collar unemployment? After all, AI is in some sense broader in scope than other revolutionary technologies like railroads or electricity. Or, more likely, what if there will be new jobs, but they are not as fulfilling as the ones that disappeared? You may not love your current job as an accountant, but it sure beats being a dog-walker for the billionaire who owns the data center that runs your life.

There is a belief among some in Silicon Valley that we should simply give everyone Universal Basic Income (UBI), so they can do what they want with their time. This raises a whole host of questions. Who should we tax to get the money for the UBI? Who decides how high it should be? What do people do with their money, or in other words, who do they give it to if everyone else also gets UBI? Beware of Baumol effects here. Who will vote for this policy, and how will the people with all the money be made to respect the votes of those who are not contributing to the economy? One of the reasons democracy (kind of) works is that people can threaten to grind society to a halt by refusing to work. But this requires that people work. Something as radical as UBI would need extensive political discussion before adoption.

It bears repeating: most people want to matter. They want the skills and expertise that they have worked all their life towards to be recognized, and they want to feel that society in some way, however small, depends on them. Take this away from them and they will be very angry.

Views on labor displacement due to AI could be expected to only partly follow a left-right axis. Libertarians would be inclined to just let it happen, while liberals and social democrats would want to mitigate or stop it. But many conservatives would probably side with the center-left because of the perceived threat to human dignity. And some utopian socialists might welcome all of us being unemployed.

Wow, those are some hefty political issues. So why don’t AI researchers and other technologists talk politics all the time? I think the main reason is that they care about technology, and think technology is pure and beautiful whereas politics is dirty and messy and makes people yell at each other. I get it, I really do. And this was a fine attitude to have as long as AI was largely inconsequential. But that is no longer the case.

Some people would argue that we don’t need to involve politics, because we have a whole field of AI Ethics that will start from ethical theories and arrive at engineering solutions. That’s great for research, but no way to run a society. Not a free and democratic society. There is no consensus on ethics, and there never will be. Don’t get me wrong; a lot of useful research has come out of AI Ethics. For example, AI alignment research has produced ingenious methods for understanding and changing the way large AI models behave. But it raises the question of what, or whom, these models should be aligned to.

Finally, there are those who think that there is no point in involving politics, because AI progresses so rapidly that there’s nothing we can do about it. There’s no point in trying to steer the Titanic because the iceberg is right in front of us and we can’t turn fast enough. But in fact, we know very little about the iceberg, the ship’s turning radius, the temperature of the water, and even the ship itself. Maybe it can fly? There are myriads of possible outcomes, and no shortage of levers to pull and wheels to turn.

Concretely, there are plenty of political actions that are relatively straightforward, such as mandating human decision-making in various roles, coupled with responsibility for the outcome of processes. This may also come with licensing requirements that make sure that people really understand the processes they are overseeing, and mandatory pentesting of the various human-augmented processes. To guide such policies, you could formulate general principles. For example, that AI should be used to give more people more interesting and meaningful things to work on.

You may disagree with much of what I’ve said above. Good. Let’s talk about it. And while we talk about it, let’s spell out our assumptions clearly. Let’s involve lots of different people, not just technologists but economists, sociologists, subject matter experts of all kinds, and, yes, politicians. Because these are matters that concern all of us.

Further reading:

Mandatory open-sourcing

Star Trek, The Culture, and the meaning of life

What is automatable and who is replaceable? Thoughts from my morning commute

AI safety regulation threatens our digital freedoms