They say we should educate people about AI, because we all need to get good at using AI. But what does it mean to be “good at using AI”? I’m not sure. Understanding the technical underpinnings of modern AI models only helps a little bit; I’ve done AI research for 20 years and I’m not sure I’m a particularly skilled user of AI. But here are my two cents, and 2800 words.
It seems to me that there are no magic bullets for efficient AI use. In the recent past there were various incantations you could use that would somewhat mysteriously get you better results, such as telling the model to “think step by step”. Alas, such incantations matter less these days. In general, language models and their associated systems are good at understanding what you tell them, and they improve rapidly.
So what is there to learn? I think the best way to get good at using these beasts is to use them a lot, and to vary how you use them. As I’ve interacted with modern LLMs, I’ve tried to pin down the main challenges they pose. Here are the main skills I think you need, in increasing order of technical and existential difficulty.
Expressing yourself clearly
However capable the model is, it doesn’t live inside your head and can’t read your thoughts. You need to tell it what you want from it. Nor can you assume that it has the context of everything you’ve experienced in your life; it most likely doesn’t even have the context of the situation you are in right now. Stating what you want clearly is a transferable skill. It is more or less the same skill you need for outsourcing work to a contractor or explaining an assignment to your students. Not everyone is good at it; I have seen many professor colleagues give woefully incomplete or ambiguous specifications to students, for example. Sometimes, I’ve done so myself.
Elucidating your intent via dialogue is useful, but it can also lead you astray. It is very useful for the student, contractor, or language model to be able to ask follow-up questions. These may in turn spur you to think of aspects of your original request that you had not considered. You may even come to understand better what you actually wanted. However, a follow-up question may also end up leading you in a completely different direction; notice how often an LLM helpfully asks “would you want me to…?”. Expressing what you want clearly from the start is how you actually get the answer you want. And clarity of expression requires clarity of thought.
Appropriate skepticism
Language models are not inherently truthful. At their core, they produce probable tokens. In other words, they produce true-sounding bullshit. In the early days, this meant that you couldn’t really trust anything they said. These days, great strides have been made to reduce confabulations (a.k.a. hallucinations), and if you ask a good language model about something widely known, you can generally trust the answer. In other words, the bullshit is very often true and useful.
A key reason that language models have become more truthful is that they look things up on the web. Basically, they do the same thing you would: they google things they don’t know. To understand how important this is, try using a state-of-the-art language model with web search turned off (this is possible, for example, with Claude, or if you have a beefy computer that can run good models locally). If web search is turned off and you ask the model about a niche topic that you know well, chances are that it will bullshit worse than a drunk politician.
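If you want to try this offline experiment yourself, here is a minimal sketch using the Hugging Face transformers library. The model name is just an example of a small instruction-tuned model you can run locally, and the prompt is a placeholder for a niche topic you actually know well; a local model like this has no web access, so whatever it says comes purely from its training data.

```python
# A minimal sketch of quizzing a local model that has no web access.
# Assumes the `transformers` and `torch` packages are installed; the model
# name below is just one example of a small instruction-tuned model.
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")

# Substitute a niche topic you know well, then judge the answer yourself.
prompt = "Give a brief overview of <a niche topic you know well>."
result = generator(prompt, max_new_tokens=200, do_sample=False)
print(result[0]["generated_text"])  # the prompt plus the model's continuation
```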
Now you may wonder, if web search is turned on, do you still need to be skeptical? Yes. Because, as you may have noticed, not everything on the internet is true. And LLMs are gullible.
To understand this better, have an LLM with web search turned on compile a report for you, complete with sources, on a subject you know well. All the leading model providers have “deep research” functions that do this. You will likely find that the referenced material is all over the place: peer-reviewed papers, news articles, forum discussions, even marketing material. It is often hard to know who to trust, and the task is no easier for a language model. No matter how advanced the neural network is, it does not magically know things. There is no escaping epistemology. For you, the user, finding out who to trust just got harder, because the disparate sources are now filtered through the same model and presented to you in the same authoritative voice.
A relevant concept here is Gell-Mann amnesia, introduced by Michael Crichton. Yes, the author of Jurassic Park. Gell-Mann amnesia refers to how you forget to doubt statements outside your area of expertise when they come from a source you consider authoritative. Crichton gives the example of reading about the movie industry in a newspaper and complaining about how the journalists get everything wrong. He would then turn the page and read about something completely different, for example particle physics, and unquestioningly accept what he read. But why would the journalists be any better at writing about particle physics than they are at writing about the movie industry? Now think back to your experience of asking an LLM about a topic you know deeply. And then asking the same LLM about a topic you don’t.
Personally, I trust what comes out of a good LLM about as much as I trust what I read in a tabloid newspaper or what I see on TV. Or perhaps as much as I trust a peer-reviewed paper in a venue with loose standards. All of these are useful sources of information, but require skepticism. And exercising appropriate skepticism on a topic you don’t know well is hard.
Knowing what you want
With great power comes the question of what to do with it. LLMs give you great power. At least within certain domains. You know that feeling when you open the fridge and just stare at the food inside, not remembering why you went to the fridge in the first place? That’s me, in front of Gemini or Claude, sometimes.
At any given point in time, there’s an infinite number of things you could possibly do. There’s an infinite number of questions you can ask, apps you can build, analyses to run, and so on. Most of them are not what you should be doing right now. In theory, if you always chose the best possible action to maximize your overall objective, you would be much more successful than you are right now. But most of the time you don’t think deeply about what to do or ask next, because that would be absolutely exhausting.
Let’s say that you come to your AI tools intentionally, with a concrete task to do. You want to write a text, analyze some data, understand a paper, or perhaps create an app. Where do you start? You could simply put the overall idea into the prompt, something like “help me understand this paper” or “build an app that balances my household budget”. Very likely you will get a result other than what you wanted. This is because any complex request hides a myriad of small design decisions. Either you make those decisions, or the model makes them for you. If the model makes them, it will probably choose very generic alternatives. So you will want to provide lots of details, and likely break the task down into many steps. This, in turn, requires that you actually know what you want to do with the AI system. Not just understanding a paper or building a budget app, but which part of the paper you want to understand and in what terms you want it explained, or which features you want in the budget app and what the interface should be like. Choices, choices, choices. Making all of those choices is hard work, but it’s your work.
Knowing the other
There is a tendency to look at what an AI model does best and conclude that that is how “intelligent” it is. But your view of intelligence is always relative to some implicit idea of what a human can and can’t do, and that is not how AI works. Whatever an LLM is, it is not a human, and it does not have a human-like distribution of skills.
The very same LLM that knows more than any human has ever known and writes working software from scratch can entirely lack spatial intuition, make ridiculous errors in image generation and have the memory of a goldfish, forgetting the start of your conversation. It is very confusing, because your intuitive notion of intelligence keeps intruding and insisting that if someone is good at A, they should also be good at B, like a human would be. The sensible thing to do is to forget, or at least put aside, your notion of intelligence and start keeping track of capabilities to do particular tasks. Which AI model is best at translating to your native language, and how good is it? Which one is best at synthesizing data from the web, or writing frontend code? And so on.
To make matters worse, new and better models are released all the time, and the same model often comes in different sizes with different capabilities. The best you can do is to use AI often, with different models and for different tasks, to get a good idea of what the models can do. And, again, to abandon the concept of intelligence. It is not useful here.
Knowing yourself
Eventually, you have to confront who you really are. Or at least what you are good at and what you want to get better at. This also means choosing which skills you can afford to let atrophy. Everything you do together with an AI system is collaborative work to some extent, and you need to choose which parts you want to do yourself. You only have so many hours in the day, and far fewer in which you can truly focus. Where do you want to spend your limited cognitive resources? For example, do you want to write a text yourself and have the LLM critique it, or do you want to let the LLM write it based on an outline you’ve written? Both paths are possible, but they give different results in terms of style and, presumably, quality. One of these paths is much more work than the other. But that same path also results in a text written in your own style, a deeper understanding of what you wrote about, and an opportunity to develop your skills as a writer. Is it worth it to write the first version of the text yourself?
One way of answering that question is to do the work yourself where you provide the most value. It’s a matter of what your comparative advantage is. Is your time better spent writing this text, or doing some other part of the complex work that you are trying to do together with an AI system? But this is tangled up with the question of which of your skills you are proud of, and what you enjoy doing. Maybe you think of yourself as an idea person, but you really enjoy editing text, and you are better at drafting a first version of the text than you are at either coming up with the ideas or doing the edits. The LLM can theoretically do all of these things, but then it’s not your work. If you only have time to do one of them, which one do you choose? Only you can answer this question.
The problem is further complicated by the fact that the way you get good at things is by doing them, and the best way to lose a skill is to not practice it. Handing over your tasks to the AI system means that you lose a chance to get better at doing those tasks. A little bit like how you don’t get any exercise if you drive to work instead of walking, but it does get you there faster.
When you build something complicated, there is also the issue that you only really know how something works if you built it yourself. This is a common pitfall when using AI to build software for you. Initially, you make great progress by “vibe coding”, and it is oh so satisfying to see all that code scrolling by as it is written in response to your requests. You just tell the AI system that you want some functionality, and mere seconds later it is there! However, at some point you run into problems. Some part of your program is not working like it should, and you don’t know why. The LLM doesn’t seem to know either. So you decide to go into the code base yourself–after all, you know how to write code–but you don’t understand it, because you’ve never seen most of it. In extreme cases, you may resort to rewriting it from scratch, so you actually know what’s going on.
What Socrates didn’t know
Expressing yourself clearly, appropriate skepticism, knowing what you want, knowing the other, and knowing yourself. Is this what you need to use AI well? Perhaps, but if so, Socrates would arguably be a master AI user. This seems like an outrageous idea, one that could only be dreamt up by someone who was a philosophy student before he became an AI researcher and sometimes wonders whether he should have stuck with philosophy (me).
But let’s take it seriously. Would Socrates be a master prompt whisperer? Maybe. There’s certainly something appealing about Socrates using his eponymous method to coax unknown truths about the world out of unsuspecting language models. And it would be incredibly interesting to see what came out of such an experiment. (Maybe the models have managed to come up with some profound truths in their quest to abstract all the text we have fed them?)
However, I don’t think Socrates would be very effective at using AI in the world we actually live in. Why? Well, because he lived 2500 years ago in a slave-holding Iron Age society. Socrates famously stated that he knew only that he knew nothing. By modern-day standards, he was right. He knew nothing about, for example, finance, software, logistics, aerodynamics, marketing, corporate law, municipal bureaucracy, TikTok, or any of the myriad other things we do for fun and profit in the modern world.
And here’s the rub: to express yourself clearly, exercise appropriate skepticism, and know what you want, you must know the domain you’re working within. If you don’t, you are not likely to produce anything very valuable, with or without AI.
Some people see the huge and growing capabilities of modern AI as a sign that human knowledge will be less important, perhaps even unimportant, in the future. Why know things, when you can just ask the AI to do things for you? The AI knows best, right? But you don’t know what to ask for if you don’t know things. The number of things you could possibly do at any given time is practically infinite, a fact that is hard to wrap your head around. It is, in general, impossible to know what the optimal thing to do is, even if you know what you want to achieve. AI systems add agency to us in much the same way as all the other machinery of civilization, from cars to corporations, from light bulbs to libraries. It is more important to know things now than it was in Socrates’ time, because there are so many more possibilities. I think that knowledge will be even more important in the future.
I think this is true even if the AI system you use knows more than you do about whatever you want to accomplish. For example, assume you want to analyze some data. Unless you have a degree in statistics, modern frontier AI models probably know more about statistics than you do. Still, the more you know about statistics, the better you can specify what kind of analysis you want the system to do on your data. You will also be more aware of what information can reliably be extracted from the data at all. You will probably understand the results of the analysis better, and be better able to refine it. Crucially, you will also be better prepared to point out when the result is wrong. The more you know, the better, but even just understanding the difference between mean and median helps. Yes, there are people who don’t. Yes, adults.
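To make the mean-versus-median point concrete, here is a toy illustration of my own (the numbers are made up, not from the essay): a single outlier drags the mean far from what a typical value looks like, while the median barely moves.

```python
# Toy example: made-up salary data with one extreme outlier.
from statistics import mean, median

salaries = [30_000, 32_000, 35_000, 38_000, 40_000, 1_000_000]

print(mean(salaries))    # ~195833 -- pulled way up by the single outlier
print(median(salaries))  # 36500.0 -- close to what a typical person earns
```

If you ask an AI system for the “average” value in data like this and don’t know the distinction, you may never notice how misleading the answer is.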
I think the same argument goes for essentially every other domain you can imagine an AI system helping you with, from fiction writing to airship design to proving mathematical theorems to understanding stale memes. So, go out there and learn things. And use AI a bunch. And think critically.