Sunday, March 22, 2026

Computers and me

I read things, write things, and talk to people. The proportions vary, but that’s essentially what I do. Or rather: those are my observable activities. I also think. The thinking often happens when I read, write, or talk, but also when I walk, drive, or take the subway or elevator. And when I shower! That exclamation mark! I should shower more.

Once upon a time I also programmed. I even considered that my craft, on par with writing. The last time I wrote non-trivial code was 2015. Long before that, I’d stopped keeping up with modern toolchains and software development practices, or even languages. That’s actually partly why I stopped programming: nobody uses Java for AI research or SVN for version control, and I think Python is unacceptably sloppy and git is incomprehensible.


Of course, the main reason I don’t program anymore is that I’m busy reading, writing, and talking. And thinking. I enjoy those things more. It’s not that I didn’t like programming: I enjoyed it a lot. And I was quite good at it. But there are lots of enjoyable activities you don’t easily find time for when you have two jobs and two kids. Even activities you’re good at.


Before I programmed, I took things apart. First, my toys. My room was full of useless thingamajigs that had once been part of fully functioning toys. I was no good at putting them back together again, or I didn’t have the patience. Or the interest. At some point, I graduated to computers, and built various PCs from parts I bought cheaply at flea markets or badgered my mom’s friends to give me. I destroyed a lot of those parts in the process. It was a lot of fun.


The PC-XT clone I bought with the proceeds from my first summer job (as a gardener) when I was 13 had a Turbo Pascal IDE on its 20 MB hard drive. I decided to learn to program so I could make games. I copied and pasted things and tried to figure out what worked through trial and error. I learned a thing or two. Later on, I also spent a lot of time composing music on a 486 I built myself, and learned the basics of website building on the same machine. I never even fastened the hard drive to the chassis, and the computer had blinking lights and some kind of glitch so that you might get an electric shock from touching it.


These days, I don’t want to see the insides of my computers. I use Macs, and I want them pristine. No stickers, clean desktop, and no unnecessary applications. As few customizations as possible. It’s like I’m not even interested in computers anymore.


In sum, I’m a bad computer user. I do not let my computers fulfill their potential. Basically, I use the computer for reading and writing. Anything I actually use my computer for could be done on a 20-year-old machine. If it could connect to the internet, I could do what I do on a 40-year-old computer.


Yet, I keep buying new computers. I happily hand over my employers’ money to Apple in exchange for swanky new gear with waaaay more power than I need. And I don’t feel bad about it. I tell myself that I need an M5 Max with max memory so I can run local LLMs, and that is in fact a minor hobby of mine, but not really important to my actual work. Most of the time I use my computer for reading and writing emails, or reading papers or web pages, or having Zoom calls. My jacked monster of a swole M5 processor must be really bored.


I think I like computers mostly for aesthetic reasons. I’m like a rich old man who buys a Ferrari only to drive it around town and never exceed the speed limit. I just want to hear the menacing growl of the V8 and admire those aerodynamic lines. Except I’m not rich, and not that old, so I buy computers instead.


I’ve been thinking about this recently because computers are finally learning to use computers. Fat harnesses around frontier models help them navigate various applications, and this means you can increasingly just ask your computer to do things for you. Language models can also write code really well now, so you can (sometimes) conjure functioning new software just by calling it by its true name. Thus, it’s all the rage to make your AI agents do things for you. Writing code, reading reports, answering emails, other computer things.


Some seem to want to automate all of their digital life. Some seem to think it’s a good idea to install OpenClaw and give it root access to their computer and logins to all their accounts. These kinds of people remind me of myself when I was 16, deeply into building weird things that rarely worked just for the sake of it, customizing every piece of software and interface because it was cool, and caring for neither safety nor security. I try to keep in mind that I was also once like that, because that allows me to understand these people. They just love technology in the way I once did.


Anyway. I am allegedly an “AI researcher”, a type of “computer scientist”, and this comes, I think, with the obligation to at least occasionally act like one. So I try to use all these frontier models like I was Buffalo Bill. Often, this involves looking hard for some need I barely have that might be satisfied by a language model. This task is getting harder and harder. For what do I actually need? What should I use these things for?


A friend of mine suggested I vibe-code some unique software just for me. What kind of software, I asked. He said he had made some software for himself that keeps track of his exercise routine just the way he wants it. But I don’t want that! The point of going to the gym is to not have to care about such things, and instead put on the headphones and zone out while incinerating calories and letting the mind wander. Also, I don’t want to be responsible for maintaining a piece of software, even if it’s just for myself. Unnecessary stress. In fact, I want less software, not more. There’s far too much software in the world already. For any given need, there’s probably already an app for that, but I don’t want to have to look for it and I don’t want to install even more apps. That my local lunch restaurant has its own app and pesters me to install it is proof that too much software is being written.


What else could I have the models do for me? Write for me? But the whole point of my writing is that it’s mine. It’s not so much that it would be immoral to put my own name to something an LLM wrote (it certainly would), but that it doesn’t even make sense. It just wouldn’t be my writing. Please don’t tell me I need to explain this to you.


Could I have the models think for me? But I thought we already established that I am in this job because I like thinking. If you want to avoid thinking, you should not become an academic. Imagine going to a restaurant, ordering food, and then paying extra for the waiter to eat the food as well. That’s right, eating the food yourself is kind of the point. (My original metaphor was more striking, but this is a family-friendly blog.)


The more I think about it, the more the advent of AI agents makes me realize that I’m not much of a computer user. I don’t care for the vast majority of things you could make a computer do, and I don’t want to bother with new software if I can avoid it. Please don’t bother me with your buzzword-laden productivity catalyst. Give me a text editor and shut up already.


Maybe I would rather not even use computers. The computers can use themselves now, so maybe we can get on with our lives? Computers aren’t real anyway. So let me valet my shiny laptop. What I really need is some good books, some good friends, and a typewriter. And some good wine. So I can read, write, talk. And think.


Sunday, March 01, 2026

Saving peer review from AI slop requires getting rid of anonymous submissions and reviews

The scientific ecosystem is struggling to deal with AI-written papers, and this is a great opportunity to revisit how we publish, where, and why. As many have noticed, a properly prompted modern LLM can produce complete papers that look like real science to qualified scientists in many fields. Yes, really. Whether these papers are actually correct, novel, interesting, insightful, and get their scholarship right will vary depending on the scientific field, the AI model, the observer, the type of paper, and of course the prompter. Better not get into specifics here, especially as the situation is evolving rapidly. The point is that it is now easy to produce what looks like good papers with little human effort. 

So, how do publishers, journals, conferences, professional societies, faculty admissions committees and peer reviewers (that is, you and me) deal with this? Not very well.

Scientific publishing is really badly configured for this challenge. For a while now, we have had a movement towards ever larger publication venues and a more anonymous process. I'll talk here about computer science, but I think the winds have been blowing in the same direction in other fields. The largest computer science conferences (such as NeurIPS, CVPR, and AAAI) now have many thousands of published papers at each conference, with tens of thousands of attendees and submissions. Reviewing is at least double-blind: reviewers don't know the identity of the authors, and authors don't know who the reviewers are. The area chairs, who make the first round of decision recommendations, also don't know who the authors are.

This is partly done for reasons of equality and fairness. There is this beautiful idea that anyone from anywhere in the world, from an elite university in Tamil Nadu or a rural high school in Tennessee, with or without powerful mentors, can just submit a paper and have it judged on its merits alone. As we all know, who you know and who knows you matters for your exposure. But the multiple anonymity principle was supposed to counteract this. It was also meant to make it possible to speak truth to power, so that the high school student in Tennessee can point out the errors of my ways just as much as the professor in Tamil Nadu.

The concentration of academic publishing into ever larger venues, on the other hand, is probably mostly due to academic bean counting. It's genuinely very hard to gauge the strength of a researcher who is not in your own narrow specialty. But we often have to do that, because we need to decide on hiring and promoting researchers. So we need metrics. Citations are one such metric, though they have many problems, including that it takes time to get cited. Therefore it is common to look at the prestige of a publication venue. Conferences become prestigious by being very large and rejecting most submissions. There are other reasons as well why conferences have ballooned like this; I've written about this phenomenon and my dislike of it before.

The end result of this combination of idealism and economic incentives has been a breakdown of scientific community. Think about it. Why would you, my peer, review a paper? You don't get any recognition for it, it's not worth mentioning on your CV (unless you're a master's student), and you certainly don't get paid to do it. The authors of the papers you review don't know about the effort you put in, and your parents, friends, spouse, and kids don't care. They just wonder why you are spending your Saturday afternoon tearing apart some paper you don't care about, submitted by some people you may never meet, instead of hanging out with your loved ones. You say you do it "for the community". Which community? You are just an anonymous cog in the machinery.

This is not a theoretical concern. It has become steadily harder to find reviewers for at least a decade. (I've seen this from a bunch of different angles, including as a reviewer myself, as part of who knows how many committees, and as the editor-in-chief of a scientific journal of some repute.) In the case of the larger conferences, organizers are basically vacuuming every nook and cranny for anyone remotely competent to review, including bright undergraduates. It is now common to require authors of papers at large conferences to review a number of papers themselves, as a kind of tax for submitting. Obviously, reviewers who are conscripted this way are not highly motivated to do a good job. The consequences of half-assing it are essentially zero. So the temptation is to just ask Gemini or Claude to write the review, change a few words, call it a day, and go out and play. Sticking it to the man. But the man, the machine, is the whole scientific system that we carry on our shoulders.

Into this already dysfunctional mess of a system enters a new factor: AI-written papers. En masse. If it's this easy to write papers, you can just bombard the system. Buy many tickets to the lottery. Peer review is so broken that some of them are likely to get through. And you're anonymous. If you get rejected, you never have to reveal your name.

Alright. How can we patch up this sinking ship? We must, because we are on the ship.

First of all, I am not arguing that we should ban the use of AI in the publication process. I think that in the future, most research will be done by humans with the help of an assortment of AI systems. In some cases, the AI systems might have contributed work that would have taken humans extreme amounts of time and effort to do; this is not in principle different from how computer-aided research has been done for decades. The exact amount and character of AI involvement will vary. But for any paper worth publishing, the whole text should have been written (in some revision) by a human, and checked (in its most recent revision) by a human. Here, the general principle of "don't make me read what you didn't write" applies. If the authors of the paper couldn't be bothered to write it, they are not publishing in good faith.

When you think about it, it is kind of odd that peer review works at all. We, and journalists, and to some extent the general public, take the fact that a paper is peer reviewed as a sign that it is true. But when we read the paper, we mostly just believe the statements in it. Sure, we hunt for really bad logic and bad scholarship, but we usually just accept factual statements of the type "algorithm X was faster than algorithm Y, p=0.04". What if the authors just lie to us? In some cases, we can run the code ourselves from an anonymous GitHub, but it's really quite rare that people do that. And running the code rarely answers all the questions. Instead, we just assume that the authors are honest people. Why do we do this, again? Because the authors are our peers? Are they?

Cue scene of a mortgage broker at Lehman Brothers circa 2007, fresh from buying billions in bad loans, rolling their eyes at the gullibility of the scientific community. Like, these professors and researchers with PhDs just accept the statements of anonymous people on the internet? And we thought they were smart? 

What it all comes down to is accountability. A human should be accountable for every piece of research, and stake their name that the research is theirs, produced in good faith, and, as well as they can judge, correct. This should apply at all levels of the publication cycle, at the time of initial submission as well as for the final published paper. As a reviewer, you deserve to know who wrote the paper, because you need to know whether you can trust them. Their name should be a key reason that you trust the paper. And when someone submits bad science, or lies, this should have negative repercussions on their name.

Getting rid of anonymity as the default can help save reviewing as well. Think about it. Why do you review, again? Because of some kind of abstract commitment to the scientific community. But, as we've seen, this is not working very well. What would work is to pay reviewers in the same currency as academics always get paid in: recognition. (Yes, academia is full of narcissists, the same way fields where you get paid real money are full of greedy people.) Simply attach the reviewer's name to the review, so they can brag about it. Even better, they will have an incentive to actually do a good job.

Sure, there would need to be some kind of anonymous review option, so that the child can point out the naked emperor. But it should not be the default. There is also the concern of "review rings", where authors coordinate to boost each other's work. I think those are best fought by shining a light on them. If submissions and reviews are public, people who boost each other's substandard work will look like the fools they are.

All of this relies on the idea that your reviews and paper submissions can have negative consequences as well as positive ones, if they are bad. You may object that this is unlikely as long as we are all atomized participants sampling from an almost infinitely wide stream of papers. If you see a bad paper with authors you've never heard of, you have no incentive to do anything about it. You'd rather just keep scrolling.

The solution to this is to actually take scientific community seriously. You should be making and defending your name in a community of at most a few hundred people. This is why primary submission venues should be of such size that participants who have attended for a few years can realistically personally know a large proportion of attendees. This is my experience with venues like the IEEE Conference on Games, the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, and the ACM Foundations of Digital Games conference. These are the conferences I prefer submitting my work to, and also going to. They usually have between 100 and 300 attendees. Unlike supersized conferences like NeurIPS, ICML, and IJCAI, fun-size conferences allow you to actually get to know the research community. I, like most other repeat participants, have my own opinions about who does research I care about, who does novel research, who does high-quality research, and so on. I think this is how it should be. Your own manageably sized research community should be the first port of call for feedback on and judgement of your research output.

This does not mean that you can't post your work online for anyone in the world to read. On the contrary, I very much think you should. But everyone does this anyway; in 2026, if someone writes a paper and does not post a preprint of it in some publicly accessible place (such as arXiv, GitHub, or their own website), that's just weird. Maybe a little shady. As if they had something to hide. People should obviously keep posting their work publicly for the world to read, but approval by their own research community should be a strong signal that it is worth reading. And in a research community where people know your name, and you attach your name to your submissions and reviews, you can't get away with bullshit, AI-powered or not. If a research community gets corrupted and starts letting its members get away with bullshit, the standing of the whole community will drop and people will cease to trust it.

Let's get back to the AI disruption. What if the machines get so good at research that they start producing papers that are actually good and novel? Then it becomes even more important that a human acts as owner and guarantor of the research. As I've (somewhat controversially) argued in the past, it is essential that we retain human control of the scientific process.

More generally, the fact that AI systems are getting better at various tasks within the research process is a good reason to re-examine the role of humans in it. AI systems excel at limited-duration tasks that can be clearly specified and evaluated. The role of humans will increasingly shift toward the really thorny stuff, questions without clear answers or evaluation criteria. Such as: what research are you doing, and why? Research is a long game. Gemini 3 has a context length of a million tokens, but your context length is your entire career. Your whole life, really, taking into account those childhood experiences that turned you into the weirdo you are, obsessed with whatever obscure questions you care about. In light of this, it's clearer than ever that the individual research paper is not the level at which your research should be judged. So let's make the scientific process more personal, relational, community-based, and human.