So, how do publishers, journals, conferences, professional societies, faculty admissions committees and peer reviewers (that is, you and me) deal with this? Not very well.
Scientific publishing is really badly configured for this challenge. For a while now, we have had a movement towards ever larger publication venues and a more anonymous process. I'll talk here about computer science, but I think the winds have been blowing in the same direction in other fields. The largest computer science conferences (such as NeurIPS, CVPR, and AAAI) now have many thousands of published papers at each conference, with tens of thousands of attendees and submissions. Reviewing is at least double-blind: reviewers don't know the identity of the authors, and authors don't know who the reviewers are. The area chairs, who make the first round of decision recommendations, also don't know who the authors are.
This is partly done for reasons of equality and fairness. There is this beautiful idea that anyone from anywhere in the world, from an elite university in Tamil Nadu or a rural high school in Tennessee, with or without powerful mentors, can just submit a paper and have it judged on its merits alone. As we all know, who you know and who knows you matters for your exposure. But the multiple anonymity principle was supposed to counteract this. It was also meant to make it possible to speak truth to power, so that the high school student in Tennessee can point out the errors of my ways just as much as the professor in Tamil Nadu.
The concentration of academic publishing into ever larger venues, on the other hand, is probably mostly due to academic bean counting. It's genuinely very hard to gauge the strength of a researcher who is not in your own narrow specialty. But we often have to do that, because we need to decide on hiring and promoting researchers. So we need metrics. Citations are one such metric, though one with many problems, including that it takes time to get cited. Therefore it is common to look at the prestige of a publication venue. Conferences become prestigious by being very large and rejecting most submissions. There are other reasons as well why conferences have ballooned like this; I've written about this phenomenon and my dislike of it before.
The end result of this combination of idealism and economic incentives has been a breakdown of scientific community. Think about it. Why would you, my peer, review a paper? You don't get any recognition for it, it's not worth mentioning on your CV (unless you're a master's student), and you certainly don't get paid to do it. The authors of the papers you review don't know about the effort you put in, and your parents, friends, spouse, and kids don't care. They just wonder why you are spending your Saturday afternoon tearing apart some paper you don't care about, submitted by some people you may never meet, instead of hanging out with your loved ones. You say you do it "for the community". Which community? You are just an anonymous cog in the machinery.
This is not a theoretical concern. It has become steadily harder to find reviewers for at least a decade. (I've seen this from a bunch of different angles, including as a reviewer myself, as part of who knows how many committees, and as the editor in chief of a scientific journal of some repute.) In the case of the larger conferences, organizers are basically vacuuming every nook and cranny for anyone remotely competent to review, including bright undergraduates. It is now common to require authors of papers at large conferences to review a number of papers themselves, as a kind of tax for submitting. Obviously, reviewers who are conscripted this way are not highly motivated to do a good job. The consequences of half-assing it are essentially zero. So the temptation is to just ask Gemini or Claude to write the review, change a few words, call it a day, and go out and play. Sticking it to the man. But the man, the machine, is the whole scientific system that we carry on our shoulders.
Into this already dysfunctional mess of a system enters a new factor: AI-written papers. En masse. If it's this easy to write papers, you can just bombard the system. Buy many tickets to the lottery. Peer review is so broken that some of them are likely to get through. And you're anonymous. If you get rejected, you never have to reveal your name.
Alright. How can we patch up this sinking ship? We must, because we are on the ship.
First of all, I am not arguing that we should ban the use of AI in the publication process. I think that in the future, most research will be done by humans with the help of an assortment of AI systems. In some cases, the AI systems might have contributed work that would have taken humans extreme amounts of time and effort to do; this is not in principle different from how computer-aided research has been done for decades. The exact amount and character of AI involvement will vary. But for any paper worth publishing, the whole text should have been written (in some revision) by a human, and checked (in its most recent revision) by a human. Here, the general principle of "don't make me read what you didn't write" applies. If the authors of the paper couldn't be bothered to write it, they are not publishing in good faith.
When you think about it, it is kind of odd that peer review works at all. We, and journalists, and to some extent the general public, take the fact that a paper is peer reviewed as a sign that it is true. But when we read the paper, we mostly just believe the statements in it. Sure, we hunt for really bad logic and bad scholarship, but we usually just accept factual statements of the type "algorithm X was faster than algorithm Y, p=0.04". What if the authors just lie to us? In some cases, we can run the code ourselves from an anonymous GitHub, but it's really quite rare that people do that. And running the code rarely answers all the questions. Instead, we just assume that the authors are honest people. Why do we do this, again? Because the authors are our peers? Are they?
Cue scene of a mortgage broker at Lehman Brothers circa 2007, fresh from buying billions in bad loans, rolling their eyes at the gullibility of the scientific community. Like, these professors and researchers with PhDs just accept the statements of anonymous people on the internet? And we thought they were smart?
What it all comes down to is accountability. A human should be accountable for every piece of research, and stake their name that the research is theirs, produced in good faith, and, as well as they can judge, correct. This should apply at all levels of the publication cycle, at the time of initial submission as well as for the final published paper. As a reviewer, you deserve to know who wrote the paper, because you need to know whether you can trust them. Their name should be a key reason that you trust the paper. And when someone submits bad science, or lies, this should have negative repercussions on their name.
Getting rid of anonymity as the default can help save reviewing as well. Think about it. Why do you review, again? Because of some kind of abstract commitment to the scientific community. But, as we've seen, this is not working very well. What would work is to pay reviewers in the same currency as academics always get paid in: recognition. (Yes, academia is full of narcissists, the same way fields where you get paid real money are full of greedy people.) Simply attach the reviewer's name to the review, so they can brag about it. Even better, they will have an incentive to actually do a good job.
Sure, there would need to be some kind of anonymous review option, so that the child can point out the naked emperor. But it should not be the default. There is also the concern of "review rings", where authors coordinate to boost each other's work. I think those are best fought by shining a light on them. If submissions and reviews are public, people who boost each other's substandard work will look like the fools they are.
All of this relies on the idea that your reviews and paper submissions can have negative consequences as well as positive ones, if they are bad. You may object that this is unlikely as long as we are all atomized participants sampling from an almost infinitely wide stream of papers. If you see a bad paper with authors you've never heard of, you have no incentive to do anything about it. You'd rather just keep scrolling.
The solution to this is to actually take scientific community seriously. You should be making and defending your name in a community of at most a few hundred people. This is why primary submission venues should be of such size that participants who have attended for a few years can realistically personally know a large proportion of attendees. This is my experience with venues like the IEEE Conference on Games, the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, and the ACM Foundations of Digital Games conference. These are the conferences I prefer submitting my work to, and also going to. They usually have between 100 and 300 attendees. Unlike supersized conferences like NeurIPS, ICML, and IJCAI, fun-size conferences allow you to actually get to know the research community. I, like most other repeat participants, have my own opinions about who does research I care about, novel research, high-quality research, and so on. I think this is how it should be. Your own manageably sized research community should be the first port of feedback on and judgement of your research output.
This does not mean that you can't post your work online for anyone in the world to read. On the contrary, I very much think you should. But everyone does this anyway; in 2026, if someone writes a paper and does not submit a preprint of it to some publicly accessible place (such as arXiv, GitHub, or their own website), that's just weird. Maybe a little shady. As if they had something to hide. People should obviously keep posting their work publicly for the world to read, but approval by their own research community should be a strong signal that it is worth reading. And in a research community where people know your name, and you attach your name to your submissions and reviews, you can't get away with bullshit, AI-powered or not. If a research community gets corrupted and starts letting its members get away with bullshit, the standing of the whole community would drop and people would cease to trust it.
Let's get back to the AI disruption. What if the machines get so good at research that they start producing papers that are actually good and novel? Then it becomes even more important that a human acts as owner and guarantor of the research. As I've (somewhat controversially) argued in the past, it is essential that we retain human control of the scientific process.
More generally, the fact that AI systems are getting better at various tasks within the research process is a good reason to re-examine the role of humans in it. AI systems excel at limited-duration tasks that can be clearly specified and evaluated. The role of humans will increasingly shift to the really thorny stuff, questions without clear answers or evaluation criteria. Such as: what research are you doing, and why? Research is a long game. Gemini 3 has a context length of a million tokens, but your context length is your entire career. Your whole life, really, taking into account those childhood experiences that turned you into the weirdo you are, obsessed with whatever obscure questions you care about. In light of this, it's clearer than ever that the individual research paper is not the level at which your research should be judged. So let's make the scientific process more personal, relational, community-based, and human.