Tuesday, May 06, 2025

Write an essay about Julian Togelius

I am well-known enough that most LLMs know about me, but few know me well. I also have a unique name. So one of my go-to tests for new LLMs is to ask them to write an essay about me. It's very enlightening: most of them hallucinate wildly. So far, only Gemini 2.5 Pro (with web search capabilities) gets it mostly (not completely) right.

Even the much-hyped o3, for all its agentic prowess, is very bad at factuality. There's something wrong in every paragraph. It does better than an average 7B model, but worse than Llama 70B or Mistral Large. Knowing the subject (myself) intimately is also interesting in that it helps with tracing where the hallucinated "facts" come from. For example, LLMs sometimes claim that I work at the University of Malta (like Georgios Yannakakis) or the University of Central Florida (as Ken Stanley used to). I guess I'm close to Georgios and Ken in some sort of conceptual space. This exercise is also a sobering counter to Gell-Mann amnesia. If the LLMs get so many things wrong about me, how could I trust them on other somewhat obscure topics?
