So you've built an artificial intelligence, or some kind of learning system. How do you know if it's any good? Well, you test it. And what do you test it on? You let it play games, of course! Games are excellent testbeds because they were made to test human thinking skills of various kinds - in fact, they are fun to play largely because they exercise your brain so well - so they will test relevant aspects of your AI's intelligence too. Games also make excellent testbeds because they execute much faster than robots, don't need any expensive hardware, can be understood by everyone, and are fun both to watch and to participate in. In fact, it is hard to see why you would ever use an AI testbed that was not a game. To make things easier for you, there are now a number of competitions running at the major AI/games conferences, which let you submit your AI agents and test them on their capacity to play games such as car racing, Super Mario Bros, Othello, StarCraft, Ms. Pac-Man, Go or Unreal Tournament.
Alright, great! Sounds like we have AI benchmarking all sorted, right?
But hold up! There's something wrong with the picture I paint above. And the problem is not the claim that games are excellent testbeds; they truly are. Nor do I mean to demean the set of game-based AI competitions on offer (how could I, having started some of them myself?).
Instead, what's wrong with the picture is the idea that you first build an AI and then test it on some game. More likely, a researcher will look at the game, play it for a bit, and try to imagine how it could be played by a computer program. The researcher will then build an AI with the game in mind, and tune it to play this game specifically. The result, in many cases, is a piece of software that plays the game very well (often better than the researcher who built it) - but which might not be good at anything else.
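To make the point concrete, here is a purely hypothetical sketch in Python (the game interface and method names are invented for illustration, not taken from any actual competition API) of the kind of game-specific agent I have in mind. An agent like this can clear a surprising amount of a Mario-style level, yet every line of it encodes knowledge about that one game, so it tells us essentially nothing about intelligence in general.

```python
# Hypothetical illustration: a hand-tuned agent for a Mario-style platformer.
# All of its "intelligence" is baked-in domain knowledge about this one game;
# the same agent is useless for Othello, StarCraft, or anything else.

class RunRightAndJumpAgent:
    """Plays a side-scrolling platformer by always running right
    and jumping whenever an obstacle or gap is detected ahead."""

    def act(self, observation):
        # 'observation' is assumed to expose two game-specific queries;
        # neither query would even make sense in another game.
        obstacle_ahead = observation.enemy_or_wall_ahead()
        gap_ahead = observation.gap_ahead()

        return {
            "right": True,                        # always move right
            "jump": obstacle_ahead or gap_ahead,  # jump over trouble
        }
```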
So what do we learn about artificial intelligence - in general - from creating software that is good at playing a particular game? How could you design game-based competitions that make you learn more about AI? In fact, how could you design game-based competitions that are easy and fun to participate in yet are scientifically meaningful? And how could you write up the results of these competitions so that they add to the sum of scientific literature, while giving everybody who worked on the project due credit? Because running, and participating in, game-based AI competitions is a form of AI research too. It's just one we have not discussed very much yet from a methodological or epistemological standpoint, or even from a standpoint of giving researchers credit for their work.
I explore these issues in a new article in TCIAIG; the article is available here (or use this direct link if you don't have access to IEEE Xplore). The article is based on my tutorial at CIG 2013, and the comments I received afterwards; this tutorial is in turn based on my experience of running a number of these competitions.
Also, the article can be seen as a travesty of a particular "literary" genre. See if you can find all the genre tropes!