Sunday, December 28, 2008

Automatic Game Design

One of the papers I presented at the recent CIG conference is called "An Experiment in Automatic Game Design". Designing games automatically, what's that all about? I thought I'd take a blog post to explain the main ideas in the paper (and it really is mostly a proof-of-concept paper).

What we're trying to do is search a space of game rules for rule sets that constitute fun games. This immediately raises two questions: how do you define and search a space of game rules, and how can you measure whether a game is fun?

We decided to create set of "meta-rules" or "axioms" that define a space of grid-based games, where one of the games that could be created would be a simple version of Pac-man. The game arena ("maze") is very simple (just a few walls) and is the same for all games. The player controls an agent (the purple blob in the figure below) and can move one block up, down, left or right every time step. Apart from the agent, there are a number of red, green and blue "things" in a game. They are deliberately called "things" and not opponents, food, mines, collaborators etc. because their exact relation to the player is decided by the rules of the game.

The rules of any particular game is defined by:

  • The number of red, green and blue things
  • A table of collision effects: what happens when a thing of one colour occupies the same space as a thing of another colour or with the agent - decrementing or incrementing the score, teleportation and/or death (usually different effects for the two parts in a collision)
  • Movement logics - how things of different colours move (random, clockwise, counter-clockwise)
  • The score the player has to reach in order to win the game
  • The time the player has to reach this score before losing the game

In this space, a Pacman-like could be defined using e.g. red things as pills (increments score and disappears when the agent moves over them) and green things as ghosts (kills the player when he touches them, moves randomly). But many other elements are possible, including things that eat other things, teleportation etc.

Here is a screenshot from a little example game:


It should be pretty straightforward to see how game rules can be represented to be evolved: just encode them as e.g. an array of integers, and define some sensible mutation and possibly recombination operators. (In this particular case, we use a simple generational EA without crossover.) For other rule spaces, some rules might be more like parameters, and could be represented as real numbers.

What's the much trickier question is the fitness function. How do you evaluate the fitness of a particular set of game rules? What we want is a fitness function for rulesets that somehow approximates how fun it would be for a human to play a game with that particular ruleset.

In our previous experiments on evolving racing tracks for car games, we used various measures of how a neural network-based controller drove on the track as the fitness function for tracks. In the case of evolving complete games, we can't really judge the behaviour of particular agent on the ruleset, as there is no agent that can play every game.

Our solution is to use learnability as a predictor of fun. A good game is one that is not winnable by a novice player, but which the player can learn to play better and better over time, and eventually win; it has a smooth learning curve. This can be seen as an interpretation of Raph Koster's "Theory of Fun for Fame Design", or of Juergen Schmidhuber's curiosity principle.

Somewhat more technically, our fitness function proceeds in two stages: first it tries to play the game using only random actions. If a random player can win a the game, the ruleset (=the game) is assigned a negative fitness. Otherwise, an evolutionary algorithm is used to try to learn a neural network that plays the game (using the score of the game as fitness function). The fitness of the game then becomes the best fitness found by the "inner" evolutionary algorithm after a certain number of generations.

The important and novel idea here is that a learning algorithm is used as a fitness measure inside another learning algorithm. We call this a dynamic fitness function, to discriminate it from the static fitness functions we used in our papers on track evolution and which did not assume that the agent was learning anything.

Using this setup we managed to evolve a couple of games - none of them very interesting in itself, but together a proof that the idea works. All the details and much more background is in the paper, which is available online.

There is a huge number of questions left to answer - are the generated games with high fitness really more fun for humans than those that have low fitness? Does the evolutionary algorithm learn to play games like a human? Does the approach scale up to more complex games, and different sorts of rule spaces? Well, that means there's lots of interesting research left to do...

16 comments:

Raph Koster said...

Whoa, this is fantastic.

Julian Togelius said...

Thanks! Nice to have my research attract attention from one of the main sources of inspiration for it!

Mark said...

Hey, this is pretty interesting. It reminds me a bit of Joe Marks and Vincent Hom's AIIDE-07 paper "Automatic Design of Balanced Board Games", which also designed game rules using an evolutionary algorithm and "good game" fitness function. The interesting generalization here, as I read it, is that there's now a fitness function for one-player fun, which is a bit more complex and interesting than just "good two-player board games are those that are reasonably well balanced". Also, it's not a two-player, chess-like board game (which AI has done to death).

Julian Togelius said...

Thanks Mark! I think my main innovation is the use of learnability as a fitness function (Marks and Hom, and also Cameron Browne, use tree-search algorithms or Monte Carlo approaches to play the game), but sure, the genre of game as well. I hope I'll be able to present some more extensive results soon!

Zaratustra said...

Very interesting. I had envisioned something of the sorts (where atomic elements are combined into a single game) - you can see it at http://zarat.us/olc/ (Flash required).

The original idea was to use random generation, but I eventually decided to go with user-made content; I may try to mesh your ideas with this system, and if you're interested, I'll keep you updated.

Anonymous said...

It seems to me that your program is not automatically designing games but searching a space of parameters for a game schema that was given, for parameter settings that had the desired properties. You designed the game schema.

Michael Littman said...

Seems interesting. Can you post the games for people to try them out? For the learnability part, you might consider looking at this paper from ICML 2008: http://paul.rutgers.edu/~cdiuk/papers/OORL.pdf . It uses a very similar set of primitives.

Unknown said...

it would be interesting to see if you could integrate in your fitness metric computational models of surprise.

Donny Viszneki said...

Great work Julian! I was leaving a comment when it got a little too long for the comment box. You can find it at codebad.com.

DoomGoober said...

The next step is to have human players rate the generated games and compare their ratings with the fitness function ratings. Currently the weak point of this research is the assumption that a fitness function based on Koster's Theory of Fun is measuring the correct fitness. Running human experiments would validate this assumption.

Taking it a step further, you could use a genetic algorithm to find a good fitness function of fun. The fitness of such a function is simply the amount of correlation between human rankings and the function's rankings. The difficulty would be isolating the genomes for different fitness functions. Currently, the fitness function only has one genome "score across multiple games" but other genomes could be explored such as length of game or inactivity during games (the user not having to do anything.)

Finding a highly correlated fitness function would go far to prove different theories of fun. Great stuff! Keep it up.

Julian Togelius said...

Thanks to all the nice comments! Some replies:

Zircon: Arguably, but what's the difference? Some of the generated games are as different from each other as many maze-based old-school games. Ultimately, the difference between creating a new game design and varying the parameters for an old one is a bit blurry.

Michael Littman: Thanks. I've been thinking about posting the code for a while now... I'll get straight to it one of the next days. Yes, your representation is very similar - it would be easy for you to evolve problems to e.g. differentiate between different reinforcement learning algorithms. (Also something I've thought about for a while, but not gotten around to doing...)

Julian Togelius said...

Stefano: Actually, Schmidhuber's theory of curiosity and novelty (which our paper is partly based on) predates Itti and Baldi's "Bayesian surprise", and contains all the important parts of their theory.

Please see: http://www.idsia.ch/~juergen/interest.html

Julian Togelius said...

DoomGoober: Thanks, and you're absolutely right. Similar plans are in the pipeline. One problem is that you need a lot of data to do this, so you need to construct a system that generates games that people actually want to play, e.g. over the internet...

Bill said...

I'm curious as to what game designers think of this approach. How much is game design a black art? Is the author trying to solve a problem similar to 'automatic story design'?

Anonymous said...

i just know that there is an automatic game design from you.

havchr said...

Brilliant. I had an idea about something similar from a turn-based rpg battle system perspective, but it was nothing but a simple idea.

this post has now inspired me a lot, and pointed me to so many brilliant papers to look at for inspiration. Thanks!
(here's the blog post about automatic battle system)