What we're trying to do is search a space of game rules for rule sets that constitute fun games. This immediately raises two questions: how do you define and search a space of game rules, and how can you measure whether a game is fun?
We decided to create a set of "meta-rules" or "axioms" that define a space of grid-based games, where one of the games that could be created would be a simple version of Pac-Man. The game arena ("maze") is very simple (just a few walls) and is the same for all games. The player controls an agent (the purple blob in the figure below) and can move one block up, down, left or right every time step. Apart from the agent, each game contains a number of red, green and blue "things". They are deliberately called "things" rather than opponents, food, mines, collaborators etc., because their exact relation to the player is decided by the rules of the game.
The rules of any particular game are defined by:
- The number of red, green and blue things
- A table of collision effects: what happens when a thing of one colour occupies the same space as a thing of another colour, or the same space as the agent - decrementing or incrementing the score, teleportation and/or death (usually with different effects for the two parties in a collision)
- Movement logics - how things of different colours move (random, clockwise, counter-clockwise)
- The score the player has to reach in order to win the game
- The time the player has to reach this score before losing the game
In this space, a Pac-Man-like game could be defined using e.g. red things as pills (they increment the score and disappear when the agent moves over them) and green things as ghosts (they move randomly and kill the player on contact). But many other elements are possible, including things that eat other things, teleportation etc.
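To make the rule space concrete, here is a minimal sketch of such a ruleset in Python, instantiating the Pac-Man-like game just described. The class layout, names and parameter values are illustrative assumptions, not the actual representation from the paper:

```python
from dataclasses import dataclass
from enum import Enum

class Movement(Enum):
    STILL = 0
    RANDOM = 1
    CLOCKWISE = 2
    COUNTER_CLOCKWISE = 3

class Effect(Enum):
    NONE = 0
    SCORE_PLUS = 1   # increment the score
    SCORE_MINUS = 2  # decrement the score
    TELEPORT = 3
    DEATH = 4        # the thing disappears, or the player dies

@dataclass
class Ruleset:
    counts: dict[str, int]          # number of red, green and blue things
    collisions: dict[tuple, tuple]  # (a, b) -> (effect on a, effect on b)
    movement: dict[str, Movement]   # movement logic per colour
    target_score: int               # score the player must reach to win
    time_limit: int                 # time steps available to reach it

# The Pac-Man-like game: red things act as pills, green things as ghosts.
pacman_like = Ruleset(
    counts={"red": 20, "green": 3, "blue": 0},
    collisions={
        ("agent", "red"): (Effect.SCORE_PLUS, Effect.DEATH),  # eat a pill
        ("agent", "green"): (Effect.DEATH, Effect.NONE),      # ghost kills agent
    },
    movement={"red": Movement.STILL, "green": Movement.RANDOM,
              "blue": Movement.RANDOM},
    target_score=20,
    time_limit=500,
)
```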
Here is a screenshot from a little example game:
It should be pretty straightforward to see how game rules can be represented so that they can be evolved: just encode them as e.g. an array of integers, and define some sensible mutation and possibly recombination operators. (In this particular case, we use a simple generational EA without crossover.) For other rule spaces, some rules might be more like parameters, and could be represented as real numbers.
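As a sketch, the genome and mutation operator for such an EA might look like this. The gene layout and value ranges here are illustrative assumptions, not the exact encoding from the paper:

```python
import random

# Hypothetical genome layout: one integer gene per rule parameter.
GENE_RANGES = (
    [(0, 20)] * 3        # number of red, green and blue things
    + [(0, 3)] * 3       # movement logic per colour
    + [(0, 4)] * 12      # collision-effect table entries
    + [(1, 50)]          # score needed to win
    + [(50, 1000)]       # time limit in time steps
)

def random_genome():
    """A fresh random individual, gene by gene."""
    return [random.randint(lo, hi) for lo, hi in GENE_RANGES]

def mutate(genome, rate=0.1):
    """Reset each gene to a new random value with probability `rate`;
    this is the only variation operator, since there is no crossover."""
    return [random.randint(lo, hi) if random.random() < rate else gene
            for gene, (lo, hi) in zip(genome, GENE_RANGES)]
```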
The much trickier question is the fitness function. How do you evaluate the fitness of a particular set of game rules? What we want is a fitness function for rulesets that somehow approximates how much fun it would be for a human to play a game with that particular ruleset.
In our previous experiments on evolving racing tracks for car games, we used various measures of how a neural network-based controller drove on the track as the fitness function for tracks. In the case of evolving complete games, we can't really judge the behaviour of a particular agent on the ruleset, as there is no single agent that can play every game.
Our solution is to use learnability as a predictor of fun. A good game is one that is not winnable by a novice player, but which the player can learn to play better and better over time, and eventually win; it has a smooth learning curve. This can be seen as an interpretation of Raph Koster's "A Theory of Fun for Game Design", or of Juergen Schmidhuber's curiosity principle.
Somewhat more technically, our fitness function proceeds in two stages: first it tries to play the game using only random actions. If a random player can win the game, the ruleset (=the game) is assigned a negative fitness. Otherwise, an evolutionary algorithm is used to try to learn a neural network that plays the game (using the score of the game as fitness function). The fitness of the game then becomes the best fitness found by the "inner" evolutionary algorithm after a certain number of generations.
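In sketch form, the two-stage fitness function looks something like the following. Here `play_game`, `random_controller`, `random_network` and `mutate_network` stand in for the game engine and the neuroevolution machinery, which are assumed rather than spelled out; the population sizes and budgets are likewise illustrative, not taken from the paper:

```python
import random

# Assumed interface (hypothetical names, for illustration only):
#   play_game(ruleset, controller) -> result with .won and .score fields
#   random_controller              -> picks a random move each time step
#   random_network()               -> a fresh neural-network controller
#   mutate_network(net)            -> a perturbed copy of a network

def game_fitness(ruleset, random_trials=10, pop_size=50,
                 elite_size=10, inner_generations=100):
    # Stage 1: if random play can win, the game is trivially easy;
    # assign a negative fitness and stop.
    for _ in range(random_trials):
        if play_game(ruleset, random_controller).won:
            return -1.0

    # Stage 2: an inner EA tries to learn a neural-network player,
    # using the in-game score as its fitness. The best score reached
    # within the budget becomes the fitness of the game itself.
    population = [random_network() for _ in range(pop_size)]
    best = float("-inf")
    for _ in range(inner_generations):
        scored = sorted(((play_game(ruleset, net).score, net)
                         for net in population),
                        key=lambda pair: pair[0], reverse=True)
        best = max(best, scored[0][0])
        parents = [net for _, net in scored[:elite_size]]
        population = parents + [mutate_network(random.choice(parents))
                                for _ in range(pop_size - elite_size)]
    return best
```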
The important and novel idea here is that a learning algorithm is used as a fitness measure inside another learning algorithm. We call this a dynamic fitness function, to distinguish it from the static fitness functions we used in our papers on track evolution, which did not assume that the agent was learning anything.
Using this setup we managed to evolve a couple of games - none of them very interesting in themselves, but together a proof that the idea works. All the details and much more background are in the paper, which is available online.
There is a huge number of questions left to answer - are the generated games with high fitness really more fun for humans than those that have low fitness? Does the evolutionary algorithm learn to play games like a human? Does the approach scale up to more complex games, and different sorts of rule spaces? Well, that means there's lots of interesting research left to do...