Wednesday, October 25, 2006

Simulated car racing competition

I am running the simulated car racing competition for CIG 2007. The problem is a bit different from the versions of the car racing problem we've been exploring so far, in that there are no walls and the next two waypoints are visible to the controller. But there are similarities as well - in fact, quite a bit of the simulation code is reused from my earlier experiments...
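To give a feel for the setting, here is a minimal hand-coded controller sketch for a waypoint-based race: steer toward the next waypoint, slow down when it is behind you. All names and the sensor interface here are invented for illustration; the actual competition API is in the downloadable code.

```python
import math

def drive(car_x, car_y, car_heading, waypoint_x, waypoint_y):
    """Steer toward the next waypoint; a deliberately naive baseline.

    Arguments are assumed to be in world coordinates, with car_heading
    in radians. The real competition interface differs.
    """
    # Angle from the car to the waypoint, relative to the car's heading
    target_angle = math.atan2(waypoint_y - car_y, waypoint_x - car_x)
    angle_diff = (target_angle - car_heading + math.pi) % (2 * math.pi) - math.pi

    # Steer proportionally to the heading error, clipped to [-1, 1]
    steering = max(-1.0, min(1.0, 2.0 * angle_diff))
    # Ease off the throttle when the waypoint is off to the side or behind
    throttle = 1.0 if abs(angle_diff) < math.pi / 2 else 0.2
    return steering, throttle
```

A controller like this will drive, badly; the interesting question is how far learning and evolution can improve on it.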

Please consider taking part in the competition - the final rules are not online yet (will be really soon, promise), but the code is there for you to start playing around with. It will be really interesting to see what sort of controller the winner will be - hand-coded? Evolved neural network? Fuzzy logic? Genetic programming? Learned through temporal difference learning? Something completely different? (Probably it will be some sort of hybrid of man and machine efforts. Hybrids always win in the end.)

So, off you go. Have a look at the competition, and start designing your controllers.

Yes, I mean you.

Monday, October 09, 2006

That's more like me

Richard needed a face to use on a cube for his humanoid robot to play with (don't ask) so he put me up against the wall and took a shot. I think it captures the inside of me pretty well, maybe better than it captures the outside of me.

Friday, October 06, 2006

SAB 2006 Workshop on Adaptive Approaches for Optimizing...

...Player Satisfaction in Computer and Physical Games is the rather long title of a workshop I visited in Rome last week. It was organized by Georgios Yannakakis and John Hallam, who wish this to be the first of a series of workshops dealing with how various computational intelligence techniques can be used to make games more entertaining - a most laudable initiative, and a good start to the series. The workshop featured seven academic papers and one invited talk from Hakon Steinö, and of course lots of good discussion over pizza and White Russians.

Our paper there had a long title too: Making racing fun through player modeling and track evolution. I must say that I think this is quite a good paper, definitely one of the better ones I've (co-)written. It deals with how to identify and reproduce a human player's driving behaviour in a racing game, and then use this behavioural model to automatically create tracks that are "fun" for the player.

Of course, how to measure "fun" in a game is a question which is far from settled. But an interesting question, and potentially industrially relevant. The issue of automatically creating content (e.g. racing tracks) for games seems to be quite hot as well - fertile ground for research indeed.

Wednesday, September 13, 2006

PPSN 2006 Conference report

So, the conference is over now, and I look forward to a few days of doing nothing at all. (Well, maybe visit a museum or two, or go hiking on a glacier, but I mean nothing involving evolutionary computation.) The last afternoon was spent in the Blue Lagoon, an outdoor thermal bath, making this one of the precious few scientific conferences where you get to see the other participants in only swimwear. A shame, though, that the population in question has so little "diversity".

So, how was the conference itself? Good. Very well organized, and fabulous for networking, because of the small size, poster-only presentations, and generally good atmosphere. It really made you feel part of the community, much more so than for example CEC does. So much part of the community that I'm really reluctant to write anything negative about the conference. (Maybe I should get an anonymous blog as well?)

The one thing I could complain about was that the conference was rather focused on theory and basic empirical research, and I might be a bit more of an applications guy. That's more of a problem with me than with the conference, however. And there were plenty of papers I did enjoy. I tend to enjoy theory when I understand it. My good friend Alberto won the best student paper award, which I think was well-deserved. His theory is one of those I do understand.

We also had some very good keynotes, especially the one from Edward Tsang on computational finance. And it was possible to pick up some of the trends in the community. Basically everybody seems to be involved in Multiobjective Optimization, which I find potentially useful, and quite a few people are doing Evolution Strategies with Covariance Matrix Adaptation, which I find completely incomprehensible.

But I must say that quite a few people took an interest in my own paper as well, in spite of it being so different from most other papers there. Or maybe because of that.

Tuesday, September 12, 2006

New papers online

Two new papers on the evolutionary car racing project are now available on my home page. One will be presented at a small workshop in Rome in a few weeks' time, the other was presented at PPSN yesterday. Yes, that means I am in Iceland right now. What's it like over here? Cold and expensive. But not without its charm. I'll be back soon with a conference report of some sort.

Tuesday, August 01, 2006

Learning to fly

The Gridswarms project, on which my friend Renzo is working, is about creating swarms of miniature helicopters that fly around and share their computing resources in order to perform tasks in a coordinated manner, inspired by the way they do it on the Discovery Channel. (That is, inspired by cooperation among insects. Nothing else.)

Yes, it sounds pretty science fiction, and yes, it's been covered several times by the media, e.g. Wired. And I'm pretty sure they will succeed as well.

I'm not officially part of that project, but as I tend to hang out with Renzo all the time, we cooperate quite a bit on our research as well. Only on things we both understand, of course; I hardly know anything about hardware design, Kalman filters and such stuff. One problem I could help him with, though, is the automatic design of the basic helicopter controller.

You see, the plan is to have the helicopters perform all sorts of interesting flocking and swarming behaviour, perhaps coordinated target tracking, or distributed surveillance. But these advanced behaviours need to be built on top of a reliable flight controller layer, whose task it is to keep the machine flying in the face of various disturbances, and implement the orders (e.g. "go to point x, y, z, but quickly!") of the swarming layers.

Designing such a controller by hand is in no way easy. It is even harder than actually flying the helicopter, which I can tell you is not easy at all. So we turned to evolutionary algorithms and neural networks to do the job for us.

Now, what an evolutionary algorithm does is essentially systematic trial-and-error on a massive scale. Doing massive trial and error on real helicopters (albeit miniature) would be a bit... expensive. Not to mention slow. So we needed to do the evolution in simulation, and because the real helicopters are still under development we had to make do with a third-party (very detailed) helicopter simulation for the time being.
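The core loop of such an algorithm is simple to sketch. Everything below is schematic: `simulate` stands in for running a controller through the detailed third-party helicopter simulator and scoring the flight, and the mutation scheme is purely illustrative.

```python
import random

def simulate(genome):
    """Stand-in fitness function; the real one runs a full flight simulation."""
    # Toy objective for illustration: prefer parameters close to zero
    return -sum(w * w for w in genome)

def evolve(population_size=50, generations=100, genome_length=10):
    """Schematic elitist evolution of a flat controller parameter vector,
    e.g. neural network weights."""
    population = [[random.gauss(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Evaluate everyone in simulation and keep the better half
        scored = sorted(population, key=simulate, reverse=True)
        elite = scored[:population_size // 2]
        # Refill the population with mutated copies of the elite
        population = elite + [[w + random.gauss(0, 0.1) for w in parent]
                              for parent in elite]
    return max(population, key=simulate)
```

Each generation requires one simulated flight per candidate, which is where the massive computational cost comes from.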

Without further ado, this is what it looks like. Remember that we have provided absolutely no knowledge of how to fly the helicopter to the neural network; it has learnt it all by itself, through evolution!



As you might or might not see from the movie, the task is to fly a trajectory defined by a number of waypoints (yellow). (As should be evident from the movie, the helicopter is shown from behind in the right panel and from above in the left panel.) Our evolved neural networks perform this task much better than a hand-coded PID controller, and are reliable even under various disturbances, such as changing wind conditions. And as far as we know, we are the first people in the world to successfully evolve helicopter control.
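For readers unfamiliar with the hand-coded baseline mentioned above: a PID controller computes its output from the error between setpoint and measurement, plus that error's integral and derivative. Here is a textbook single-axis sketch (the interface and gains are invented for illustration, not the actual controller we compared against):

```python
class PID:
    """Textbook PID controller for a single control axis."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        # Proportional term: the current error
        error = setpoint - measurement
        # Integral term: accumulated error over time
        self.integral += error * self.dt
        # Derivative term: rate of change of the error
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

A full helicopter controller needs several such loops, one per axis, and tuning their gains by hand is exactly the painful part that evolution sidesteps.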

Getting there wasn't all that easy, though. To begin with, the computational complexity is absolutely massive - we used a cluster of 34 Linux computers, and even then a typical evolutionary run (a hundred generations or so) took several hours. What we also discovered was that no matter how much computing power you have, the right behaviour won't evolve if the structure of the neural network is wrong. It took us a lot of time to find out, but a standard fully connected MLP net won't cut it. Instead you have to modularize either according to the dimensions of control or according to...

...but hey, I'm losing you. I can't get too far into technical details on a blog, can I? Go read the paper we presented at CEC two weeks ago instead. It is available online, click here!

Anyway, there is more work to do on the project, and I'll get back to this topic!

Sunday, July 30, 2006

WCCI 2006 conference report

As of yesterday, I'm back from Canada. Vancouver is fantastic, and I certainly wouldn't mind moving there. The conference (WCCI/CEC) was not bad either, though a bit too big - how are you supposed to find the information you are interested in among a total of almost 1600 papers, including 400-500 on evolutionary stuff?

There was some excellent stuff presented, including the keynotes by Sebastian Thrun and Risto Miikkulainen, and some individual papers, such as a simple but ingenious co-evolutionary algorithm by Thomas Miconi.

As usual, there was also a lot of "noise" - it is astonishing how many papers are presented that, while not technically incorrect, make insignificant progress on insignificant topics, and just make you wonder why. Why did anyone bother to write these papers, and then travel far away to present them to not very interested audiences? Because the authors didn't have any better research ideas, and desperately needed to lengthen their publication lists in order to get their PhDs / obtain funding / get tenure etc.? Probably.

As Philip Ball notes in his book Critical Mass, more than half of all scientific papers don't get cited at all, except by their authors. Makes you think. (And no, I haven't had that many citations either - yet...)

Anyway, back to the conference. It all started out with a quite amusing keynote by Robert Hecht-Nielsen, who presented an "Architecture of cognition", preaching with no shortage of confidence and enthusiasm. What he actually presented was an example application based on a form of Hebbian learning, which could generate complete sentences based on a few trigger words and large amounts of previously scanned text. From a linguistic standpoint his invention indeed seems rather radical, as it produces correct English grammar without any explicit rules of grammar at all. But the suggestion that it has any kind of "real" intelligence is quite stupid, as the sentences produced were most often simply not true, and there is no mechanism by which it can learn from mistakes or suppress false information. Intelligence requires some sort of interaction with the environment, and I'm more prepared to say that a thermostat is intelligent than that Hecht-Nielsen's program is. As someone at the conference said, what he presented was a bullshit engine. A good one apparently, and potentially useful, but still only a bullshit engine. A little bit like Eliza.

My own talk received fairly good feedback, and I had some stimulating discussions, including some with industry people. I'm looking forward to PPSN, though, where the format of the presentations is more focused on interaction.

Wednesday, July 12, 2006

Another evolutionary car racing video

Here is an evolved car controller navigating a rather difficult track. I usually do worse than the controller when I try to drive that track myself, frequently colliding with walls and having to back up and try again. Note that it seems impossible to evolve a controller for this track from scratch; instead, I had to start from a controller that was previously evolved to drive several easier tracks. Evolution had to progress in an incremental fashion.
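The incremental scheme itself is easy to state: evolve on the easiest track first, then seed each subsequent run with the champion of the previous one. A schematic sketch, where `evolve_on_track` and `random_controller` are hypothetical stand-ins for a full evolutionary run and a random initial controller:

```python
def incremental_evolution(tracks, evolve_on_track, random_controller):
    """Evolve a controller over a sequence of tracks of increasing difficulty.

    Each run is seeded with the best controller from the previous, easier
    track, instead of starting from scratch on the hard track.
    """
    controller = random_controller()
    for track in tracks:  # ordered from easiest to hardest
        controller = evolve_on_track(track, seed=controller)
    return controller
```

The point is that the hard track's fitness landscape gives a random controller no usable gradient, while a controller that already drives easy tracks is close enough to improve from.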



For more on this, read my previous post on evolutionary car racing.

Tuesday, July 11, 2006

Reinforcement learning - what is it, really? And why won't it work?

At the moment I'm working on several car racing-related projects simultaneously, but the one that's receiving the most attention is trying to compare evolution with reinforcement learning, to see if I can achieve the same results with those methods.

Well, I suppose I should say that I try to compare evolution with other forms of reinforcement learning. After all, evolutionary algorithms are just one set of ways of solving reinforcement learning problems.

It is turning out not to be easy at all to get anything working. I've tried learning values of state-action pairs from a good driver; this might not really be reinforcement learning but rather some sort of supervised learning, and in any case it doesn't work. I'm now working on simultaneously learning forward models and sensor-state value estimators, which frankly seems unnecessarily complicated.
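For concreteness, here is a minimal sketch of the kind of value learning I mean: tabular TD(0) run over a trajectory logged from a good driver. The state encoding and data format are invented for illustration; the real problem uses continuous sensor states, which is precisely where things get hard.

```python
def td0_update(values, trajectory, alpha=0.1, gamma=0.95):
    """One pass of tabular TD(0) over a recorded trajectory.

    trajectory is a list of (state, reward) pairs, e.g. logged from a
    good driver; values maps states to estimated returns.
    """
    for (state, reward), (next_state, _) in zip(trajectory, trajectory[1:]):
        values.setdefault(state, 0.0)
        values.setdefault(next_state, 0.0)
        # Nudge V(s) toward the one-step bootstrapped target
        values[state] += alpha * (reward + gamma * values[next_state]
                                  - values[state])
    return values
```

With a lookup table this is trivial; with the limited continuous sensors I give the controller, one needs a function approximator, and that is where it all falls apart.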

Of course, it must be possible to apply reinforcement learning to car driving, and I'm sure people have done it. But I am pretty sure it has not been done with the limited information I'm giving the controller. Anything is easy when you cheat, and part of my research program is not to cheat.

Anyway, I'm off to CEC in a few days. I'm bringing the Sutton and Barto book to read on the flight, hopefully I'll get an insight or two.

Tuesday, June 27, 2006

Sticky stuff

Those stickers had been disgracing Renzo's and my already ugly computers for much too long. Finally we decided to put them where they belong.




Hmm... what more can you do with these stickers?