Wednesday, December 12, 2007

On industrial-academic collaboration in game AI

Today was the kick-off event day for the UK Artificial Intelligence and Games Research Network. I wasn't there, but my ex-supervisor is one of the organizers, so I heard about it beforehand.

This is the first blog post I've seen about the kick-off; the author seems to have left the event with a pretty pessimistic view of the prospects for industrial/academic collaboration. His main complaint is that academics don't understand games, or the specific needs of game developers. Well, then tell us! I would love to hear about specific problems in game development where evolution or some other form of machine learning or computational intelligence could matter.

Alex J. Champandard, in a comment on the same blog post, develops the point further. He asks:

So why do you need government funding for [applied games research]? It's a bit like admitting failure :-)

On the other hand, if [academics are] doing research for the sake of research, why do they need input from industry?


These questions could be asked of just about any research project at the interface between academia and industry. And yet companies happily keep funding PhD students, postdocs, and even professors in a huge number of research fields, from medicinal chemistry to embedded systems design to bioinformatics. In some cases these collaborations/funding arrangements definitely seem strange, but apparently they make economic sense to the companies involved.

I once asked an oil company executive (at a party! Now, stop bothering me about what sort of parties I go to...) why his company funds a professor of geology. His answer was roughly that it was good to have expert knowledge accessible somewhere close to you, so you know who to ask whenever you need to. Plus, a professor's salary wasn't really that much money in the grand scheme.

Now, game companies and oil companies are obviously very different sorts of creatures. I think the main opportunity for game companies would be to outsource some of their more speculative research - things that might not be implementable in the near future, either because the computational power is not there yet, or because the technique in question would need to be perfected for a couple of years before deployment. Having a PhD student do this would be much more cost-efficient than assigning a regular employee to do it (especially with government funding, but probably also without), and frees up the employee for actual game development. In addition, the company's own developers might very well be too stuck in the way things currently work to try radically new ideas (of course, academics might also be stuck in old ways of thinking, but there are many academics around, and if you offer some funding you can typically select which academic you want to work for you).

This argument assumes that game companies do any sort of research into technologies that lie more than one release cycle away. I'm not stupid enough to claim that no game companies do this - e.g. Nintendo obviously does - but I venture to guess there are many that don't.

As for the other part of Alex's question, "if we do research for the sake of research, why do we need input from industry?", the answer is more obvious. Because even if we do research because we love the subject itself and really want to find out e.g. how to best generalize from sparse reinforcements, we also want to work on something that matters! And fancy new algorithms look best together with relevant problems. It's that simple.

Tuesday, November 27, 2007

New webpage at IDSIA

I've now set up a parallel home page at my new workplace, IDSIA. It's currently mostly a quick overview of the various things I'm involved in academically, but I plan to set up pages with a bit more detail about the projects in that domain as well (for those who don't feel like going straight for the papers).

My primary home page will still contain my publication list, CV and such formal stuff.

Friday, October 19, 2007

A task force and an interview

More games-related news today. The IEEE Computational Intelligence Society has just spawned a Task Force on Computational Intelligence in Video Games, chaired by Ken Stanley (of NERO and NEAT fame), of which I am an inaugural member. From the mission statement:

"We are aiming to become a repository of information on CI in video games, a networking resource for those in the field, and the spearhead for initiatives in the area. We will also attempt to bridge academia and industry by including members from both. Thus ideally we can become a focal point for discussion and action that will facilitate further progress in the field."

This is a very good initiative in my opinion, and being backed by such a powerful organisation as the IEEE is certainly not bad. As the web site is only just up, the member list is far from complete yet. The task force is looking for information on interesting research groups and projects, so if you want your project featured, contact them!

Over at Alex J. Champandard's blog, "Game AI for Developers", we find an interview with none other than yours truly. Personally, I think it's an interesting read, of course... Thanks for the opportunity, Alex!

One of the things I suggest in the interview is that game developers initiate contacts with academic researchers interested in CI in games. The above mentioned task force could come in very handy for such purposes, as soon as the member list is expanded to include everyone who should be there!

Wednesday, October 10, 2007

Confessions of an academic crack smoker

Look, I got some attention again. This time from Christer Ericson at Sony Santa Monica, "the God of War team". His blog post is a scathing critique of most of what I've been doing for the last three years, without going into any detail whatsoever, and devoid of constructive suggestions.

I'll try to be less rude.

Christer's argument consists of showing one of my early videos with two cars on a track, and pointing out that the AI is not very impressive, as the cars behave erratically and crash into walls. He also makes fun of a question I posted to Slashdot, where I was genuinely wondering about what people perceive as being the flaws of current game AI. From this, he implies that my contribution to game AI is null and that I might as well stop what I am doing.

Now, if someone from industry came and argued that what I'm doing is completely useless for game developers, I would take this seriously. Even if he were right, at least some of what I do is appreciated by the CI community, which is at least equally important to me, so I could accept developers thinking my ideas were all stupid. However, I would only take such criticism seriously from someone who had actually read my papers, knew what I was doing, and bothered to come up with some suggestions on how to improve my work. None of this is true for Christer's rant.

It's true that the cars in the video don't seem to be driving very well. That was never the objective. Instead, the video is from a series of experiments where I manipulated the fitness function in order to produce interesting driving behaviour. Evolution of controllers that drove a particular track better than any tested human was already reported in our very first car racing paper. It's also true that the cars never learned to recover from some wall crashes. I had wanted this to emerge from the overall progress-based fitness function, which it didn't, and I might get back to work on this later; however, it would be straightforward to either add crash recovery as a specific learning objective, or add a hard-coded function for this. After all, normal game AI is 100% hard-coded.

In short, it would help if Christer either judged my experiments based on their actual objectives, or told me in what way I needed to change my objectives.

It would also help if he looked at some of the work that I myself consider more useful for game development, at least conceptually. (I'm not an expert in graphics, physics, or for that matter real-time collision detection, and don't profess to be one.) Especially the experiments on player modelling and track evolution, but also generalization and specialization for quickly creating drivers for any track, and co-evolution of diverse sets of opponents.

If he read these, and came back and still thought it all stank, I would be very happy to listen to his ideas on how to make my research more relevant for hard-working game developers like him. In the meantime, I'll continue my vacation.

And by the way, I don't smoke.

Monday, October 08, 2007

CEC 2007 Conference Report

So, the 2007 IEEE Congress on Evolutionary Computation is now over. Actually, it's been over for the last ten days. Sorry for taking such a long time to update my blog; I'm out backpacking at the moment to celebrate finishing my PhD, and I'm trying not to spend all my vacation in front of a computer (even though it's hard fighting that Internet addiction)!

Overall, CEC was an excellent event this year as well. A generous supply of on average really good keynote and invited speakers, so many parallel sessions that there was always something interesting going on, and a superb organization. The only things I would have done differently are spreading the conference over five or six days instead of four, and not charging money for the tutorials (in fact, many of the tutorials are the same as those included in the general registration for Gecco or PPSN). But those are really minor issues. (A major issue that CEC shares with Gecco and some other conferences is the too-low entry barriers / too-high acceptance rates, but that's stuff for another blog post.)

Simon's keynote on Evolutionary Computation and Games went down really well, it seems. Apparently, more and more EC researchers are warming up to the idea of using games as testbeds for their algorithms. Simon plugged the car racing competition as well, and there were lots of people talking to me about it in appreciative terms both before and after I presented the results. It seems we have quite a momentum for these kinds of activities at the moment.

Hugo de Garis' invited talk was interesting in a very different way. Actually, it was quite sad. de Garis is known for his huge ambitions and provocative statements (evolving "artificial brains" as complex as those of kittens, or was it even humans this time around?), so I was looking forward to bold new theories on how such grand aims should be achieved. What followed was some very conventional neuroevolution stuff, and a complete failure to appreciate the real challenge in putting all his evolved neural modules together. Most importantly, he had absolutely no empirical results to show. Predictably, the audience gave him a hard time during the question round.

Other interesting talks included those of Jong-Hwan Kim, the founder of FIRA robot soccer, on evolvable artificial creatures for ubiquitous robotics, and of Marc Schoenauer on how modern bio-inspired (and population-based) continuous optimisation algorithms such as CMA-ES and PSO now often outperform the orthodox optimisation algorithms used by the applied maths people, on their own benchmark problems. Quite cool.

By the way, did I point out that the organization was superb? Anyway, it deserves saying again. The Stamford convention centre is not only lavishly, but also tastefully, decorated, and conference delegates were continuously tended to by an army of staff making sure that we always had something to eat and drink and knew where the venue for the next talk was. The food was simply fantastic, the night safari at the end of the conference was a very nice event, and the conference banquet had nine (!) courses. I can't imagine how our conference fees can have paid for all this - some of the sponsors must have contributed serious money. Rooms were generally easy to find, and most importantly, there were plenty of places where you could just bump into old and new acquaintances and have those all-important corridor chats. In all, a very rewarding experience.

Sunday, September 23, 2007

Thesis online

My thesis corrections have now been approved, and the final version is online at http://julian.togelius.com/thesis.pdf

Now I'm off to Singapore to attend CEC, present two papers and the competition results, and have a bit of vacation!

Friday, September 14, 2007

Just passed my viva!

Only minor corrections, which will take me a few days to sort out, and then I'm a PhD! External examiner was professor Peter Cowling, University of Bradford (who has a research group on computational intelligence and games), and internal examiner was John Gan.

Yes, it feels fantastic... now we're going out to party! See you!

Tuesday, August 07, 2007

"Advanced Intelligent Paradigms in Computer Games"

Just found this new book from Springer in my mailbox today - it contains a chapter by me, Simon and Renzo on "Computational Intelligence in Racing Games". I'll make it available online soon enough, but almost all of its contents can be found in some of our earlier papers.

Friday, August 03, 2007

The issue of finding those papers...

I read lots of academic papers in my field - though certainly not as many as I "should" - but how do I go about finding them? It sometimes strikes me that I don't really have a good strategy for keeping up to date, or for finding good references when I get a new idea.

I go to conferences, like others do. But obviously I don't go to every conference, I don't see every presentation at a conference, and I'm not mentally present during every presentation I see. Anything else would be impossible. Worse, conference proceedings are usually only available as hard-to-search CDs or books, instead of for free on the conference website, which would be the sensible option.

There are a few repositories meant to contain papers, or links to papers, in particular research fields, and also to provide good means of finding the papers you want. Sadly, many of them are half-baked.

CoRR (arXiv) has never reached anywhere near the same popularity in computer science as it has in physics, probably partly due to the weird requirement of submitting the LaTeX source of every paper, something that rarely works in practice. Cogprints has likewise failed to take off, even though the technical platform seems decent enough. Citeseer used to be good around 2002-2003, but seems to have been neglected by its administrators lately (I've had serious problems correcting missing or faulty metadata for my own papers). Bill Langdon's GP Bibliography is excellent, though for a limited domain.

In the best of all worlds, every paper would be easy to find through Google Scholar. A main obstacle to this is that so many researchers fail to make their papers available on their personal websites. Even in computer science! This is puzzling, and shameful.

I think it is every serious researcher's obligation to make his/her complete scientific output publicly available on his/her own home page, unless he/she has a very good excuse. Otherwise one would suspect that he/she has something to hide.

So if you are reading this, and still haven't made all your publications freely downloadable from your website, go and do it. Now. For the sake of science, and your own reputation as an honest scientist. Unless you have a very, very good reason why you shouldn't. And you probably haven't.

(Yes, I do feel quite strongly about this...)

Wednesday, August 01, 2007

How better AI can make racing games more fun

In some previous posts on this blog (e.g. this one, this one and this one) I've been discussing evolving neural networks to drive racing cars around a track. We did this research (published in several papers, e.g. this one and this one) for several reasons, the main motivation being to explore how games can be used as environments in which (artificial) evolution can create complex (artificial) intelligence. The related topics of which evolutionary algorithms and controller architectures (neural networks, expression trees etc.) learn best and fastest have also been investigated.

While the interest in this kind of research from the point of view of artificial/computational intelligence and machine learning is fairly obvious, one might wonder whether it might also have applications in computer games. This is less obvious. For example, most racing games would not benefit from having faster, better-driving opponents; who would want to play a racing game where you always finish last? Apparently, minor "cheats" (such as allowing the computer-controlled drivers more complete information than is given to the human player) are enough for game designers to be able to manually create opponents that drive well enough.

Racing games are not alone in this respect: in most game genres (with the notable exception of strategy games like Civilization), game designers have no problem at all coming up with sufficiently (appropriately?) challenging opponents without resorting to blatant cheats. Instead, the challenge for designers is coming up with interesting enough opponents and environments, and doing it fast enough. In fact, this consumes huge amounts of money, and is a major expense item in the development of a new game.

So, the challenge we set ourselves was to use the technology we'd already developed to come up with something that could make racing games (and in the future other games) more fun and interesting.

What we came up with was this: modelling the driving style of a human player, and using that model together with an evolutionary algorithm to create new racing tracks that are fun to drive for the modelled player. This combination of player modelling and online content generation has, as far as we know, never been attempted before.

The technical details of (different versions of) our proof-of-concept implementation were presented at an SAB Workshop last year, and at the IEEE CIG Symposium in April (read the paper online). A discussion of the experiments will also be included in a chapter in a forthcoming book from Springer. But the basic procedure of the most recent version of our software is as follows:


  • Let the human player drive on a test track, designed to contain different types of challenge (straights, narrow curves, alternating smooth bends). Record the driving speed and lateral displacement (distance from the center of the track) at a large number of points around the track.
  • Take a neural network-based controller, which has previously been evolved to be a competent driver on a large variety of tracks, and put it back into the evolutionary algorithm. This time, however, the fitness function is not how well the controller drives the track, but how similar its driving style is to the human's. Specifically, the more similar the speed and lateral displacement of the neural network-controlled car are to the recorded values of the human driver on the same track, the higher its fitness.
  • Next, a track is evolved. For this we need an evolvable representation of the track. We've experimented with a couple of different solutions here, but what currently seems to work best is representing the track as a b-spline, i.e. a sequence of Bezier curves.
  • We also need a fitness function for the track. Here, it should be remembered that we are not looking for a track that is as hard or as easy to drive as possible (that would be easy!), but rather the most fun track for the modelled player. To be able to measure how fun a track is, we looked at the theories of Thomas Malone and Raph Koster. The outcome of the rather long discussion in the paper is that we try to maximize the difference between average and maximum speed, the maximum speed itself, and the variance in progress between different trials. But you really have to read the discussion in the paper to see the point of this, or possibly another blog post I'll write later.
  • Finally, we evolve the track, using this fitness function and track representation, by driving the controller modelled on the human player on each track and selecting for those tracks in which the controller has maximum speed, maximum difference between average and maximum speed, and maximum progress variance.
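The two fitness functions in the steps above can be sketched roughly as follows. This is a minimal Python sketch under my own assumptions: the function names, the equal weighting of the components, and the data layout are illustrative, not the actual competition code (the paper discusses the real combination at length).

```python
import statistics

def imitation_fitness(human_speeds, human_disps, ctrl_speeds, ctrl_disps):
    """Fitness for evolving the player model: the closer the controller's
    speed and lateral displacement are to the human's recordings at the
    same sample points, the higher the fitness. Here this is simply the
    negated mean squared error over all sample points."""
    n = len(human_speeds)
    err = sum((h - c) ** 2 for h, c in zip(human_speeds, ctrl_speeds))
    err += sum((h - c) ** 2 for h, c in zip(human_disps, ctrl_disps))
    return -err / n

def track_fun_fitness(speeds_per_trial, progress_per_trial):
    """Fitness for evolving the track: reward a high maximum speed, a
    large gap between maximum and average speed, and high variance in
    progress between trials. The equal weights are an illustrative
    guess."""
    all_speeds = [s for trial in speeds_per_trial for s in trial]
    max_speed = max(all_speeds)
    avg_speed = statistics.mean(all_speeds)
    progress_var = statistics.pvariance(progress_per_trial)
    return max_speed + (max_speed - avg_speed) + progress_var
```

An evolutionary algorithm would then repeatedly mutate the control points of candidate tracks, drive the modelled controller on each, and select the tracks scoring highest on the second function.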


Below are a few evolved tracks:





This procedure works well enough in our proof-of-concept implementation, but how well it actually works in a full racing game remains to be tested. The most obvious candidate for testing this would be a racing game that comes with a track editor, such as TrackMania. On the horizon, we could have racing games with endless tracks, that just keep coming up with the right types of track features as you drive, i.e. ones which are neither too easy nor too hard, and thus keep you challenged in the right way.
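The "endless track" idea could, in principle, be sketched as a greedy loop that repeatedly evolves just the next stretch of track against the player model. Everything below is a hypothetical illustration, not something we have implemented: the segment representation (a flat list of control-point coordinates) and the fitness callback are placeholders.

```python
import random

def evolve_next_segment(fun_fitness, current_segment, candidates=20, sigma=0.1):
    """Generate mutated variants of the current segment's control points
    and return the one that the player-model-based fitness function
    rates as most fun. Calling this repeatedly as the player drives
    would yield an endless, continuously adapted track."""
    best, best_fit = None, float("-inf")
    for _ in range(candidates):
        candidate = [p + random.gauss(0.0, sigma) for p in current_segment]
        fit = fun_fitness(candidate)
        if fit > best_fit:
            best, best_fit = candidate, fit
    return best
```

In a real game the fitness callback would of course involve simulating the modelled driver on the candidate segment, which is much more expensive than this sketch suggests.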

And of course we have been thinking a bit about how this general idea might be extended to other types of games; we just haven't had any time to do experiments yet...

Wednesday, July 18, 2007

The IEEE CEC 2007 car racing competition

Finally, we've got the CEC version of the car racing competition up and running. Feel free to participate! Indeed, please participate! We're very eager to have as many participants as possible, using approaches as diverse as possible to develop their controllers. It's OK if you don't win - at least, it's OK for me...

I quote from the mail I just sent out to the CIG mailing list:

The competition is an incrementally evolved version of the competition
run for the Computational Intelligence and Games Symposium in April.
If you participated in that competition, you will be able to adapt
your submission to the new format with minimum effort. Even if you
have never heard of the competition before, the software is designed
to be as easy as possible to get started with.

Some main changes when compared to the CIG version of the competition are:

* A prize of 500 US Dollars is awarded to the winner. This is subject
to the winner being a registered attendant at CEC, and to at least 5
of the competitors registering for CEC.

* The software package and API have been extended to better
accommodate value function based control, and the software comes
complete with examples of temporal difference learners and genetic
programming controllers as well as various types of neural networks
and evolutionary algorithms.

* The submission format has changed in order to make sure that any
competitor (as well as the organizers) can easily download and run any
other competitor's submission.

Apart from that, various bug fixes have been made, and the competition
score method has changed slightly.

Tuesday, July 17, 2007

Me, Myself, I, etc.

It's now three weeks since I handed in my PhD thesis. I'm still trying to find out how to wind down; I don't think I've ever been as exhausted as right after handing it in.

The title of the thesis is "Optimization, Imitation and Innovation: Computational Intelligence and Games". It contains the experimental sections of most of the papers I've published so far (I had to omit the Sudoku ones to keep the thesis focused, and also to keep the length down - it is already quite a massive heap of paper). It also contains a number of background chapters situating my research in the context of evolutionary robotics and of game AI, and trying to define some sort of taxonomy of approaches to computational intelligence and games.

I will of course make it available for download from my home page, but not until I've finished my corrections, which will be issued when I've had my viva, which I really hope will take place in early September. Oh yes, and I need to pass the viva as well. Fingers crossed. In the meantime, if you're interested in a copy of the uncorrected version, just mail me.

Assuming I pass my viva, I will then start my new job as a postdoctoral researcher at IDSIA in Lugano, Switzerland, in November. There, I will be working with Juergen Schmidhuber, who is quite famous for his work on reinforcement learning and recurrent neural networks. The place is full of other intelligent people doing great research as well, such as Faustino Gomez, who does very interesting work on neuroevolution. That I'm excited about this goes without saying.

GECCO 2007 conference report

A little late, but I figured I should write something about GECCO, even though so many others have.

I was first author on one paper, which I also presented, and second author on two other papers and a poster, so the conference was quite busy for me. Literally. There was a lot of walking through long corridors, and running back through the same corridors after figuring out a wrong turn had been taken somewhere. Let this be my one comment on the organisation of this year's GECCO: UCL is not a good conference venue. Sure, it's in London, and London is one of the world's capitals and very easily accessible through cheap flights and all that, but UCL is the sort of ancient labyrinth where you would expect to bump into a minotaur at any time. Or at least some trolls, or Jeremy Bentham. There was not a single room where all of the conference attendees could fit at once, severely limiting the potential for these all-important random encounters with other researchers, and some of the talk venues seemed to be ten minutes on foot from each other - if you could find them.

I can safely say that the recent SSCI in Hawaii and CEC in Vancouver were better in at least these respects.

But on to the important question: were the papers any good? Better or worse than CEC?

I don't know if I can, and want to, answer that question. As it is physically impossible to see more than perhaps a fifth of the papers, and I didn't even see that many, it's a bit preposterous to have a firm opinion on that. Also, it is a well-known but seldom-talked-loudly-about fact that there is a certain unhealthy animosity between CEC and GECCO, and I don't want to isolate myself from either of these communities. I suppose it's fair to say that the quality is at least comparable, but that the conferences have slightly different focus, with somewhat more of e.g. GP and EDA at GECCO, and somewhat more of the stuff I'm most interested in (e.g. games, robotics, neural nets) at CEC.

Then again, one could make the argument that both of these conferences have their entry barriers set a bit too low. That's why I like GECCO's approach of not treating poster presentations as full papers, in order to enforce more of a separation between contributions of different quality; but I would rather see the oral acceptance rate lowered a bit, so that some of the less original studies that were presented as full papers this time were accepted as posters instead. Just my 2p.

Oddly enough, I don't know which papers won the best paper awards. They were presented in a session so early in the morning that no-one could reasonably be expected to attend, and the actual awards (as opposed to the nominations) are not to be found on the GECCO homepage. No-one seems to have blogged about it, either. My own picks for two of the tracks would be the paper on HyperNEAT by David D'Ambrosio and Ken Stanley, and the paper on learning noise by Michael Schmidt and Hod Lipson; the first because it is a really cool new idea which might initiate a new paradigm in developmental systems (in addition to the cell chemistry and graph rewriting paradigms), and the second because the general idea might prove very useful for modelling the dynamics of physical robotic systems, something I've become rather interested in myself recently. As for the other tracks, I didn't see all the best paper nominees, so I don't really have an informed opinion.

Does any of you actually know which papers won the awards?

Wednesday, June 27, 2007

The problem with the iPlayer

So the BBC is going ahead with making much of their content available online - through a system crippled by Microsoft's DRM. The reason they give is that "the rights holders - the people that make the programmes, from Ricky Gervais to the independent producers that account for up to a third of our programming - simply wouldn't have given us the rights to their programmes unless we could demonstrate very robust digital rights management."

Alright. But what about the programmes that the BBC produces itself? Shouldn't they by default be exempt from DRM? As for the independent producers, the BBC by virtue of its size should be in such a bargaining position that it could force them to accept DRM-free distribution. But apparently the corporation hasn't got enough spine for that. Sad.

DRM is fundamentally at odds with the spirit of public service. Who is going to stand up against DRM if not the public service media corporations? And what's the point in public service at all if it bows to commercial media's ideas about DRM?

Wednesday, June 06, 2007

Where is Julian?

Is he nowhere to be seen? That is only because he is writing up his thesis. He's been hiding in Ekerö, Sweden for quite a while now, where everything is quiet, idyllic and there is nothing to disturb his thesis writing. See for yourselves how idyllic it is:



But now he's back in England! Rumours have it that he's trying to get the thesis ready for submission in two weeks' time. Lots of work, then. Probably not much time for blog posting until that's done.

Thursday, April 19, 2007

Grand and Molyneux on game AI

The Guardian has an article containing short interviews with Peter Molyneux and Steve Grand, two people who have managed to put out commercial games (in one case arguably commercially successful as well) containing "real" AI. Interesting read. They're both essentially pushing the idea that as games get ever prettier, the stupidity of current game "AI" will shine through more and more, and so the need for "real" AI will increase, not decrease.

I say maybe. While Molyneux's games are a great source of inspiration, it's possible that they and their likes will always constitute a niche market, and that your average FPS, RTS or movie tie-in adventure will never benefit from a neural network or evolutionary algorithm. But I do hope that I'm wrong here.

Whichever the case, we can still use commercial games for academic research just the same, in order to help us understand natural and computational intelligence. And I think we should. Much more than today.

Tuesday, April 17, 2007

Back from Hawaii, back to reality...





Finally, my first "normal" day since coming back. That is, I plan to spend most of my day in the lab... reviewing some papers for CEC, answering some mails, and starting to write my thesis.

Yes, that's right. The plan is to start writing my PhD thesis. Today. So maybe it's not that normal a day after all. Further, according to the plan, I will hand in my thesis in June and have a viva in September. It's not impossible, I believe.

The IEEE Symposium Series on Computational Intelligence was a good event from a scientific perspective, and an excellent one from a networking perspective. I spent plenty of time drinking cocktails with Games and ALife people, discussing research ideas and completely unrelated stuff.

After the conference we spent a few more days in Hawaii, and me and Hugo went on to do some touristing in San Francisco. Fantastic city, indeed.

Sunday, April 15, 2007

CIG Car Racing Competition results

The winner of the CIG Car Racing Competition is Peter Burrow - congratulations, Pete! He used a modular controller based on two incrementally evolved neural networks, and the nearest competitor, Thomas Haferlach, also used a modular controller based on two neural networks, although CTRNNs rather than the more straightforward networks Pete used.

Aravind Gowrisankar and Matthew Simmerson submitted controllers based on NEAT (NeuroEvolution of Augmenting Topologies), and I submitted a simple hard-coded controller, a simple evolved neural network, and (together with Hugo Marques) a controller based on an evolved neural network for controlling the car together with a copy of the whole simulation environment for predicting which car will reach the current way point first. None of these approaches scored as well as the modular controllers of Pete and Tom, but with some more work they might well do so.

As for where that work could be submitted, we are planning to run another car racing competition for CEC 2007. That would give contestants several more months (until September) to work on their controllers. We haven't yet decided on the exact details of that competition, but plan to finalise them before the end of April. So if you have any ideas about what direction to take the competition in (changing the interfaces? dynamics model? task? etc.) please speak up now!

Saturday, March 31, 2007

In Hawaii

So now I'm in Hawaii, "preparing" for the IEEE Symposium Series which will start tomorrow. Preparing on the beach, that is.

There has been a lot of interest in the CIG Car Racing Competition after the recent media coverage. Unfortunately, the deadline has passed (the results will be presented on Thursday), but given the interest I will definitely look into some way of rerunning the competition, either by attaching it to another conference or by making it into some form of permanent league.

Another "Ask Slashdot" of mine recently got on the frontpage - it's about the Most Impressive Game AI. Check it out. Unfortunately I don't know if I can contribute much to the discussion as there's no Internet on the beach. Or, at least, I'm not bringing a laptop to the beach.

Saturday, March 24, 2007

Slashdot and New Scientist

Thread on Slashdot on the car racing research:

Slashdot thread

Contains some interesting discussion, and was soon picked up by New Scientist:

New Scientist article

Hmm... interesting that I am linking to a post that links back to this blog. Self-referential promotion.

Thursday, March 22, 2007

Mini-Grand Challenge, sort of...

So now it's official: we (a team led by Simon) will build a demonstrator model car with onboard computer control, and organize the first model car competitions at WCCI 2008. The idea is that a smaller car faces the same problems of navigation, control, computer vision etc. as a full-size car does, but is enormously cheaper to build. So it's like a miniature version of the DARPA Grand Challenge that anyone can participate in. Of course, this makes it possible to try riskier approaches that you would never dare try with a full-size car, such as various machine learning techniques. We hope to see lots of innovative entrants to the competition.

Most impressive game AI?

I have the feeling that when developers make the effort to put really sophisticated AI into a game, gamers frequently just don't notice (see e.g. Forza). Conversely, games that are lauded for their fantastic AI are sometimes based on very simple algorithms (e.g. Halo 1). For someone who wants to apply AI to games, it is very interesting to know what AI is actually appreciated. So, what is the most impressive game AI you have come across? Have you ever encountered a situation where it genuinely felt like the computer-controlled opponents were thinking, that there was "someone in there"?

Friday, February 23, 2007

Evolution versus td-learning revisited

One of the papers we are presenting at this year's Computational Intelligence and Games symposium is about comparing td-learning and neuroevolution for learning car racing skills.

The paper (go read it!) contains lots of material, and I won't try to summarize the rather dense methods and results sections here. But let me reflect a bit on some of the major conclusions:



  • First of all, td-learning can be blazingly fast when it works like it should. Of course td-learning could potentially be faster than evolution, as it learns from feedback during the lifetime of an individual, but we didn't expect it to be quite as fast as it sometimes was. A few times we saw a car controller start from tabula rasa and progress to driving decently between waypoints in a few hundred time steps, maybe 20 seconds of simulated time!


  • But to balance the picture, td-learning can be a bitch. Really. Performance is completely unpredictable: the very same parameter configuration gives completely different learning results in successive runs, and the same configuration can learn very well sometimes and not at all at other times. Often, good behaviour that has already been learned is unlearned after a few more epochs. And so on. It is simply much easier to learn something sensible with evolution than with td-learning. And in the end, the best evolved controllers are consistently better than the best td-learned controllers.


  • Which brings us to the question of whether these effects are inherent to the algorithms, or an artifact of Simon and me being much more familiar with evolution than with td-learning. Interesting question. I don't know. We did, however, bring in Thomas Runarsson to help us with the experiments, and he has done quite a bit of td-learning in the past.


  • Another interesting thing that came out of our experiments is how useful it is to have a forward model available. Evolving state value functions consistently outperformed direct control. I think the use of forward models might very well be the next big thing in evolutionary robotics. We have a couple of exciting ideas for how to do this; now we just need time to get working on them...
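The forward-model idea can be sketched roughly like this (toy one-dimensional dynamics and made-up value function weights, not our actual simulator): instead of mapping sensors directly to actions, the controller asks the forward model where each candidate action leads, and an evolved value function scores the predicted states.

```python
# Hypothetical sketch: control via an evolved state value function plus
# a forward model, rather than a direct sensor-to-action mapping.

def forward_model(state, action):
    """Toy dynamics: state is (position, velocity); action is acceleration."""
    pos, vel = state
    vel += action
    return (pos + vel, vel)

def make_value_function(weights):
    """A 'value function' parameterised by evolvable weights."""
    w_pos, w_vel = weights
    return lambda state: w_pos * state[0] + w_vel * state[1]

def act(state, actions, value, model):
    """Pick the action whose predicted successor state scores highest."""
    return max(actions, key=lambda a: value(model(state, a)))

# Usage: with weights rewarding forward progress, the controller accelerates.
value = make_value_function((1.0, 0.1))
chosen = act((0.0, 0.0), [-1.0, 0.0, 1.0], value, forward_model)
```

In this setup evolution only has to find good weights for the value function; the planning-one-step-ahead machinery stays fixed.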



Anyway, that's all for today. Slightly more unstructured than usual. But so am I.

Thursday, February 15, 2007

Car Racing Competition updated

The car racing competition is now updated, with minor changes to the code, submission instructions, and a hall of fame:

http://julian.togelius.com/cig2007competition/

Please consider participating. It's easy to get started, and really fun! (Hah, like I was ever going to tell you it's hard and boring, even if it was... Really, I've tried hard to make it as easy as possible to enter.)

If you decide to give it a go, I would appreciate a mail stating this intention, so I can put your mail address on a list of people to notify in the unlikely event of changes to the rules, code, etc.

Tuesday, February 06, 2007

Sensorless but not senseless

Imagine you were driving a car in a long, dark tunnel, and suddenly your headlights started flickering, going off and on irregularly at intervals of a second or so. What would you do? It seems the only way you could keep from crashing would be to accurately remember the bends of the tunnel. For example, if the lights went out just before a left turn, you would have to predict how long until the turn starts and begin turning at the appropriate moment.

Now imagine you were driving a radio-controlled car, but due to some low-grade engineering there was an unfortunate delay between when you issue a command (such as turning left) and when the command takes effect (angling the wheels). How would you handle this? It seems you would have to predict the effects of your turning, so that you started and stopped turning slightly before you seemed to need to.

These two situations are the inspiration for a paper we (Hugo, me and Magdalena) are presenting at the 2007 IEEE-ALife Symposium in Hawaii. Essentially, we wanted to see whether we could force our controllers to learn to predict. Of course, we used my good old car racing simulator for the experiments. To remind you, this is what one of our evolved controllers looks like when all six sensors are turned on and current: (The strange lines represent the sensors)



Now, let's turn off the sensors intermittently and see what happens: (No lines = no sensors)



Not very pretty. Can we improve on this? We tried, by recording the car driving around a few tracks and trying to teach neural networks to predict the future (what sensor input comes next, given current input and action taken). First, we used backpropagation for this. Combining such a predictor with the same evolved controller as before looks like this:



Better than before, but not much.
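In spirit, the supervised predictor looks something like the following sketch, simplified to a single sensor and a linear model trained by plain gradient descent instead of a neural network with backpropagation; the "driving log" here is synthetic, with a made-up underlying rule the learner has to recover.

```python
import random

# Sketch of the supervised approach: learn to predict the next sensor
# reading from the current reading and the action taken.
random.seed(0)

# Synthetic driving log: next_sensor = 0.9*sensor + 0.5*action
# (this rule is of course unknown to the learner).
data = [(s, a, 0.9 * s + 0.5 * a)
        for s in [random.uniform(0, 1) for _ in range(50)]
        for a in (-1.0, 0.0, 1.0)]

w_s, w_a, lr = 0.0, 0.0, 0.05
for _ in range(200):                       # gradient descent on squared error
    for s, a, target in data:
        err = (w_s * s + w_a * a) - target
        w_s -= lr * err * s
        w_a -= lr * err * a

print(round(w_s, 2), round(w_a, 2))        # → 0.9 0.5
```

On noiseless data like this the weights converge essentially exactly; with real sensor traces and only one prior state to condition on, as the paper discusses, the predictor has a much harder time.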

So we tried another thing. Instead of training the predictor networks to predict, we evolved them for being able to help the controller to drive. It might at first not seem like much of a difference, but in fact it is crucial. Look for yourselves:



Clearly much better. And the difference turns out to be not only quantitative but also qualitative. But before we go into the analysis, let's look at the other task: the delay task. Below is the same good old evolved controller as in the above examples, but with all sensor inputs delayed by three time steps:



Looks like the driver is drunk, doesn't it?

Let's see if we can do something about this. First, we try to predict the current sensory state from the outdated perceptions, using a predictor trained with backpropagation. We then get something like this:



Pretty terrible. The driver went from drunk to stoned.

The next step was to instead evolve a predictor for maximum performance, as we did with the intermittent task above. Again, the result is strikingly different:



So, what's the take-home message from this? That evolution works better than backpropagation for learning predictors? Not so simple. Because when we analyse the various evolved and trained predictors, it turns out that the evolved predictors don't actually do any prediction! In other words, the mean squared error between the predicted next state and the real next state is quite low for the trained predictors, but horribly high for the evolved ones!

So, again, what does this mean? For one thing, the type of neural networks and the data we are using (only one prior state and action) are not enough to predict the next state as accurately as we would have needed. Therefore the predictors we got with supervised learning were not up to the task. Evolution, on the other hand, quickly figures out that accurate prediction is impossible and goes for something else. The evolved predictors instead act as extensions of the controller, changing its behaviour so that it copes better with the missing or delayed data. These changes might include slower driving, a higher propensity for turning one way rather than the other, or making sure that when bumping into walls, the back end of the car goes first rather than the front.
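The evolutionary alternative can be sketched as a simple (1+1) evolution strategy. Here the driving trial is replaced by a made-up fitness function (a smooth peak at arbitrary weights), so this illustrates only the loop itself, not our setup: the point is that the "predictor" weights are judged solely on task fitness, never on prediction error.

```python
import random

# Hypothetical sketch: evolve 'predictor' weights for driving fitness
# rather than prediction accuracy, using a (1+1) evolution strategy.
random.seed(1)

def fitness(weights):
    """Stand-in for a driving trial; peaks at weights (0.3, -0.7)."""
    return -((weights[0] - 0.3) ** 2 + (weights[1] + 0.7) ** 2)

def mutate(weights, sigma=0.1):
    """Gaussian perturbation of every weight."""
    return [w + random.gauss(0, sigma) for w in weights]

parent = [0.0, 0.0]
best = fitness(parent)
for _ in range(2000):              # (1+1) ES: keep the child if it is no worse
    child = mutate(parent)
    f = fitness(child)
    if f >= best:
        parent, best = child, f

# Nothing in this loop ever measures prediction error, which is why the
# evolved module is free to become a behaviour-shaping extension of the
# controller instead of a true predictor.
```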

At least, this is what we think happens. Let's say that the topic merits further study... please read the paper if you're interested.

I'm not so sure if any of the above made much sense to you, dear reader. Is my habit of trying to summarise the main points of whole papers a good one? Or does it all just become compressed to the point of unintelligibility? Tell me!