Sunday, July 30, 2006

WCCI 2006 conference report

As of yesterday, I'm back from Canada. Vancouver is fantastic, and I certainly wouldn't mind moving there. The conference (WCCI/CEC) was not bad either, though a bit too big - how are you supposed to find the information you are interested in among a total of almost 1600 papers, including 400-500 on evolutionary stuff?

There was some excellent stuff presented, including the keynotes by Sebastian Thrun and Risto Miikkulainen, and some individual papers, such as a simple but ingenious co-evolutionary algorithm by Thomas Miconi.

As usual, there was also a lot of "noise" - it is astonishing how many papers are presented that, while not being technically incorrect, make insignificant progress on insignificant topics, and just make you wonder why. Why did anyone bother to write these papers, and then travel far away to present them to not very interested audiences? Because the authors didn't have any better research ideas, and desperately needed to lengthen their publication lists in order to get their PhDs / obtain funding / get tenure etc.? Probably.

As Philip Ball notes in his book Critical Mass, more than half of all scientific papers don't get cited at all, except by their authors. Makes you think. (And no, I haven't had that many citations either - yet...)

Anyway, back to the conference. It all started out with a quite amusing keynote by Robert Hecht-Nielsen, who presented an "Architecture of cognition", speaking (preaching, really) with no shortage of confidence and enthusiasm. What he actually presented was an example application based on a form of Hebbian learning, which could generate complete sentences based on a few trigger words and large amounts of previously scanned text. From a linguistic standpoint his invention indeed seems rather radical, as it produces correct English grammar without any explicit rules of grammar at all. But the suggestion that it has any kind of "real" intelligence is quite stupid, as the sentences produced were most often simply not true, and there is no mechanism by which it can learn from mistakes or suppress false information. Intelligence requires some sort of interaction with the environment, and I'm more prepared to say that a thermostat is intelligent than that Hecht-Nielsen's program is. As someone at the conference said, what he presented was a bullshit engine. A good one apparently, and potentially useful, but still only a bullshit engine. A little bit like Eliza.
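For the curious, the flavor of the idea can be sketched in a few lines. This is entirely my own toy reconstruction, not Hecht-Nielsen's actual architecture: strengthen associations between adjacent words, Hebbian-style, from training text, then generate by following the strongest learned link from a trigger word.

```python
from collections import defaultdict

def train(sentences):
    """Hebbian-style association: strengthen the link between each
    word and the word that follows it in the training text."""
    weights = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            weights[a][b] += 1
    return weights

def generate(weights, trigger, max_words=8):
    """Generate by repeatedly following the strongest association."""
    out = [trigger]
    while len(out) < max_words and weights[out[-1]]:
        nxt = max(weights[out[-1]].items(), key=lambda kv: kv[1])[0]
        out.append(nxt)
    return " ".join(out)

corpus = ["the cat sat on the mat",
          "the cat chased the mouse",
          "the cat sat on the mouse"]
w = train(corpus)
print(generate(w, "the"))  # grammatical-looking, but says nothing true
```

The output is locally well-formed English, yet there is no notion of truth or feedback anywhere in the loop, which is exactly the bullshit-engine point.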

My own talk received fairly good feedback, and I had some stimulating discussions, including with some industry people. I'm looking forward to PPSN, though, where the format of the presentations is more focused on interaction.

Wednesday, July 12, 2006

Another evolutionary car racing video

Here is an evolved car controller navigating a rather difficult track. I usually do worse than the controller does when I try to drive that track myself, frequently colliding with walls and having to back up and try again. Note that it seems impossible to evolve a controller for this track from scratch; instead I had to start from a controller that was previously evolved to drive several easier tracks. Evolution had to progress in an incremental fashion.
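The incremental trick can be sketched like so. This is a toy construction of my own, not the actual car experiment: the "tasks" are stand-in fitness functions (closeness of a weight vector to a target), and the algorithm is a bare (1+lambda) evolution strategy, with the champion from the easy task seeding evolution on the hard one.

```python
import random

def evolve(fitness, seed, generations=200, offspring=10, sigma=0.1):
    """Minimal (1+lambda) evolution strategy: mutate the champion,
    keep any child that does at least as well."""
    champion, best = seed, fitness(seed)
    for _ in range(generations):
        for _ in range(offspring):
            child = [w + random.gauss(0, sigma) for w in champion]
            f = fitness(child)
            if f >= best:
                champion, best = child, f
    return champion

def make_task(target):
    # Stand-in fitness: negative squared distance to a target vector.
    return lambda w: -sum((a - b) ** 2 for a, b in zip(w, target))

easy = make_task([1.0, 1.0])
hard = make_task([5.0, -5.0])

random.seed(0)
direct = evolve(hard, [0.0, 0.0])                 # from scratch
staged = evolve(hard, evolve(easy, [0.0, 0.0]))   # incremental: easy first
```

On this convex toy landscape both routes succeed; the point of the real experiment is that on the hard track only the staged route does.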

For more on this, read my previous post on evolutionary car racing.

Tuesday, July 11, 2006

Reinforcement learning - what is it, really? And why won't it work?

At the moment I'm working on several car racing-related projects simultaneously, but the one that's receiving the most attention is an attempt to compare evolution with reinforcement learning, to see if I can achieve the same results with both methods.

Well, I suppose I should say that I try to compare evolution with other forms of reinforcement learning. After all, evolutionary algorithms are just one set of ways of solving reinforcement learning problems.

It turns out not to be easy at all to get anything working. I've tried learning the values of state-action pairs from a good driver; this might not really be reinforcement learning but rather some sort of supervised learning, and anyway it doesn't work. I'm now working on simultaneously learning forward models and sensor-state value estimators, which frankly seems unnecessarily complicated.
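For reference, here is the textbook tabular Q-learning loop (the standard algorithm from Sutton and Barto, nothing like my car setup) on a toy corridor world, just to show the state-action value update I keep wrestling with:

```python
import random

# Toy corridor world: states 0..4, actions left (-1) and right (+1),
# reward 1 only on reaching the terminal state 4.
N_STATES, ACTIONS = 5, (-1, +1)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

random.seed(1)
for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # the Q-learning update of the state-action value
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# After training, the greedy policy points right in every state.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
```

In a world this small the table converges easily; the trouble starts when the state is a handful of continuous sensor readings and the values have to be approximated.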

Of course, it must be possible to apply reinforcement learning to car driving, and I'm sure people have done it. But I am pretty sure it has not been done with the limited information I'm giving the controller. Anything is easy when you cheat, and part of my research program is not to cheat.

Anyway, I'm off to CEC in a few days. I'm bringing the Sutton and Barto book to read on the flight; hopefully I'll get an insight or two.