Tuesday, July 11, 2006

Reinforcement learning - what is it, really? And why won't it work?

At the moment I'm working on several car racing-related projects simultaneously, but the one that's receiving the most attention is trying to compare evolution with reinforcement learning, to see if I can achieve the same results with those methods.

Well, I suppose I should say that I try to compare evolution with other forms of reinforcement learning. After all, evolutionary algorithms are just one set of ways of solving reinforcement learning problems.

It turns out not to be very easy at all to get anything working. I've tried learning values of state-action pairs from a good driver; this might not really be reinforcement learning but rather some sort of supervised learning, and anyway it doesn't work. I'm now working on simultaneous learning of forward models and sensor-state value estimators, which frankly seems unnecessarily complicated.

Of course, it must be possible to apply reinforcement learning to car driving, and I'm sure people have done it. But I am pretty sure it has not been done with the limited information I'm giving the controller. Anything is easy when you cheat, and part of my research program is not to cheat.

Anyway, I'm off to CEC in a few days. I'm bringing the Sutton and Barto book to read on the flight, hopefully I'll get an insight or two.

No comments: