Imagine you were driving a car through a long, dark tunnel, and suddenly your headlights started flickering, going off and on irregularly at intervals of a second or so. What would you do? It seems the only way to keep from crashing would be to accurately remember the bends of the tunnel. For example, if the lights went out just before a left turn, you would have to predict how long it would be until the turn starts, and begin turning at the right moment.
Now imagine you were driving a radio-controlled car, but due to some low-grade engineering, there was an unfortunate delay between when you issue a command (such as turning left) and when the command takes effect (angling the wheels). How would you handle this? It seems you would have to predict the effects of your turning, so that you started and stopped turning slightly before you seemed to need to.
These two situations are the inspiration for a paper we (Hugo, Magdalena and I) are presenting at the 2007 IEEE-ALife Symposium in Hawaii. Essentially, we wanted to see whether we could force our controllers to learn to predict. Of course, we used my good old car racing simulator for the experiments. To remind you, this is what one of our evolved controllers looks like when all six sensors are turned on and current: (The strange lines represent the sensors)
Now, let's turn off the sensors intermittently and see what happens: (No lines = no sensors)
Not very pretty. Can we improve on this? We tried, by recording the car driving around a few tracks and trying to teach neural networks to predict the future (what sensor input comes next, given current input and action taken). First, we used backpropagation for this. Combining such a predictor with the same evolved controller as before looks like this:
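To make the setup a bit more concrete, here is a minimal sketch of how a one-step predictor can stand in for the sensors while they are switched off. This is not the code from the paper: the function name fill_missing, the use of None for a missing reading, and the toy predictor in the test are all assumptions made for illustration. The only idea taken from the text above is that the predictor maps (current sensor input, action taken) to the next sensor input.

```python
from typing import Callable, List, Optional, Sequence

State = List[float]


def fill_missing(
    observations: Sequence[Optional[State]],
    actions: Sequence[float],
    predictor: Callable[[State, float], State],
) -> List[State]:
    """Return the sensor state the controller sees at every time step.

    observations[t] is the real reading at step t, or None when the
    sensors are off (hypothetical encoding). actions[t] is the action
    taken at step t. When a reading is missing, the predictor is fed
    the previous (real or predicted) state and the previous action.
    """
    states: List[State] = []
    for t, obs in enumerate(observations):
        if obs is not None:
            states.append(list(obs))
        elif t == 0:
            raise ValueError("need a real reading at the first step")
        else:
            states.append(predictor(states[-1], actions[t - 1]))
    return states
```

Note that predictions are chained: if the sensors stay off for several steps, each prediction is built on the previous one, so errors can accumulate.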
Better than before, but not much.
So we tried another thing. Instead of training the predictor networks to predict, we evolved them to help the controller drive. It might not seem like much of a difference at first, but in fact it is crucial. See for yourselves:
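The difference is in what gets optimised. A rough sketch, assuming a very simple hill-climbing evolution strategy (the paper may well use a different algorithm, and every name here is made up for illustration): the predictor's weights are mutated, and a mutation is kept only if the score does not get worse. The crucial point is that the score would be how well the controller drives with this predictor plugged in, not how small the prediction error is. Here toy_driving_score is just a placeholder for a run of the racing simulator.

```python
import random
from typing import Callable, List, Tuple


def evolve(
    fitness: Callable[[List[float]], float],
    n_weights: int,
    generations: int = 200,
    sigma: float = 0.1,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Hill-climb on predictor weights; higher fitness is better."""
    rng = random.Random(seed)
    best = [0.0] * n_weights
    best_fit = fitness(best)
    for _ in range(generations):
        # Mutate every weight with Gaussian noise.
        child = [w + rng.gauss(0.0, sigma) for w in best]
        child_fit = fitness(child)
        # Keep the child only if it is at least as good as the parent.
        if child_fit >= best_fit:
            best, best_fit = child, child_fit
    return best, best_fit


def toy_driving_score(weights: List[float]) -> float:
    """Placeholder for 'progress around the track with this predictor'."""
    return -sum((w - 0.5) ** 2 for w in weights)
```

Swapping toy_driving_score for a function that returns negative prediction error would turn this into "evolving for prediction", which is a different objective from the one described above.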
Clearly much better. And the difference turns out to be not only quantitative, but also qualitative. But before we go into the analysis, let's look at the other task: the delay task. Below is the same good old evolved controller as in the above examples, but with all sensor inputs delayed by three time steps:
Looks like the driver is drunk, doesn't it?
Let's see if we can do something about this. First, we try to predict the current sensory state from the outdated perceptions, using a predictor trained with backpropagation. We then get something like this:
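For the curious, here is a small sketch of the delay task and of one possible way to use a predictor in it. Both pieces are assumptions for illustration, not the paper's code: DelayedSensors is a hypothetical delay line that hands back the reading from a fixed number of steps ago, and roll_forward applies a one-step predictor once per action issued since that stale reading to estimate the current state.

```python
from collections import deque


class DelayedSensors:
    """Delay line: push the true state, get the state from `delay` steps ago."""

    def __init__(self, delay, initial):
        # Pre-fill so that the first `delay` reads return the initial state.
        self.buffer = deque(
            (list(initial) for _ in range(delay + 1)), maxlen=delay + 1
        )

    def push(self, state):
        # Appending drops the oldest entry because of maxlen.
        self.buffer.append(list(state))
        return list(self.buffer[0])


def roll_forward(stale_state, actions_since, predictor):
    """Estimate the current state from an outdated one.

    predictor(state, action) -> next state is applied once for each
    action taken since the stale reading was recorded.
    """
    state = list(stale_state)
    for action in actions_since:
        state = predictor(state, action)
    return state
```

With a delay of three steps, roll_forward would be called with the last three actions, so any one-step error is compounded three times.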
Pretty terrible. The driver went from drunk to stoned.
The next step was to instead evolve a predictor for maximum performance, as we did with the intermittent task above. Again, the result is strikingly different:
So, what's the take-home message from this? That evolution works better than backpropagation for learning predictors? Not so simple. Because when we analyse the various evolved and trained predictors, it turns out that the evolved predictors don't actually do any prediction! In other words, the mean squared error between the predicted next state and the real next state is quite low for the trained predictors, but horribly high for the evolved ones!
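In case the measure is unfamiliar, this is roughly what "mean squared error between predicted and real next state" means in code. It is a generic sketch, not our analysis script, and the function name and list-of-lists layout are assumptions: each state is a list of sensor values, and the error is averaged over all sensors and all time steps.

```python
from typing import List, Sequence


def mean_squared_error(
    predicted: Sequence[List[float]], actual: Sequence[List[float]]
) -> float:
    """Average squared difference over every sensor value and time step."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    count = 0
    for p_state, a_state in zip(predicted, actual):
        if len(p_state) != len(a_state):
            raise ValueError("states must have the same number of sensors")
        for p, a in zip(p_state, a_state):
            total += (p - a) ** 2
            count += 1
    return total / count
```

A predictor can score badly on this measure and still be useful to the controller, which is exactly the situation described above.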
So, again, what does this mean? For one thing, the type of neural networks and the data we are using (only one prior state and action) are not enough to predict the next state as accurately as we need. Therefore the predictors we got with supervised learning were not up to the task. Evolution, on the other hand, quickly figures out that accurate prediction is impossible and decides to go for something else. The evolved predictors instead act as extensions of the controller, changing its behaviour so that it copes better with the missing or delayed data. These changes might include slower driving, a higher propensity for turning one way rather than the other, or making sure that when bumping into walls, the back end of the car hits first, rather than the front.
At least, this is what we think happens. Let's say that the topic merits further study... please read the paper if you're interested.
I'm not sure whether any of the above made much sense to you, dear reader. Is my habit of trying to summarise the main points of whole papers a good one? Or does it all just become compressed to the point of unintelligibility? Tell me!