How do we know that the next generation will be better?

Published on Author Code Father
How do we know that the next generation will be better?

A genetic algorithm tries to improve at each generation by culling the population. Every member is evaluated according to a fitness function, and only a high-scoring portion of them is allowed to reproduce.

You’re right, though: there is no guarantee that the next generation will improve on its predecessor’s score.

Consider Dawkins’ weasel program: “evolving” the string "Methinks it is like a weasel". Starting from a population of random strings, the fitness function evaluates the closest textual match, which is perturbed to produce the next generation. With a simple crossover reproduction, two high-scoring strings that are combined could very easily produce lower-scoring offspring. Even “asexual” random mutation of a single high-fitness string could lower the child’s fitness.

It’s worth noting, I think, that this is not necessarily a flaw. With this kind of search, there is the idea of local maxima. A member of the population might represent a solution that’s not the optimal result, but is the best that can be achieved without getting worse on the way.

Imagine that the fitness function for the weasel program doesn’t only find the edit distance, but has some notion of “word”, and tests whether the last word of the string is the name of an animal. Any animal name scores well, but "weasel" gets a big bonus.

Now what happens if "Methinks it is like a walrus" is evolved? It scores well. Not as well as the ultimate target string, but better than "Methinks it is like a walrut" or other close variations that could be reached by a single step of mutation.

The walrus string is a local maximum, and the search can get stuck there unless the program allowsthe next generation’s score to be worse.