Assisting Stéphane Doncieux with the practicals on Evolutionary Algorithms over the past few years, I had the chance to learn more about evolutionary algorithms and the DEAP library. I have put together some notes—a high-level summary—with the main terminology and key ideas, which I hope will be useful for newcomers.
“all species of organisms arise and develop through the natural selection of small, inherited variations that increase the individual’s ability to compete, survive, and reproduce.”
Darwin
Evolutionary algorithms take inspiration from the Darwinian theory of biological evolution and rely on mechanisms such as mutation, selection, and recombination of candidate solutions to solve optimization problems. An example of an optimization problem is teaching a simple creature to walk (https://rednuht.org/genetic_walkers/).
In order to promote upright walking, each creature receives points based on how high the head is in relation to its feet, and how much it advances while upright. Bonus points are given for each proper step forward.
Vocabulary
- A mutation is an alteration in the nucleic acid sequence of the genome of an organism, a virus, or extrachromosomal DNA, resulting in genetic variation that may or may not affect the organism’s observable characteristics (phenotype). In genetic algorithms, mutation is used to maintain and introduce diversity in the population.
- Selection, in the sense of selective breeding, is the process by which humans breed animals and plants to selectively develop particular phenotypic traits (characteristics). In genetic algorithms, selection is the stage in which individuals are chosen from a population for later breeding (using the crossover operator).
- The crossover operator, also called recombination, is a genetic operator used to combine the genetic information of two individuals to generate new offspring.
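As an illustration, the three operators above can be sketched for bit-string individuals. This is a minimal sketch with hypothetical function names, not the API of DEAP or any other library:

```python
import random

random.seed(0)

def mutate(bits, rate=0.1):
    """Bit-flip mutation: flip each bit independently with probability `rate`."""
    return [1 - b if random.random() < rate else b for b in bits]

def crossover(parent_a, parent_b):
    """One-point crossover: swap the tails after a random cut point."""
    point = random.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]

def select(population, fitness, k=3):
    """Tournament selection: pick the fittest of k randomly sampled individuals."""
    contenders = random.sample(population, k)
    return max(contenders, key=fitness)

# Toy usage: individuals are bit strings, fitness is the number of 1s.
pop = [[random.randint(0, 1) for _ in range(8)] for _ in range(6)]
parent_a = select(pop, sum)
parent_b = select(pop, sum)
child_a, child_b = crossover(parent_a, parent_b)
print(mutate(child_a))
```

In DEAP these roles are played by ready-made operators registered in a toolbox; the sketch only shows the mechanics.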
The success of a solution is evaluated based on its fitness. An individual fits its environment when it survives and is therefore more likely to reproduce. In evolutionary algorithms, the fitness function ranks candidate solutions to a problem.
Canonical evolutionary algorithms start by choosing an encoding of candidate solutions as chromosomes (i.e., the set of parameters that define a proposed solution) and defining the fitness function; they then randomly initialize a population and repeatedly select which individuals survive and reproduce.
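Those steps can be sketched as a complete toy loop on the classic “OneMax” problem (maximize the number of 1s in a bit string). All names and parameter values below are illustrative choices, not a prescribed implementation:

```python
import random

random.seed(1)

GENOME_LEN, POP_SIZE, GENERATIONS = 16, 20, 30

def fitness(bits):
    """OneMax: the more 1s, the fitter the individual."""
    return sum(bits)

def tournament(pop, k=3):
    """Select the fittest of k random individuals."""
    return max(random.sample(pop, k), key=fitness)

# 1. Randomly initialize a population of bit-string chromosomes.
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for gen in range(GENERATIONS):
    offspring = []
    while len(offspring) < POP_SIZE:
        # 2. Select two parents, 3. recombine (one-point crossover), 4. mutate.
        a, b = tournament(population), tournament(population)
        cut = random.randint(1, GENOME_LEN - 1)
        child = a[:cut] + b[cut:]
        child = [1 - g if random.random() < 0.05 else g for g in child]
        offspring.append(child)
    population = offspring  # 5. the offspring become the next generation

best = max(population, key=fitness)
print(fitness(best))
```

With selection pressure and a little mutation, the best fitness climbs well above the random-initialization average of `GENOME_LEN / 2`.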
Technically, evolutionary computation is an example of heuristic search, i.e., search by trial and error, where in evolutionary algorithms, the ‘trials’ are candidate solutions, and the ‘error’ is the measurement of how far a trial is from the desired outcome. The error is used to select which trials are to be used to generate further trials. The fundamental rule of thumb is that the best chance to reduce the error further is to create new trials by making modifications to the previous trials that had the lowest errors. [sciencedirect.com]
At their core, evolutionary algorithms operate on the premise that a population of potential solutions, often represented as individual candidate solutions, undergoes a process of selection, crossover (recombination), and mutation, emulating the fundamental mechanisms of genetic inheritance and variation found in nature. The primary objective is to gradually evolve the population toward better solutions, ultimately converging towards an answer that meets the desired accuracy or optimization criteria.
One of the key applications of evolutionary algorithms is in policy optimization, where the goal is to find an optimal set of parameters or decision-making rules for a given system. By leveraging evolutionary principles, these algorithms can efficiently explore complex, high-dimensional solution spaces, making them particularly suited for problems where traditional optimization methods may falter.
In the quest for optimal solutions, evolutionary algorithms often begin with relatively small populations of candidate solutions. These populations evolve over multiple generations, with each generation’s individuals being evaluated, selected, and modified to yield improved offspring. The random nature of mutation and the competition for survival among individuals ensure diversity in the population, allowing the algorithm to explore a wide range of potential solutions.
Elitist-like Evolutionary Algorithms
Elitism is an important concept in evolutionary algorithms. Discussing it requires three essential notions: the individual, the population, and the generation. Let’s take a closer look at each of them:
- Individual: an individual represents a possible solution to an optimization problem. Individuals can be modeled in various ways depending on the specific problem being addressed.
- Population: a population consists of a set of individuals, which means it encompasses multiple possible solutions to the optimization problem being considered at the same time.
- Generation: a generation refers to the population that exists at a specific point during the execution of an evolutionary algorithm. As the algorithm progresses, new generations emerge, and the solutions to the optimization problem are refined.
In short, we can see elitism in evolutionary algorithms as a concept that enforces one or both of the following conditions:
- The straightforward inclusion of the best individuals of a generation in the population of the following generation.
- The use of the best individuals of a generation at least a minimum number of times as crossover partners for other, randomly chosen individuals when creating the following generation.
Thus, before applying elitism to create new generations of individuals, we must evaluate the current generation to find the best individuals. For that, evolutionary algorithms calculate and compare an optimization function (or objective function) for each available individual of a population.
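A minimal sketch of that evaluation step, assuming an objective function that is maximized (the `elite` helper is a hypothetical name introduced here for illustration):

```python
def elite(population, objective, fraction=0.1):
    """Return the best `fraction` of the population, ranked by `objective`
    (assumed to be maximized). Always keeps at least one individual."""
    n = max(1, int(len(population) * fraction))
    return sorted(population, key=objective, reverse=True)[:n]

# Toy usage: individuals carry a single number, the objective just reads it.
gen = [[3], [9], [1], [7], [5], [2], [8], [4], [6], [0]]
print(elite(gen, lambda ind: ind[0], fraction=0.2))  # → [[9], [8]]
```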
The main purpose of using elitism in evolutionary algorithms is to keep a reference to promising areas of the search space across generations. In practice, elitism enables the continuous exploitation of these promising areas (where we may find a local or global optimum).
Moreover, elitism guarantees that the best individuals found over the entire run of an evolutionary algorithm are present in the last generation created, which is the final result.
Tuning Elitism in Evolutionary Algorithms
Adopting elitism, however, may also harm evolutionary algorithms. Because it propagates already known individuals to the following generations, the chance of finding new individuals better fitted to solve the problem shrinks. Elitism thus reduces the diversity of a population and decreases the algorithm’s capacity to explore the search space.
There is no concrete, definitive size for the elite group. Usually, its size is set as a percentage of the population size.
A widely employed tuning for elitism considers the following configurations:
- The best 10% of individuals of a population in a generation form the entire elite group
- The best 50% of individuals in the elite group directly propagate to the next generation
- The worst 50% of individuals in the elite group crossover with non-elite individuals
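That tuning could be sketched as follows. The `next_generation` helper and the strategy for filling the remainder of the population are illustrative choices made for this sketch, not a prescribed implementation:

```python
import random

random.seed(2)

def next_generation(population, objective, elite_frac=0.10):
    """Build the next generation under the elitism tuning described above:
    the best 10% form the elite; the elite's top half is copied unchanged,
    and its bottom half is crossed over with random non-elite individuals."""
    ranked = sorted(population, key=objective, reverse=True)
    n_elite = max(2, int(len(ranked) * elite_frac))
    elite, rest = ranked[:n_elite], ranked[n_elite:]
    half = n_elite // 2

    survivors = [ind[:] for ind in elite[:half]]   # best 50% of the elite: copied
    children = []
    for parent in elite[half:]:                    # worst 50%: crossed with non-elite
        mate = random.choice(rest)
        cut = random.randint(1, len(parent) - 1)
        children.append(parent[:cut] + mate[cut:])

    # Fill the remainder of the population from the non-elite (one simple option).
    while len(survivors) + len(children) < len(population):
        children.append(random.choice(rest)[:])
    return survivors + children

# Toy usage: bit-string individuals, fitness = number of 1s.
pop = [[random.randint(0, 1) for _ in range(8)] for _ in range(20)]
new_pop = next_generation(pop, sum)
print(len(new_pop))
```

Note that the best individual of `pop` is guaranteed to appear unchanged in `new_pop`, which is exactly the elitism guarantee discussed above.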
We should note, however, that the best tuning of elitism in evolutionary algorithms depends on the problem and search space characteristics: there is no global and predefined best configuration.
Finally, it is also important to emphasize that employing elitism is not mandatory for evolutionary algorithms. Again, the option to use it or not highly depends on the specific optimization problem being solved.
As this is a work-in-progress post, I will be adding other general definitions:
- Genetic algorithm – This is the most popular type of EA. One seeks the solution of a problem in the form of strings of numbers (traditionally binary, although the best representations are usually those that reflect something about the problem being solved), by applying operators such as recombination and mutation (sometimes one, sometimes both). This type of EA is often used in optimization problems.
- Genetic programming – Here the solutions are in the form of computer programs, and their fitness is determined by their ability to solve a computational problem. There are many variants of genetic programming, including Cartesian genetic programming, gene expression programming, grammatical evolution, linear genetic programming, and multi-expression programming.
- Evolutionary programming – Similar to genetic programming, but the structure of the program is fixed and its numerical parameters are allowed to evolve.
- Evolution strategy – Works with vectors of real numbers as representations of solutions, and typically uses self-adaptive mutation rates. The method is mainly used for numerical optimization, although there are also variants for combinatorial tasks.
- Differential evolution – Based on vector differences and is therefore primarily suited for numerical optimization problems.
- Neuroevolution – Similar to genetic programming, but the genomes represent artificial neural networks by describing structure and connection weights. The genome encoding can be direct or indirect.
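To make the evolution-strategy entry concrete, here is a minimal (1+1) evolution strategy minimizing the sphere function, a standard toy benchmark. The fixed mutation step `sigma` is a simplification for this sketch; real evolution strategies typically self-adapt it:

```python
import random

random.seed(3)

def one_plus_one_es(objective, dim=3, sigma=0.5, steps=200):
    """Minimal (1+1) evolution strategy: perturb a real-valued parent with
    Gaussian noise and keep the child only if it is at least as good
    (the objective is minimized). Returns the start and final points."""
    start = [random.uniform(-5, 5) for _ in range(dim)]
    parent = start
    for _ in range(steps):
        child = [x + random.gauss(0, sigma) for x in parent]
        if objective(child) <= objective(parent):
            parent = child
    return start, parent

sphere = lambda v: sum(x * x for x in v)   # global minimum at the origin
start, best = one_plus_one_es(sphere)
print(sphere(start), "->", sphere(best))
```

Because a child is only accepted when it is at least as good as its parent, the final objective value can never be worse than the starting one.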