Evolutionary Algorithms

1. Main Concept

An evolutionary algorithm is an optimisation algorithm for black-box functions, where the gradient of the objective is unknown. This setting arises, for example, in reinforcement learning. The concept is:

  1. We maintain a population of individuals (different genotypes).
  2. We evaluate each individual's fitness on the black-box function (the expressed individual is the phenotype).
  3. The fittest individuals are selected to reproduce and create the next generation.

There are three main operators:

| Type | Genotype | Crossover | Mutation |
| --- | --- | --- | --- |
| Genetic Algorithm | Binary string | Swap portions of string | Random bit-flip |
| Genetic Programming | Program tree | Swap subtrees | Random symbol changes in tree |
| Evolutionary Strategies | String of reals | Not used | Draw from Gaussian distribution |

They are easy to parallelise, but slower than gradient descent when the gradient is known and the problem is simple.

2. Genetic Algorithms

To define a genetic algorithm, we need to specify how parents are selected and which individuals survive to the next generation:

2.1 Biased Roulette Wheel Selection

The probability of selecting individual $i$ is $p_i = \frac{f_i}{\sum_j f_j}$, where $f_i$ is the fitness of individual $i$. We pick a random number $r \in [0, 1)$ and select the first individual $k$ whose cumulative probability exceeds $r$: $\sum_{i=1}^{k} p_i > r$.
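
A minimal sketch of this scheme in Python (the function name and inputs are illustrative; fitness values are assumed non-negative):

```python
import random

def roulette_wheel_select(population, fitnesses):
    """Select one individual with probability proportional to its fitness."""
    total = sum(fitnesses)                    # assumes non-negative fitnesses
    r = random.uniform(0, total)              # random point on the wheel
    cumulative = 0.0
    for individual, fitness in zip(population, fitnesses):
        cumulative += fitness
        if cumulative > r:                    # first individual whose cumulative share exceeds r
            return individual
    return population[-1]                     # guard against floating-point round-off
```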

2.2 Tournament Selection

  1. Randomly draw two individuals from the population.
  2. Select the one with higher fitness as a parent.
  3. Repeat until enough parents are selected.

This method is less susceptible to premature convergence than roulette wheel selection, and it is easier to parallelise.
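
A sketch of binary tournament selection under the same illustrative conventions; each tournament is independent of the others, which is what makes the method easy to parallelise:

```python
import random

def tournament_select(population, fitnesses, n_parents):
    """Binary tournament: the fitter of two randomly drawn individuals wins."""
    parents = []
    for _ in range(n_parents):
        i = random.randrange(len(population))
        j = random.randrange(len(population))
        winner = i if fitnesses[i] >= fitnesses[j] else j
        parents.append(population[winner])
    return parents
```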

2.3 Elitism

Elitism ensures the best individuals are carried over to the next generation without alteration. This prevents losing the best solutions due to random chance. The number of elites is usually a small, fixed fraction of the population.
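
As a sketch, elitism only changes how the next generation is assembled (the `n_elites` parameter is a hypothetical name for the elite count):

```python
def next_generation(population, fitnesses, offspring, n_elites):
    """Copy the n_elites fittest individuals over unchanged, then fill up with offspring."""
    ranked = sorted(zip(population, fitnesses), key=lambda pair: pair[1], reverse=True)
    elites = [individual for individual, _ in ranked[:n_elites]]
    return elites + offspring[:len(population) - n_elites]
```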

3. Evolutionary Strategies

Here, the genotype is a list of reals, parent selection is uniform, and mutations are drawn from a Gaussian distribution.

3.1 (μ + λ) ES

  1. Randomly generate $\lambda$ individuals.
  2. Evaluate the population.
  3. Select the $\mu$ best individuals as parents.
  4. Generate $\lambda$ offspring $x' = x + \sigma\epsilon$, where $\epsilon \sim \mathcal{N}(0, I)$ and $x$ is a randomly selected parent.
  5. The new population is the $\mu$ parents together with the $\lambda$ offspring.
  6. Repeat from (2).

Usually $\mu < \lambda$. The main challenge of $(\mu + \lambda)$-ES is choosing $\sigma$:

  • A large $\sigma$ converges quickly but makes solutions hard to refine.
  • A small $\sigma$ refines well, but takes longer to converge and can get stuck in local optima.
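
A minimal sketch of the full loop, assuming maximisation and a fixed $\sigma$ (all names and defaults are illustrative):

```python
import numpy as np

def mu_plus_lambda_es(fitness_fn, n_dims, mu=5, lam=20, sigma=0.1, n_gens=100):
    """A minimal (mu + lambda)-ES with a fixed mutation step size sigma."""
    population = np.random.randn(lam, n_dims)                    # 1. random individuals
    for _ in range(n_gens):
        scores = np.array([fitness_fn(x) for x in population])  # 2. evaluate
        parents = population[np.argsort(scores)[-mu:]]           # 3. keep the mu best
        picks = np.random.randint(mu, size=lam)                  # 4. draw lam random parents
        offspring = parents[picks] + sigma * np.random.randn(lam, n_dims)  # Gaussian mutation
        population = np.vstack([parents, offspring])             # 5. mu parents + lam offspring
    scores = np.array([fitness_fn(x) for x in population])
    return population[np.argmax(scores)]
```

For example, `mu_plus_lambda_es(lambda x: -np.sum(x**2), n_dims=10)` should drive the best genotype towards the origin.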

We could update $\sigma$ over time by adding it to the genotype and mutating it alongside:

$$\sigma' = \sigma \exp(\tau \epsilon), \qquad \epsilon \sim \mathcal{N}(0, 1)$$

where $\tau$ is the learning rate. Heuristically, $\tau = 1/\sqrt{n}$, where $n$ is the dimension of the genotype.
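
A sketch of this self-adaptive mutation, using the heuristic $\tau = 1/\sqrt{n}$ above:

```python
import numpy as np

def mutate_self_adaptive(x, sigma):
    """Mutate the step size first, then use the new step size to mutate the genotype."""
    tau = 1.0 / np.sqrt(len(x))                          # heuristic learning rate
    new_sigma = sigma * np.exp(tau * np.random.randn())  # log-normal update keeps sigma > 0
    new_x = x + new_sigma * np.random.randn(len(x))
    return new_x, new_sigma
```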

4. Novelty Search

A novelty search algorithm uses an archive to store previously seen behaviours. The novelty of an individual is measured by its average distance to its $k$ nearest neighbours in the archive:

$$\rho(x) = \frac{1}{k} \sum_{i=1}^{k} \mathrm{dist}(x, \xi_i)$$

where $\xi_1, \dots, \xi_k$ are the $k$ nearest neighbours of $x$ in the archive.

For the distance metric, we define a space of behavioral descriptors, which capture important aspects of the phenotype.

For example, in a robot maze navigation task, the behavioral descriptor could be the final position of the robot. This is not directly related to the fitness (e.g. reaching the goal), but captures the essence of the behavior.

Instead of optimising the quality of the solutions, NS optimises their novelty. We do this by using the novelty score instead of fitness for evaluation.
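
A sketch of the novelty computation, assuming Euclidean distance between behavioral descriptors (the default `k = 15` is illustrative):

```python
import numpy as np

def novelty_score(descriptor, archive, k=15):
    """Average Euclidean distance to the k nearest neighbours in the archive."""
    if len(archive) == 0:
        return float("inf")                  # everything is novel while the archive is empty
    dists = np.linalg.norm(np.asarray(archive) - np.asarray(descriptor), axis=1)
    return float(np.mean(np.sort(dists)[:k]))
```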

5. Quality-Diversity Optimisation

We want both diverse and high quality solutions. QD algorithms follow these steps:

  1. A stochastic selection is made from the collection of solutions. It is usually uniform, but can be biased towards quality or novelty.
  2. Offspring are generated via mutation/crossover.
  3. Each offspring is evaluated for quality (fitness) and behavioral descriptor.
  4. The offspring is added to the collection if it is either:
    • More novel than existing solutions (based on a novelty threshold).
    • Higher quality than the existing solution in the same behavioral niche.

5.1 MAP-Elites

Multidimensional Archive of Phenotypic Elites (MAP-Elites) is a simple QD algorithm that discretises the behavioral descriptor space into a grid and tries to fill it with the best solutions. Each new solution is placed in the cell corresponding to its behavioral descriptor. If the cell is already occupied, the new solution replaces the existing one only if it has higher fitness.

It takes the size of the grid as a hyperparameter. It is easy to implement, but the density of solutions across the grid is not necessarily uniform.
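
A minimal sketch, assuming a 2-D behavioral descriptor normalised to $[0, 1]^2$ and Gaussian mutation only (no crossover); names and defaults are illustrative:

```python
import numpy as np

def map_elites(fitness_fn, descriptor_fn, n_dims,
               grid_shape=(20, 20), n_evals=10_000, sigma=0.1):
    """Minimal MAP-Elites: one elite per cell of a discretised descriptor space."""
    grid = {}                                            # cell index -> (genotype, fitness)
    for _ in range(n_evals):
        if grid:                                         # uniform selection from the archive
            cells = list(grid)
            parent, _ = grid[cells[np.random.randint(len(cells))]]
            x = parent + sigma * np.random.randn(n_dims)     # Gaussian mutation
        else:
            x = np.random.randn(n_dims)                  # bootstrap with random solutions
        fitness = fitness_fn(x)
        desc = np.clip(np.asarray(descriptor_fn(x)), 0.0, 1.0 - 1e-9)
        cell = tuple((desc * np.asarray(grid_shape)).astype(int))  # discretise the descriptor
        if cell not in grid or fitness > grid[cell][1]:
            grid[cell] = (x, fitness)                    # keep only the elite of each cell
    return grid
```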

5.2 Quantifying Performance

We can measure diversity (the archive size) and performance (the maximum or mean fitness), as well as the convergence speed of both. These can be summarised in a single QD-score: the sum of the fitness of all solutions in the archive.
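
For the MAP-Elites archive sketched above, the QD-score is a one-liner (assuming non-negative fitness, so adding a solution never lowers the score):

```python
def qd_score(grid):
    """Sum of the fitness of all elites in the archive."""
    return sum(fitness for _, fitness in grid.values())
```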

The tradeoff between these can be represented in a Pareto front.
