Evolutionary Algorithms
1. Main Concept
An evolutionary algorithm is an optimisation algorithm for black-box functions (where the gradient is unknown), a setting that includes many reinforcement learning problems. The concept is:
- We maintain a population of individuals (different genotypes).
- We evaluate their fitness on the black box function (phenotype).
- The fittest individuals are selected to reproduce and create the next generation.
There are three main operators:
- Selection: Selects solutions that will reproduce.
- Crossover: Mixes the parents' genotypes.
- Mutation: Applies random variation to genotypes; the type and frequency of the variation are design choices.
| Type | Genotype | Crossover | Mutation |
|---|---|---|---|
| Genetic Algorithm | Binary string | Swap portions of string | Random bit-flip |
| Genetic Programming | Program tree | Swap subtrees | Random symbol changes in tree |
| Evolution Strategies | String of reals | Not used | Draw from Gaussian distribution |
They are easy to parallelise, but slower than gradient descent when the gradient is known and the problem is simple.
2. Genetic Algorithms
To define a genetic algorithm, we need to define:
- Fitness function: represents the problem to solve. Maximising this function should lead to the optimal solution.
- Genotype & phenotype: represent the potential solutions to the problem. The genotype (e.g. a binary string) is decoded into a phenotype, which is fed into the fitness function.
- Selection: selects parents for the next generation, for example biased roulette wheel selection (selecting fitter individuals more often).
- Crossover: combines parents to create offspring, for example one-point crossover (split both parents at a random point and swap the parts).
- Mutation: explores nearby solutions, for example by randomly flipping bits. Problem-specific mutations can also be added.
- Stopping criterion: usually a target fitness value, a maximum number of generations, or convergence.
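These components can be sketched as a minimal GA on the classic "one-max" problem (maximise the number of 1-bits in a binary string). All function names and parameter values here are illustrative assumptions, not a canonical implementation:

```python
import random

def one_max(genome):
    """Fitness: number of 1-bits (maximised by the all-ones string)."""
    return sum(genome)

def roulette_select(population, fitnesses):
    """Biased roulette wheel: fitter individuals are picked more often."""
    total = sum(fitnesses)
    r = random.uniform(0, total)
    acc = 0.0
    for ind, fit in zip(population, fitnesses):
        acc += fit
        if acc >= r:
            return ind
    return population[-1]

def one_point_crossover(a, b):
    """Split both parents at a random point and swap the parts."""
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(genome, rate=0.01):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if random.random() < rate else g for g in genome]

def genetic_algorithm(n_bits=20, pop_size=30, generations=100):
    population = [[random.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        fitnesses = [one_max(ind) for ind in population]
        if max(fitnesses) == n_bits:      # stopping criterion: optimum reached
            break
        next_gen = []
        while len(next_gen) < pop_size:
            p1 = roulette_select(population, fitnesses)
            p2 = roulette_select(population, fitnesses)
            c1, c2 = one_point_crossover(p1, p2)
            next_gen += [mutate(c1), mutate(c2)]
        population = next_gen[:pop_size]
    return max(population, key=one_max)

best = genetic_algorithm()
```

A maximum-generations cap doubles as the fallback stopping criterion when the optimum is never reached.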
2.1 Biased Roulette Wheel Selection
The probability of selecting individual $i$ is proportional to its fitness: $p_i = \frac{f_i}{\sum_j f_j}$.
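The selection probabilities can be computed directly from the fitness values; this small sketch (illustrative values) also shows sampling a parent with the standard library:

```python
import random

def roulette_probabilities(fitnesses):
    """p_i = f_i / sum_j f_j: selection probability proportional to fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

fitnesses = [1.0, 3.0, 6.0]
probs = roulette_probabilities(fitnesses)
print(probs)  # [0.1, 0.3, 0.6]

# Sampling a parent index according to those probabilities:
parent = random.choices(range(len(fitnesses)), weights=fitnesses, k=1)[0]
```

Note that this scheme assumes non-negative fitness values; negative fitnesses must be shifted first.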
2.2 Tournament Selection
- Randomly draw two individuals from the population.
- Select the one with higher fitness as a parent.
- Repeat until enough parents are selected.
This method is less susceptible to premature convergence than roulette wheel selection, and is easier to parallelise.
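The three steps above can be sketched as follows (tournament size and the toy population are illustrative assumptions):

```python
import random

def tournament_select(population, fitnesses, k=2):
    """Draw k individuals uniformly at random; the fittest one wins."""
    contenders = random.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]

population = ["a", "b", "c", "d"]
fitnesses  = [0.1, 0.9, 0.5, 0.3]
# Repeat until enough parents are selected:
parents = [tournament_select(population, fitnesses) for _ in range(10)]
```

With k=2, the least-fit individual can never win a tournament; larger k increases selection pressure.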
2.3 Elitism
Elitism ensures the best individuals are carried over to the next generation without alteration. This prevents losing the best solutions due to random chance. The number of elites is usually fixed to a small number of individuals (a small fraction of the population).
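A minimal sketch of elitism, assuming the next generation is assembled from the current elites plus freshly generated offspring (all names and values here are illustrative):

```python
def apply_elitism(population, fitnesses, offspring, n_elites=2):
    """Copy the n_elites fittest individuals unchanged into the next generation,
    filling the remaining slots with offspring."""
    ranked = sorted(zip(fitnesses, population), key=lambda t: t[0], reverse=True)
    elites = [ind for _, ind in ranked[:n_elites]]
    return elites + offspring[:len(population) - n_elites]

pop      = [[0, 0], [1, 1], [1, 0]]
fits     = [0, 2, 1]
children = [[0, 1], [0, 0], [1, 1]]
next_gen = apply_elitism(pop, fits, children, n_elites=1)
```

Here the fittest individual `[1, 1]` survives unchanged regardless of what mutation and crossover produce.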
3. Evolutionary Strategies
Here, the genotype is a list of reals, parent selection is uniform, and mutation noise is drawn from a Gaussian.
3.1 (μ + λ) ES
1. Randomly generate $\lambda$ individuals.
2. Evaluate the population.
3. Select the $\mu$ best individuals as parents.
4. Generate $\lambda$ offspring $x' = x + \sigma \epsilon$, where $\epsilon \sim \mathcal{N}(0, I)$ and $x$ is a randomly selected parent.
5. The new population is the $\mu$ parents plus the $\lambda$ offspring.
6. Repeat from (2).
Usually $\mu < \lambda$. The main challenge of (μ + λ) ES is choosing $\sigma$:
- A large $\sigma$ converges quickly but is hard to refine.
- A small $\sigma$ refines well, but takes longer to converge and can get stuck in local optima.
We can update $\sigma$ over time by adding it to the genotype and mutating it as well: $\sigma' = \sigma \exp(\tau \epsilon)$ with $\epsilon \sim \mathcal{N}(0, 1)$, where $\tau$ is the learning rate. Heuristically, $\tau \propto 1/\sqrt{n}$, where $n$ is the dimension of the genotype.
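The (μ + λ) loop can be sketched as below, using a fixed σ and a toy sphere objective (maximised at the origin); all parameter values are illustrative assumptions:

```python
import random

def sphere_fitness(x):
    """Maximised at the origin: f(x) = -sum(x_i^2)."""
    return -sum(v * v for v in x)

def mu_plus_lambda_es(mu=5, lam=20, dim=3, sigma=0.3, generations=200):
    # 1. Randomly generate lam individuals.
    population = [[random.uniform(-5, 5) for _ in range(dim)]
                  for _ in range(lam)]
    for _ in range(generations):
        # 2-3. Evaluate the population and keep the mu best as parents.
        population.sort(key=sphere_fitness, reverse=True)
        parents = population[:mu]
        # 4. Each offspring is a randomly selected parent plus Gaussian noise.
        offspring = []
        for _ in range(lam):
            parent = random.choice(parents)
            offspring.append([v + random.gauss(0, sigma) for v in parent])
        # 5. New population = parents + offspring (the "+" in (mu + lambda)).
        population = parents + offspring
    return max(population, key=sphere_fitness)

best = mu_plus_lambda_es()
```

Because parents are carried over, the best fitness never decreases between generations.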
4. Novelty Search
A novelty search algorithm uses an archive to store previously seen behaviours. The novelty of an individual is measured by its average distance to its $k$ nearest neighbours in the archive and the current population.
For the distance metric, we define a space of behavioral descriptors, which capture important aspects of the phenotype.
For example, in a robot maze navigation task, the behavioral descriptor could be the final position of the robot. This is not directly related to the fitness (e.g. reaching the goal), but captures the essence of the behavior.
Instead of optimising the quality of the solution, NS optimises the novelty. We do this by using novelty score instead of fitness for evaluation.
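A novelty score over behavioural descriptors can be sketched as follows, using Euclidean distance and the maze example's final (x, y) robot positions as descriptors (the archive values are illustrative):

```python
def novelty_score(descriptor, archive, k=3):
    """Average Euclidean distance to the k nearest behavioural descriptors."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    distances = sorted(dist(descriptor, other) for other in archive)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)

# Final (x, y) positions of robots in a maze as behavioural descriptors:
archive = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
score_common = novelty_score((0.5, 0.0), archive, k=2)  # close to the crowd
score_rare   = novelty_score((9.0, 9.0), archive, k=2)  # far from everything
```

An individual far from everything seen so far gets a high score, so the search is pushed towards unexplored behaviours regardless of fitness.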
5. Quality-Diversity Optimisation
We want both diverse and high quality solutions. QD algorithms follow these steps:
- A stochastic selection is made from the collection of solutions; usually uniform, but it can be biased towards quality or novelty.
- Offspring are generated via mutation/crossover.
- Each offspring is evaluated for quality (fitness) and behavioral descriptor.
- The offspring is added to the collection if it is either:
- More novel than existing solutions (based on a novelty threshold).
- Higher quality than the existing solution in the same behavioral niche.
5.1 MAP-Elites
Multidimensional Archive of Phenotypic Elites is a simple QD algorithm that discretises the behavioural descriptor space into a grid and tries to fill it with the best solutions. Each new solution fills the cell corresponding to its behavioural descriptor; if the cell is already occupied, it replaces the existing solution only if it has higher fitness.
It takes the grid size as a hyperparameter. It's easy to implement, but the density of solutions across cells is not necessarily uniform.
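A minimal MAP-Elites sketch over a 2-D descriptor grid; the toy evaluation function, grid bounds, and mutation scale are all illustrative assumptions:

```python
import random

def behaviour_cell(descriptor, grid_size=10, low=-1.0, high=1.0):
    """Discretise a 2-D behavioural descriptor into integer grid coordinates."""
    def axis(v):
        t = (v - low) / (high - low)
        return min(grid_size - 1, max(0, int(t * grid_size)))
    return (axis(descriptor[0]), axis(descriptor[1]))

def map_elites(evaluate, iterations=2000, dim=4):
    """evaluate(genome) -> (fitness, 2-D descriptor)."""
    archive = {}  # cell -> (fitness, genome)
    for _ in range(iterations):
        if archive and random.random() < 0.9:
            # Mutate a uniformly selected elite from the archive.
            _, parent = random.choice(list(archive.values()))
            genome = [g + random.gauss(0, 0.1) for g in parent]
        else:
            # Occasionally inject a fresh random genome.
            genome = [random.uniform(-1, 1) for _ in range(dim)]
        fitness, descriptor = evaluate(genome)
        cell = behaviour_cell(descriptor)
        # Fill an empty cell, or replace a lower-fitness occupant.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, genome)
    return archive

# Toy task: descriptor = first two genes, fitness = negated squared norm.
def evaluate(genome):
    return -sum(g * g for g in genome), (genome[0], genome[1])

archive = map_elites(evaluate)
```

The archive itself is the output: one elite per behavioural niche rather than a single best solution.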
5.2 Quantifying Performance
We can measure diversity (archive size) and performance (max or mean fitness), as well as how quickly each converges. Both can be summarised in a QD-score: the sum of the fitness of all solutions in the archive.
The tradeoff between these can be represented in a Pareto front.
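The QD-score is a one-line reduction over the archive; this sketch assumes a MAP-Elites-style dictionary mapping cells to (fitness, genome) pairs, with illustrative values:

```python
def qd_score(archive):
    """Sum of the fitness of every elite in the archive.

    Rewards both coverage (more filled cells) and quality (higher fitness),
    assuming fitness values are non-negative."""
    return sum(fitness for fitness, _genome in archive.values())

archive = {(0, 0): (2.0, "a"), (1, 3): (5.0, "b"), (4, 4): (1.5, "c")}
print(qd_score(archive))  # 8.5
```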