I’ll be assuming that you know what these terms mean:
- Cost/objective function
- Local/global optima
If you don’t, that’s what Wikipedia and weekends are for.
What Is It?
Genetic algorithms are an optimization technique loosely based on the ideas of evolution and mutation. They’re a metaheuristic, and can used as a generic framework for solving all sorts of problems.
They’re also stupid simple, and I’ll try to prove that to you in this post.
The goal is to find the optimal solution to a problem, or one that’s close to it.
I’ll be using a lot of terminology, but it’ll all be defined properly later. This section is just to get the basic idea.
To start, we generate a bunch of candidate solutions to the problem. Each candidate is called an individual in the spirit of biology. The set of all individuals is the population.
Every iteration of our algorithm (a.k.a each generation), we evaluate the fitness of each individual and keep some amount of the fittest individuals. The rest are culled, except for a few that we randomly select to introduce some diversity that will help us find the global optima.
We create the next generation by breeding the survivors to create children.
Our next big inspiration from biology is mutation. Some fraction of the children are mutated at birth to introduce more diversity in a population.
We keep on doing this until the average fitness of the population has converged to something or we’re bored.
Why Should You Care?
This technique is worth learning if you care about optimizing things, because:
- Can be run in parallel
- Easily distributed
- Doesn’t require fancy tuning for the problem at hand
- Derivative-free, i.e. can be used for problems with lots of discontinuities
- Generic, easily adapted to new problems with minimal tweaking
How To Implement?
Terminology
We start here, since a lot of the barrier for learning these (for me) was the unfamiliar terminology.
- Individual
- A candidate solution for the problem.
- Population
- A set of individuals.
- Fitness Function
-
A function that takes in an individual and returns a real number representing how good a solution the individual is.
This is is usually maximized, but we’ll be minimizing it. So “fitness” is a misnomer, and it’s really a cost function as we’ll be using it.
The Process
Lots of the following steps are independent of the specific problem you’re trying to solve, and I’ll mark the ones that are generic to illustrate that.
To make this more concrete, we’ll be trying to optimize few benchmark functions.
I picked the Rastrigin function because it was first on the list and was easy to type in Python.
We optimize in the interval between \([-5.12,5.12]\) because Wikipedia told me to.
def rastrigin(x, A=10):
return 10 * len(x) + sum((i ** 2 - A * np.cos(2 * np.pi * i)) for i in x)
def sphere(x):
return sum(i ** 2 for i in x)
def booth_fn(x):
return (x[0] + 2 * x[1] - 7) ** 2 + (2 * x[0] + x[1] - 5) ** 2
f = rastrigin
Define the Individual
This is problem-specific. It should be a candidate solution to your problem.
In our case, we just want points in \(N\)-dimensional space, so we just use an array of length \(N\). We’re just going to work in 1-dimensional space to make the computation really fast, but feel free to try it in higher dimensions.
def individual(bound: float=BOUND) -> Individual:
return np.random.uniform(low=-bound, high=bound, size=N)
Define Fitness Function
We’ll be calling this fitness instead of loss for the sake of consistency, but we’re going to be minimizing it.
In our case this is simple. It’s just the objective function we’re trying to minimize, the Rastrigin function.
def rastrigin(x, A=10):
return 10 * len(x) + sum((i ** 2 - A * np.cos(2 * np.pi * i)) for i in x)
f = rastrigin
def fitness(individual: Individual) -> Real:
return f(individual)
Create Population
This is barely problem-specific. You just create some number of individuals. I feel like you could write a more abstract function that makes this generic, but it’s easy to do yourself.
def create_population() -> Population:
return [individual() for _ in range(POPULATION_SIZE)]
Find Average Fitness Of Population (generic)
This is literally just the mean fitness over the population.
def avg_fitness(population):
return sum(fitness(individual) for individual in population) / len(population)
Cull (generic)
Now we get to the law of the jungle. Survival of the fittest. Or rather, death of the un-fittest.
We sort our population by fitness and keep SURVIVAL_RATE
percent of
them. A few lucky individuals also get spared for the sake of genetic
diversity.
def cull(population) -> Population:
population.sort(key=fitness)
split_idx = ceil(SURVIVAL_RATE * len(population))
# they survive
fittest = population[:split_idx]
reprieved = random.sample(
population[split_idx:], floor(REPRIEVE_RATE * len(population[split_idx:])))
return fittest + reprieved
Breed and Mutate
Breed
The fun (and tricky part): how to combine parents to get a child.
This is extremely problem-specific, and clever breeding (and mutation) will make the difference between success and failure. Figuring out good strategies comes with practice and understanding the problem domain.
I tried 4 different strategies and the one that worked best was not the first one I thought of.
But for us, we’re going to us a really simple strategy. We just pick the fittest individual out of a random sample and copy it.
Mutate
Mutation is also problem-specific, and as important as breeding. In our
case, mutation will be the main method of solving the problem. We’ll
perturb each array by some number EPSILON
, and set the mutation rate
high so that this happens often.
We use np.clip
to avoid escaping the area we’re optimizing over.
def breed(parents: Population, mutation_rate: float=MUTATION_RATE) -> Individual:
child = min(
random.sample(parents, NUM_PARENTS),
key=fitness, )
if random.random() < MUTATION_RATE:
child += np.clip(
np.random.uniform(low=-EPSILON, high=EPSILON, size=N),
a_min=-BOUND,
a_max=BOUND, )
return child
Repeat (generic)
Now it all comes together. We repeatedly cull and breed until our average fitness is close enough to optimal or we decide to stop after some number of iterations.
The key line is population = create_new_generation(cull(population))
,
where we cull and then create a new generation from the survivors.
def create_new_generation(population) -> Population:
num_children = POPULATION_SIZE - len(population)
num_parents = min(len(population), NUM_PARENTS)
population += [breed(random.sample(population, k=num_parents)) for _ in range(num_parents)]
return population
def evolve():
population = create_population()
FITNESS = avg_fitness(population)
iter = 0
while FITNESS > 1e-7 and iter < MAX_ITERS:
population = create_new_generation(cull(population))
FITNESS = avg_fitness(population)
iter += 1
if iter % 1000 == 0:
best = min(population, key=fitness)
print('--------------------------------------------------------------')
print(f'FITNESS = {FITNESS}, iter = {iter}, BEST = {fitness(best)}')
print(f'iter = {iter}')
best = min(population, key=fitness)
return best
How Well Does It Do?
In the case of our problem setup, it tends to do fairly well. Often it gets stuck at a local minima of about .995, but equally often it finds the global minima in anywhere from 2 (!!) to a few thousand iterations.
Tweaking hyperparameters
You sort of have to play around with parameters like population size, survival rate, mutation rate, and reprieve rate.
You could probably write another genetic algorithm to tweak the hyperparameters of your genetic algorithm. I’ll leave that as an exercise.
Takeaway
Hopefully genetic algorithms will stick in your head long enough for you to find a use for them. When that day comes, read this post again. Soon enough, you’ll be able to make genetic algorithms a part of your toolbox.