Evolutionary Algorithms

Computer scientist Joseph Brown on Darwinian evolution, genetic algorithms, and the zombie apocalypse

faq | August 13, 2016

An evolutionary algorithm (EA) is an overarching title for the approaches in computer science that try to use the principles of Darwinian evolution in order to come up with solutions to problems. Darwin posited that organisms, specifically species, arise and develop via a process of natural selection of small inherited changes in traits. These natural selections increase the ability of the organism to survive, compete with other organism for limited resources, and reproduce. It explains the variation and development over time of multiple species, with finite populations, interacting in a world with limited resources.

Evolutionary algorithms aim to mimic the biological evolution in a digital analogue of it. However, despite of being called evolutionary algorithms, they are more similar to artificial breeding– where several generations of solutions are combined together in order to grew something that is better. One can also say that evolutionary algorithm is a method to work on a problem that you do not know much about.

History of Development

You are looking around 1970-s, late 1960-s. There were four major flavors of evolutionary algorithms, which at that point were all doing the same sort of thing: take a problem, find a fitness function against it, breed the best solutions together and come up with the next generation of problems. They were very much separate from each other and they all did it in slightly different ways.

There was Lawrence Fogel who did evolutionary programming, John Holland who did genetic algorithms, John Koza who did genetic programming, and Hans-Paul Schwefel who did evolutionary strategies. Those four were the founders of these different branches. There was actually a big fight for the precedent of who came up with evolutionary approach first and the fight was not over until this overarching title of evolutionary algorithms was put on top of it, bringing it all together.

Computer scientist Thomas Schmickl on bio-inspired robotics, virtual embryogenesis, and symbiotic multi-robot organisms
The base of the algorithm is rather similar but the major differences were on how the problem was represented. Originally the genetic algorithm was only working on binary strings. Genetic programming was only working with tree representations. As the researches started to work on the applications more intensively, they realized that they were all doing the same process and this was just one big algorithm. The evolutionary algorithms summed up that originally was just an application.

The development was rather slow and continuous. There have been a few breakthroughs, mostly in a way when representations were realized to be way more important – how you represent your problem, will give you better or worse results.

The way to put it is that EA is a framework of attacking a problem. Another buzz word in computer science is “Neural networks”. You can say that neural networks would be a representation that you could use and EA would be an approach you can use to train the parameters of neural networks.

Solving Problems with EA

You start with some problem – it is usually an optimization problem. There is a thing to be optimized – for example an aircraft wing with a bunch of positions across it. We first have to come up with what our structure is that represents the problem. Solution to this aircraft wing would be strict set of values. We start with a random population of wings – a bunch of different wings. We also have a way of measuring the ability of a wing to do its job, its property. In our example this property is lift, which should be maximized. Then our fitness function is based on this value.

Next step is to go through the starting population and evaluate all wings based on a fitness function. After all wings have been scored, the winning piece comes in – one has to select best wing based on predefined rules which wing is good enough to go on to the next stage. The best wings will be selected, the worst removed. For example, two wings are selected for breeding and two random values in the population are also selected: the one who has better X will be parent #1, who has better Y will be parent #2. After the selection is done, variation operators are applied.

There are two major types of variation operators. One is a binary sibling of two parents – you take a little bit of one wing and you take a little bit of the other. This is known as a crossover. Crossover is generally exploratory. Just like in a DNA we have half of our traits coming from our mother and half from our father. The crossover operator could be designed differently, not only 50-50%, it all depends on the representation aspects.

The second variation operator only applies to one of them and is called mutation. Mutation allows making very small changes. You will go through population and a mutation operator will make a very small random change in the string of numbers. Although the change is random, it is based upon a distribution. The operator rolls a random number based on a Gaussian normal distribution curve and a small Gaussian curve, then the number is added to that value (e.g. the position on the wing). You are making very small adjustment changes to it.
After this whole new generation has been bred, you will go and do this again and again and again. Because the selection operator (which selects those ‘chromosomes’ which go on to breed based upon an evaluation of the fitness of the ‘chromosome’) is putting pressure to make fitness value higher, over time the generations will have this value better and better.

The number of rounds depends on type of the problem. Normally there will be initially a rather high increase in fitness and eventually it will level-out. It will be run it until the fitness value will stop changing. There is an asymptotic level at the top and it is better to stop when fitness value hits that asymptote.
Of course the evolution algorithm does not always produce the same result. The algorithm should be run a few different times in the end producing not a single solution, but multiple different solutions to select from.

Selection Mechanisms

There are a few general techniques that are used for defining the best fitness. The one is a tournament selection, and this is similar to a practicum tournament – there is a scoop of the population of which everyone has the highest fitness, those are the ones that go on. Another known technique is rule-out selection – here you give a probability of being selected relative to the fitness of the creature. Let us say there are five wings and the first wing scores 3, the second scores 7. The first will have 3 chances to go on, and the second 7 chances respectively.

How big is enough to start the evolutionary algorithm? Of course it depends on the problem. There are some that use smaller populations, and others, with a rather empirical set. Often it is a number around 50-100. If a population is not large enough, you end up with less diversity.

There are two competing factors, the one is exploration – you want to find different areas in the space, you want to make new and different aircraft wings. And there is exploitation – you do not want to give up the good wings that you already have, but you also want to search in the areas that have not seen before. Exploitation is about making some small adjustments in order to be able to find something that is a little bit better. Large populations favor the ability to explore whereas smaller populations would force it to be more exploitative.

Requirements to Launch the EA

There should be a way of describing what a good wing is in order to do the evaluation. One has to know what a good wing is, but one does not have to know how to construct a good wing. For example, one can say:“a good research article/piece of writing is the one that has no spelling mistakes”, and he can measure a number of spelling mistakes in a document. But, he might not be a very good speller. Even if he knows what a good spelling is, he does not know how to get to it. The property of being able to say– “one can define what a good thing is but one does not know what the exact values are” – is very important.

There is a lot of application of EA in the computer generated art or in video games level design. One might not know how to write a good-looking level, but one can put certain rules, answering that. EA will meet with the criteria and it will do it in a way that one might not be able to do or might not think of.
The biggest problem is a definition of what is a good thing. This is a hardest part for a lot of things. How does one know what is good? This is partly based on an assumption that good things look like other good things – and for the majority of problems this assumption holds.

Link Between EA and Natural Selection

“Survival of the fittest” is not that attributable to Darwin. Herbert Spencer defined this phrase. Fitness, biologically, is defined as the survival of one’s children and the ability to give birth to the next generation. It is a post-priori idea of what fitness is. Whereas, the digital version of fitness is what your chances of breeding the next generation are based on a pre priori. It is more a cartoon of biology, not exactly Darwinian model, in order to have the same sort of element of selection. EA would be closer to how a farmer breeds a cow or makes a new plant compared to what happens in nature.

Biologist Arkhat Abzhanov on principal component analysis, molecular-based phylogenetic tree, and zebrafish embryos
The other difference is that in EA we are only looking at the evolution of one species, there is no interaction. In natural selection there are predators killing offsprings or all different kinds of systems.

Applications Evolutionary Algorithms

EA are mostly used in optimization problems and they are usually used in the optimization problems where we do not have an idea of a closed form solution, such as an equation computable by taking a derivative based on a known gradient, it might be a complex network or non-linear problem. They are all about going to the state that one does not quite understand and being able to get a good idea of a solution in that state.

These optimization problems are all across different fields. A lot of problems can be rewritten as optimization problems. The simplest example would be the level design – why is that an optimization problem? One has to create an interesting design. What is an interesting design? An interesting design is the one that has some particular properties. And if one can come up with something to test if the level has these properties or not, then it is optimization issue.

A lot of work is done in the area that is called evolutionary art – this deals with understanding of what is a good esthetically looking image. Of course there were several ups and downs in definition of fitness, but it is hard to say what a good painting is. How would you say if something is a good painting or not? If one is asked “do you like this painting or not” one can give answer but may not know if this is a good painting or WHY he likes this painting. It is a very subjective evaluator – a created bunch of new images is similar to those where the expert said “like”. The algorithm tries to develop a landscape of why an expert is enjoying this art. This is what Netflix does – it takes all movies that a person watched and gives a recommendation of what would be a next movie that person would want to watch. It has some sort of criteria that one is giving to it – these are good movies. Netflix does not necessarily use EA in their recommendation mechanisms, but one might use EA for something like that.

Evolutionary art
Creation of an image by an evolutionary algorithm.
Image credit: Mario Klingemann//Flickr

Circuit Design with EA

The biggest problem that received a breakthrough thanks to EA was the circuit design, in computer layer. At a circuit board it is forbidden to have any crossing wires because then another layer of silicone should be put atop. Additionally the distance between circuits should be minimized: If the distances on your board get larger, your computer gets slower. This comes out of the physical implementation the electron transmitting the signal from point B to point A is covering certain distance. Thus you want to have a chip size smaller. The smaller the distance, the faster your computer is. And GA right now holds the world record on chip design problem.

Every year there is an award ceremony – the awards are given for the works, where evolutionary-based solution has the same or better results as human-based solution. They have several criteria and several types of awards, you can see the full list here http://www.human-competitive.org/awards

Zombie Apocalypse Model with EA

The study of zombie epidemic models comes from pedagogical problems seen in research of diseases. Among the issues we have in teaching of such models: 1) students, their relatives, or friends might suffer from a disease which we want to explore, leading to them not being able to make objective action due to emotions, and 2) it may be politically/spiritually taboo to talk about certain diseases in a culture, such as Sexually Transmitted Infections, and 3) it can cause unnecessary fear by taking a simple model of a disease and applying it to a real outbreak; everyone has probably seen some headline saying “Disease X projected to infect large population” which didn’t come to pass. By using the zombie disease, the problem of framing is avoided while still having the ability to teach the skills necessary. You can see such an idea used in the 3 day planning materials created by the US CDC and Canada Health which used zombie outbreaks as an example of a disaster which homes could be prepared for rather than using a more likely flood, tornado, or terror attacks. Further, it introduces fun to the process of learning.

The model used in our research is a variation of the SIR model, SIR stands for Susceptible, Infected, Removed – the three categories a population of humans can be classified under. Susceptible are those who have not been infected. Infected are those who have the disease currently. Removed are those who have contracted the disease and are no longer infected or infectious, usually this is another category for dead. The model has discrete time steps in which the population of the Susceptible contracts the disease based upon the number of Infected and an infection rate and the Infected are Removed based on a removal rate. Various additions have been made to this model in the cases were someone can be reinfected, or they are in dormant periods of infection, etc. One such model is the SIZR model, which introduces a state known as “Zombie” which takes the place of the infectious role.

Computer scientist Matthew Bass on dynamic environment, multitenancy, and coordination between releases
In the research done by Professor Joseph Brown and colleagues they looked as how government actions, such as building a vaccination, handing out weapons to the population, or quarantine those possibly infected, could slow a zombie attack in a city. The government on each time-step can implement one policy action, as it has limited resources. Each policy action changes slightly the parameters of the simulation of the outbreak and there are multiple actions and multiple rounds. Then the strings of policies were evolved and the outbreak was simulated with the epidemic model. The findings were that depending on the zombie type the government should introduce a policy change. Slow zombies allow time for, so actions such as quarantines become effective tools to manage the outbreak. More aggressive zombies have only one solution – shooting them in the head, so one needs to supply more arms to the healthy populations. The models were able to show when this split between policies should occur and provide a tailor made approach to a particular type in a short period of time. Further, the period of dormancy of the zombie virus was examined, and it was shown that if the virus had a longer gestation period, so a longer time before one who was bit would turn into a zombie, it would be a more dangerous problem. It reduced the number of available healthy survivors who could help defend from the zombies, while not allowing the healthy survivors to kill them as zombies. About three days of dormancy caused the largest outbreaks – which happens to be the amount of dormancy in a number of media portrayals of zombie outbreaks. Though three days is just a coincidence as the media came out long before such modeling.

Master Theorem

One of the biggest questions of the field is the possibility of existence of the master theorem. The master theorem is how to use the correct rates – what is the proper population rate, what is mutation rate, crossover rate, etc. There is a lot of work for setting those values empirically. And if there is some sort of way how to know these values – this is something. Of course there is no free lunch – which gets really in the way of this. As with the exploration – there is no one good search method, because if one is searching in a better way, one is doing it on some set of problems but not on all of them.

Of course for EA there is no free lunch as well. In certain areas EA works much better, in other areas it has to do much worse. There is no search algorithm that will always do better than the other. To put it another way if you say “always go right” and the path distribution is 50/50, than half of the cases you are right, half – not.

Edited by Ksenia Vinogradova

Assistant Professor, Head of Artificial Intelligence in Game Development Lab, Institute of Information Systems, Innopolis University
Did you like it? Share it with your friends!
    Published items
    0557
    To be published soon
    +80

    Most viewed

  • 1
    Tim Reynolds
  • 2
    Simon White
  • 3
    Sophie Scott
  • 4
    Greg Towers
  • 5
    Tristram Wyatt
  • 6
    Joanna D. Haigh
  • 7
    Michael Thomas
  • 8
    Martin Kemp
  • 9
    David Rubinsztein
  • New