Artificial Neural Networks
How artificial neural networks are trained and used in industry
An evolutionary algorithm (EA) is an overarching title for the approaches in computer science that try to use the principles of Darwinian evolution in order to come up with solutions to problems. Darwin posited that organisms, specifically species, arise and develop via a process of natural selection of small inherited changes in traits. These natural selections increase the ability of the organism to survive, compete with other organisms for limited resources, and reproduce. It explains the variation and development over time of multiple species with finite populations interacting in a world with limited resources.
Evolutionary algorithms aim to mimic biological evolution in a digital analog of it. However, despite being called evolutionary algorithms, they are more similar to artificial breeding– where several generations of solutions are combined together in order to grow something that is better. One can also say that an evolutionary algorithm is a method to work on a problem that you do not know much about.
You are looking around the 1970-s and late 1960-s. There were four major flavors of evolutionary algorithms, which at that point were all doing the same sort of thing: take a problem, find a fitness function against it, breed the best solutions together, and come up with the next generation of problems. They were very much separate from each other, and they all did it in slightly different ways.
There was Lawrence Fogel, who did evolutionary programming; John Holland, who did genetic algorithms; John Koza, who did genetic programming; and Hans-Paul Schwefel, who did evolutionary strategies. Those four were the founders of these different branches. There was actually a big fight for the precedent of who came up with the evolutionary approach first, and the fight was not over until this overarching title of evolutionary algorithms was put on top of it, bringing it all together.
The base of the algorithm is rather similar, but the major differences were in how the problem was represented. Originally the genetic algorithm was only working on binary strings. Genetic programming was only working with tree representations. As the researchers started to work on the applications more intensively, they realized that they were all doing the same process and that this was just one big algorithm. The evolutionary algorithms summed up that originally was just an application.
The development was rather slow and continuous. There have been a few breakthroughs, mostly in a way when representations were realized to be way more important – how you represent your problem will give you better or worse results.
The way to put it is that EA is a framework for attacking a problem. Another buzzword in computer science is “Neural networks”. You can say that neural networks would be a representation that you could use, and EA would be an approach you can use to train the parameters of neural networks.
You start with some problem – it is usually an optimization problem. There is a thing to be optimized – for example, an aircraft wing with a bunch of positions across it. We first have to come up with what our structure is that represents the problem. The solution to this aircraft wing would be a strict set of values. We start with a random population of wings – a bunch of different wings. We also have a way of measuring the ability of a wing to do its job, its property. In our example, this property is a lift, which should be maximized. Then our fitness function is based on this value.
The next step is to go through the starting population and evaluate all wings based on a fitness function. After all, the wings have been scored; the winning piece comes in – one has to select the best wing based on predefined rules, which wing is good enough to go on to the next stage. The best wings will be selected, and the worst removed. For example, two wings are selected for breeding, and two random values in the population are also selected: the one who has better X will be parent #1, and who has better Y will be parent #2. After the selection is made, variation operators are applied.
There are two major types of variation operators. One is a binary sibling of two parents – you take a little bit of one wing, and you take a little bit of the other. This is known as a crossover. Crossover is generally exploratory. Just like in DNA, we have half of our traits come from our mother and half from our father. The crossover operator could be designed differently, not only 50-50%; it all depends on the representation aspects.
The second variation operator only applies to one of them and is called a mutation. Mutation allows making very small changes. You will go through a population, and a mutation operator will make a very small random change in the string of numbers. Although the change is random, it is based upon a distribution. The operator rolls a random number based on a Gaussian normal distribution curve and a small Gaussian curve, then the number is added to that value (e.g., the position on the wing). You are making very small adjustment changes to it.
After this whole new generation has been bred, you will go and do this again and again and again. Because the selection operator (which selects those ‘chromosomes’ which go on to breed based upon an evaluation of the fitness of the ‘chromosome’) is putting pressure to make fitness value higher, over time, the generations will have this value better and better.
The number of rounds depends on the type of problem. Normally there will be initially a rather high increase in fitness, and eventually, it will level out. It will be run until the fitness value stops changing. There is an asymptotic level at the top, and it is better to stop when the fitness value hits that asymptote.
Of course, the evolution algorithm does not always produce the same result. The algorithm should be run a few different times, in the end, producing not a single solution but multiple different solutions to select from.
There are a few general techniques that are used for defining the best fitness. The one is a tournament selection, and this is similar to a practicum tournament – there is a scoop of the population of which everyone has the highest fitness, and those are the ones that go on. Another known technique is a rule-out selection – here, you give a probability of being selected relative to the fitness of the creature. Let us say there are five wings, the first wing scores 3, and the second scores 7. The first will have three chances to go on, and the second 7 chances, respectively.
How big is enough to start the evolutionary algorithm? Of course, it depends on the problem. There are some that use smaller populations, and others, with a rather empirical set. Often it is a number around 50-100. If a population is not large enough, you end up with less diversity.
There are two competing factors; the one is exploration – you want to find different areas in space, and you want to make new and different aircraft wings. And there is exploitation – you do not want to give up the good wings that you already have, but you also want to search in the areas that you have not seen before. Exploitation is about making some small adjustments in order to be able to find something that is a little bit better. Large populations favor the ability to explore, whereas smaller populations would force it to be more exploitative.
There should be a way of describing what a good wing is in order to do the evaluation. One has to know what a good wing is, but one does not have to know how to construct a good wing. For example, one can say: “A good research article/piece of writing is the one that has no spelling mistakes”, and he can measure the number of spelling mistakes in a document. But he might not be a very good speller. Even if he knows what good spelling is, he does not know how to get to it. The property of being able to say– “one can define what a good thing is, but one does not know what the exact values are,” – is very important.
There is a lot of applications of EA in computer-generated art or in video games level design. One might not know how to write a good-looking level, but one can put certain rules and answer that. EA will meet the criteria, and it will do it in a way that one might not be able to do or might not think of.
The biggest problem is a definition of what is a good thing. This is the hardest part of a lot of things. How does one know what is good? This is partly based on the assumption that good things look like other good things – and for the majority of problems, this assumption holds.
“Survival of the fittest” is not that attributable to Darwin. Herbert Spencer defined this phrase. Fitness, biologically, is defined as the survival of one’s children and the ability to give birth to the next generation. It is a post-priori idea of what fitness is. Whereas the digital version of fitness is what your chances of breeding the next generation are based on a pre-priori. It is more a cartoon of biology, not exactly a Darwinian model, in order to have the same sort of element of selection. EA would be closer to how a farmer breeds a cow or makes a new plant compared to what happens in nature.
The other difference is that in EA, we are only looking at the evolution of one species; there is no interaction. In natural selection, there are predators killing offspring or all different kinds of systems.
EA is mostly used in optimization problems, and they are usually used in optimization problems where we do not have an idea of a closed-form solution, such as an equation computable by taking a derivative based on a known gradient; it might be a complex network or non-linear problem. They are all about going to the state that one does not quite understand and being able to get a good idea of a solution in that state.
These optimization problems are all across different fields. A lot of problems can be rewritten as optimization problems. The simplest example would be the level design – why is that an optimization problem? One has to create an interesting design. What is an interesting design? An interesting design is one that has some particular properties. And if one can come up with something to test if the level has these properties or not, then it is an optimization issue.
A lot of work is done in the area that is called evolutionary art – this deals with an understanding of what is a good esthetically looking image. Of course, there were several ups and downs in the definition of fitness, but it is hard to say what a good painting is. How would you say if something is a good painting or not? If one is asked, “Do you like this painting or not,” one can give an answer but may not know if this is a good painting or WHY he likes this painting. It is a very subjective evaluator – a created bunch of new images is similar to those where the expert said “like”. The algorithm tries to develop a landscape of why an expert is enjoying this art. This is what Netflix does – it takes all movies that a person watches and gives a recommendation of what would be the next movie that person would want to watch. It has some sort of criteria that one is giving to it – these are good movies. Netflix does not necessarily use EA in its recommendation mechanisms, but one might use EA for something like that.
Creation of an image by an evolutionary algorithm.
Image credit: Mario Klingemann//Flickr
The biggest problem that received a breakthrough thanks to EA was the circuit design in the computer layer. At a circuit board, it is forbidden to have any crossing wires because then another layer of silicone should be put atop. Additionally, the distance between circuits should be minimized: If the distances on your board get larger, your computer gets slower. This comes out of the physical implementation of the electron transmitting the signal from point B to point A, covering a certain distance. Thus you want to have a chip size smaller. The smaller the distance, the faster your computer is. And GA right now holds the world record on chip design problems.
Every year there is an award ceremony – the awards are given for the works where the evolutionary-based solution has the same or better results as the human-based solution. They have several criteria and several types of awards; you can see the full list here http://www.human-competitive.org/awards
The study of zombie epidemic models comes from pedagogical problems seen in the research on diseases. Among the issues we have in the teaching of such models: 1) students, their relatives, or friends might suffer from a disease that we want to explore, leading to them not being able to make objective action due to emotions, and 2) it may be politically/spiritually taboo to talk about certain diseases in a culture, such as Sexually Transmitted Infections, and 3) it can cause unnecessary fear by taking a simple model of a disease and applying it to a real outbreak; everyone has probably seen some headline saying “Disease X projected to infect large population” which didn’t come to pass. By using the zombie disease, the problem of framing is avoided while still having the ability to teach the skills necessary. You can see such an idea used in the 3-day planning materials created by the US CDC and Canada Health, which used zombie outbreaks as an example of a disaster that homes could be prepared for rather than using a more likely flood, tornado, or terror attack. Further, it introduces fun to the process of learning.
The model used in our research is a variation of the SIR model; SIR stands for Susceptible, Infected, and Removed – the three categories a population of humans can be classified. Susceptible are those who have not been infected. Infected are those who have the disease currently. Removed are those who have contracted the disease and are no longer infected or infectious; usually, this is another category for the dead. The model has discrete time steps in which the population of the Susceptible contracts the disease based upon the number of Infected and an infection rate, and the Infected are Removed based on a removal rate. Various additions have been made to this model in cases where someone can be reinfected, or they are in dormant periods of infection, etc. One such model is the SIZR model, which introduces a state known as “Zombie,” which takes the place of the infectious role.
In the research done by Professor Joseph Brown and colleagues, they looked at how government actions, such as building a vaccination, handing out weapons to the population, or quarantining those possibly infected, could slow a zombie attack in a city. The government on, each time step, can implement one policy action, as it has limited resources. Each policy action slightly changes the parameters of the simulation of the outbreak, and there are multiple actions and multiple rounds. Then the strings of policies were evolved, and the outbreak was simulated with the epidemic model. The findings were that depending on the zombie type, the government should introduce a policy change. Slow zombies allow time, so actions such as quarantines become effective tools for managing the outbreak. More aggressive zombies have only one solution – shooting them in the head, so one needs to supply more arms to healthy populations. The models were able to show when this split between policies should occur and provide a tailor-made approach to a particular type in a short period of time. Further, the period of dormancy of the zombie virus was examined, and it was shown that if the virus had a longer gestation period, so a long time before one who was bit would turn into a zombie, it would be a more dangerous problem. It reduced the number of available healthy survivors who could help defend themselves from the zombies while not allowing the healthy survivors to kill them as zombies. About three days of dormancy caused the largest outbreaks – which happens to be the amount of dormancy in a number of media portrayals of zombie outbreaks. However, three days are just a coincidence as the media came out long before such modeling.
One of the biggest questions in the field is the possibility of the existence of the master theorem. The master theorem is how to use the correct rates – what is the proper population rate, what is the mutation rate, crossover rate, etc.? There is a lot of work in setting those values empirically. And if there is some sort of way how to know these values – this is something. Of course, there is no free lunch – which gets really in the way of this. As with the exploration – there is no one good search method because if one is searching in a better way, one is doing it on some set of problems but not on all of them.
Of course, for EA, there is no free lunch as well. In certain areas, EA works much better; in other areas, it has to do much worse. There is no search algorithm that will always do better than the other. To put it another way, if you say “always go right” and the path distribution is 50/50, then in half of the cases, you are right, half–not.
Edited by Ksenia Vinogradova
How artificial neural networks are trained and used in industry
Harvard Professor Jeffrey Schnapp on print as a tremendously powerful medium, dynamic media forms, and the Wor...
Cloud Technology Researcher José Luis Vázquez-Poletti on data storage in computing clouds, creating new virtua...