Typically the first program you write when you are learning a programming language is, "Hello World!". Due to the popularity of my prior post on optimizing the traveling salesman problem with genetic algorithms I decided to write a general parallelized Genetic Algorithm Framework for LabVIEW, naturally the first thing I taught it to do was to say, "Hello World!".
I have been interested in Machine Learning for a while now, we interact with it all the time in our day to day lives and those interactions will only grow in frequency with time. Genetic Algorithms (GA) are a subset of Machine Learning that I have found to be pretty interesting ever since I heard about how NASA used them to design an antenna. In my previous post I played around with using a GA to find a solution to the traveling salesman problem and it triggered a lot of activity on the blog, so I figured I would write a generic framework that could be applied to other problems easily. (Writing the framework wasn't entirely altruistic, I am also going to be using it for my NI Week 2016 presentation I am doing with Alan C. Smith).
It turns out that Douglas Adams was right when he wrote that the Earth was created as a super computer to answer the "Ultimate Question of Life, the Universe and Everything". Earth is a computer, and the program it is running is evolution. Evolution has given us unique and unexpected results to the problem of survival.
These unique evolutionary results in turn provide their own unique solutions...
Genetic Algorithms are designed to model this evolution of species as observed on earth. Because of this symmetry the main components of GAs are Genes, Chromosomes, and Populations, and the main functions are Mutation, Mating, and Evolution.
My Genetic Algorithm Framework (GAF) follows the paradigm. There is a Population class that handles a population of Chromosome objects, which in turn are made up of Gene objects. The Population class handles the algorithm itself, in so far as it dictates when mutations occur and when chromosomes mate to produce new chromosomes (this is called crossover). The Chromosome class handles the specific details regarding how mutation, mating, fitting, and construction work. The Gene class is the representation of the data being acted on. In the Traveling Salesman problem a Gene would represent a city, in the Hello World problem a Gene would represent an ASCII character.
Both the Population class and the Chromosome class are Actor Framework Actors, and communication of data between the Population, Chromosome classes and the UI is done using the Simple Mediator Framework I put together. These design decisions make the code a little more complex than it had to be but the cost of complexity is paid for by the benefits it provides. These benefits include the ability to run my chromosomes as Actors in parallel, the ability to run many populations as Actors in parallel, the separation of the UI from the Genetic Algorithm Framework using the mediator pattern, and the fact that this design better fits my needs for my NI Week presentation (kind of a cop out, but I told you I wasn't being entirely altruistic).
To use the framework you will need to create children of the Chromosome and Gene classes. The Chromosome class has four methods that must be overriden; "Calculate Fitness.vi", "Generate Random.vi", "Mate.vi", and "Mutate.vi". You will need to write code within each of these VIs that fits your problem domain.
Lets take a look at the Chromosome and Gene children I created for the Hello World! example. In the example the child of the Gene class is named, "Character" and its class data is a string.
The Character class has two methods named, "Compare Chars.vi" and "Construct Gene.vi".
The Construct Gene method will generate a random ASCII character between the values of 32 and 90. This defines the subset of genes that my chromosome can have. The Construct Gene method is called whenever I need to generate a random character that will be inserted into my String class.
The Compare Chars method takes in two Genes, which are ASCII characters in this example and compares them, assigning a fitness score to the comparison. If the characters are the same then the fitness score is zero, the score gets proportionally higher the further from a match the two characters are. With this implementation a lower fitness is a better match. Note that this fitness is just for the gene comparison, not for the chromosome as a whole.
The child of the Chromosome class is named String and it has four must override methods and one public method I created as a way to get the solution string into the algorithm. The class data for the String class holds an array of Character objects.
The Calculate Fitness.vi evaluates the chromosome's fitness, this evaluation is problem domain specific but the Genetic Algorithm Framework expects that the fitness of the chromosome increases the closer the fitness score gets to zero. In the Hello World example the fitness is calculated by comparing each Gene from the Chromosome to its corresponding Gene in the solution. The fitness returns as zero if the two genes match, and is otherwise weighted by proximity on the ASCII table. This method must call its parent method, and it must call it after the child's code has executed.
The Generate Random.vi will generate an array of random Genes for the Chromosome. This method is only used for the initial creation of the population. This VI must call its parent implementation, and the call must happen after the code in the child implementation executes.
The Mate.vi takes in two Chromosomes as inputs handles combining their Genes to create some number of children. The Hello World example calculates a random pivot point at which to slice the gene arrays and then recombines the first half of Gene one with the second half of Gene two, and the first half of Gene two with the second half of Gene one. The two new children are then passed out as two new Chromosomes to be launched as new Actors, while the original parent Chromosomes will eventually be put out to pasture.
The Mutate.vi defines how the Chromosome mutates in the event that random mutation is triggered. The mutation typically consists of the injection of a random Gene into a random location within the Chromosome, which is precisely what the Hello World example does.
The Chromosome class also has a static dispatch public method named Load Target. The purpose of this method within the context of the Hello World example is to get the solution string, "Hello World!" (or whatever else the user wants to optimize towards) into the Chromosome object. You may or may not need to have a method that serves this function, it all depends on the context of the problem you are trying to optimize.
The last piece of the puzzle is the top level VI. This VI has two staging areas, one for the setup and launching of the Population Actor, and the other for the registration for data and events for the UI using the Mediator Pattern. Notice that you didn't see any mention of the Mediator Pattern in the children of the Gene and Chromosome classes, its use within the framework is already coded up. You can, however, access it from within your child of the Chromosome class if you need to for custom UI design.
Zooming in on the staging area for the Population Actor you can see that this where the parameters fort he Genetic Algorithm are entered; Population Size, Crossover Rate, Elitism Rate, Mutation Rate, Tournament Size, and the total number of Generations. This is also where you will inject your child of the Chromosome class, here you can see the String object being wired into the "Chromosome Type" property. To run the framework against your custom solution the only thing you have to do is inject your child of the Chromosome class here.
Population Size: The Population Size is the number of chromosomes that you want to exist in your population for each evolutionary iteration. The framework will keep this many Chromosome Actors in memory throughout the duration of the run. The largest the number gets the slower the genetic algorithm runs.
Crossover Rate: The Crossover Rate is a number between 0 and 1 that defines the likelihood that two chromosomes will be selected to mate each evolutionary iteration. Setting this value to 1 guarantees mating each iteration.
Elitism Rate: The Elitism Rate is a number between 0 and 1 that defines what percentage of the population will persist between evolutionary iterations. This idea is kind of weird, its like saying that a parent's genes are so good that they become immortal and keep mating with each new generation, at least until the new generation catches up on gene quality at which point the immortal parent summarily done away with. A non-zero value here means that the best x percent of the current generation gets copied into the next generation.
Mutation Rate: The mutation rate is a number between 0 and 1 that defines the likelihood of a random gene mutation being injected into a chromosome. This is a tough number to get right; if it's too large then it becomes hard to bring a solution into focus, but if it is too small it becomes easy for the evolution to get trapped in a local minima.
Tournament Size: The Tournament Size is a number between 0 and the Population Size. When a chromosome is randomly selected for mating a tournament is held between n chromosomes, where n = Tournament Size. During the Tournament n random chromosomes are selected and fittest of the group is selected for mating. The Tournament is run twice to select two mates, and then they mate.
Generations: The Generations input is any number greater than zero and it dictates how many evolutionary iterations (or generations) the genetic algorithm should run for.
Zooming in on the Message Broker staging area we see that we are registering for three pieces of information; Fittest Score, Fittest Chromosome, and Generations. As the Genetic Algorithm runs the Population will broadcast these three data points, by registering for them here we can use them to update our user interface. If you add broadcast points to your Chromosome class you will need to subscribe to them here in order to pass them to the UI.
That is a quick introduction to the Genetic Algorithm Framework, you can see a video of it running above. You can download the VI Package for the framework on the Tools page, right now I am only supporting LabVIEW 2015 and higher, but not for any good reason. As a note, this package depends on the Simple Mediator Framework, which can also be downloaded on the Tools page.
If you want to see a heartier version of the framework be sure to attend my NI Week 2016 presentation, it is titled, "Unfunded Liabilities: How Poor Package Design Drives Technical Debt and What You Can Do About It" and is scheduled for Monday August 1st 2016 at 9:45 AM (which is Alliance Day).
Jon McBee is a Principal Software Engineer at Cambridge NanoTech and is a Certified LabVIEW Architect, Certified LabVIEW Embedded Developer, Certified TestStand Developer, an NI Certified Professional Instructor, a LabVIEW Champion, and a Certified ScrumMaster