Optimisation with a Genetic Algorithm

Specifying the GA keyword in the data file starts optimisation using a "Lamarckian" genetic algorithm [OakleyWJ11Energy] (based on the Birmingham Cluster Genetic Algorithm [Johnston03]) instead of basin-hopping Monte Carlo. The genetic algorithm requires a coords file to run, but currently this file is used only to set the number of atoms in the system; the initial structure is discarded. The output from the GA is added to the GMIN_out file. Running
grep GA GMIN_out
will return only the output from the GA operations. The file lowest contains the population of solutions from the final generation of the GA.

A random population of structures is generated at the start of each search and at the start of each new epoch. The GA will try to choose an appropriate method to generate the initial structures, but this can be overridden with the GAINITCHAIN or GAINITSPHERE keywords.

By default, tournament selection is used to choose the parent structures for crossover, but this can be replaced by roulette-wheel selection using the GASELROUL keyword. The GA will select an appropriate crossover method for the system being studied. Currently, the default settings are one-point crossover of the backbone dihedral angles in proteins and two-point cut-and-splice crossover for clusters. This is an elitist genetic algorithm: the parent structures in each generation are the fittest of the parents, offspring and mutants from the previous generation. Therefore the mean energy of the population should always decrease or remain unchanged after each generation.
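
The selection, crossover and elitism steps described above can be illustrated with a minimal Python sketch. It is not the GMIN implementation. The energy function is a toy stand-in, no local minimisation is performed, and the function names, the tournament size k, the linear fitness map used for roulette-wheel selection and the mutation step are all assumptions made for illustration.

```python
import random

def energy(angles):
    # Toy stand-in for a real potential (assumption): minimum at all angles = 0.
    return sum(a * a for a in angles)

def tournament_select(population, energies, k=3, rng=random):
    """Draw k distinct candidates at random and return the lowest-energy one."""
    contenders = rng.sample(range(len(population)), k)
    best = min(contenders, key=lambda i: energies[i])
    return population[best]

def roulette_select(population, energies, rng=random):
    """Fitness-proportional selection: lower energy gives a larger weight."""
    e_min, e_max = min(energies), max(energies)
    span = (e_max - e_min) or 1.0
    # Assumed linear fitness map: best structure -> 1.0, worst -> 0.01.
    weights = [1.0 - 0.99 * (e - e_min) / span for e in energies]
    return rng.choices(population, weights=weights, k=1)[0]

def one_point_crossover(parent_a, parent_b, rng=random):
    """Join the head of parent_a to the tail of parent_b at a random cut."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:]

def elitist_survivors(candidates, energy_fn, size):
    """Keep the `size` lowest-energy structures from the combined pool."""
    return sorted(candidates, key=energy_fn)[:size]

def run_toy_ga(generations=20, pop_size=8, n_angles=6, seed=0, use_roulette=False):
    """Run the toy GA and return the mean energy at the start of each generation."""
    rng = random.Random(seed)
    population = [[rng.uniform(-3.14, 3.14) for _ in range(n_angles)]
                  for _ in range(pop_size)]
    select = roulette_select if use_roulette else tournament_select
    mean_energies = []
    for _ in range(generations):
        energies = [energy(p) for p in population]
        mean_energies.append(sum(energies) / pop_size)
        offspring = []
        for _ in range(pop_size):
            a = select(population, energies, rng=rng)
            b = select(population, energies, rng=rng)
            offspring.append(one_point_crossover(a, b, rng=rng))
        # Hypothetical mutation: small Gaussian perturbation of a random parent.
        mutants = [[x + rng.gauss(0.0, 0.1) for x in rng.choice(population)]
                   for _ in range(2)]
        # Elitism: parents compete with offspring and mutants for survival.
        population = elitist_survivors(population + offspring + mutants,
                                       energy, pop_size)
    return mean_energies

if __name__ == "__main__":
    history = run_toy_ga()
    print("mean energy, first and last generation:", history[0], history[-1])
```

Because the surviving population is always drawn from a pool that includes the parents, the list returned by run_toy_ga never increases from one generation to the next, which is the behaviour stated above for the mean energy.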

The duplicate predator (GADUPPRED) and epoch (GAEPOCH) operators are designed to deal with premature convergence of the GA. The duplicate predator removes any duplicate structures from the population and replaces them with the next best structure from the current generation. The epoch operator replaces the entire population with a set of new random structures if the population stagnates.
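
The two operators can be sketched in Python as follows. This is an illustration under stated assumptions rather than the GMIN code: the duplicate test (energies closer than a tolerance tol), the stagnation test (mean energy improving by less than tol over patience generations) and all function and parameter names are hypothetical.

```python
def duplicate_predator(candidates, energy_fn, size, tol=1e-6):
    """Fill the population with the best distinct candidates.

    Assumption: two structures count as duplicates when their energies
    differ by less than `tol`. A skipped duplicate is replaced by the
    next best structure in the sorted pool.
    """
    survivors = []
    for cand in sorted(candidates, key=energy_fn):
        if survivors and abs(energy_fn(cand) - energy_fn(survivors[-1])) < tol:
            continue  # duplicate of the structure kept just before it
        survivors.append(cand)
        if len(survivors) == size:
            break
    return survivors

def epoch_if_stagnant(population, mean_history, new_random, patience=3, tol=1e-6):
    """Start a new epoch when the population has stagnated.

    Assumption: stagnation means the mean energy improved by less than
    `tol` over the last `patience` generations. Returns the (possibly
    new) population and a flag saying whether it was replaced.
    """
    stagnant = (len(mean_history) > patience
                and mean_history[-patience - 1] - mean_history[-1] < tol)
    if stagnant:
        return [new_random() for _ in population], True
    return population, False

if __name__ == "__main__":
    pool = [1.0, 1.0, 2.0, 3.0, 2.0]
    print(duplicate_predator(pool, abs, size=3))
    print(epoch_if_stagnant([1.0, 2.0], [5.0, 5.0, 5.0, 5.0], lambda: 0.0))
```

In this sketch the pool passed to duplicate_predator stands for the combined parents, offspring and mutants of the current generation, so a removed duplicate is replaced by the next structure in energy order.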

The GA keyword has been tested for optimisation of BLN model proteins, single-component clusters and binary clusters.

David Wales 2017-09-21