Text10ING
Argomenti trattati: algoritmi genetici, reti neurali
"NEUROCOMPUTING 6: GENETIC ALGORITHMS"
di Simon D. Levy
Testo tratto dalla rivista "Extropy", #9, vol.4, n.1, Summer 1992, Los Angeles, CA, USA
Imagine the following scenario: You are a traveling salesman who must make a tour of a large number of cities that are interconnected by a network of roads. You need visit each city only once. To save yourself time and gas money, you want to take the most efficient route possible, spending as little time as you can on the road. What kind of decision procedure can you use to come up with the best travel plan?
This scenario, commonly called the "traveling salesman problem," is one of a class of mathematical puzzles known as optimization problems. For the traveling salesman problem, as for many other optimization problems, there is no general algorithm that will guarantee you the best solution in a reasonable amount of time. Various methods have been explored, with different degrees of success. For example, you might simply measure every possible route through all the cities, and then pick the one that covered the least distance. Of course, this "brute force" solution would become impracticable for anything more than a handful of cities, because the number of possible routes increases very rapidly as you add more cities. Parallelizing the problem by getting a bunch of computers to try one route each, and then comparing the results, is one way to improve the usefulness of this approach.
A second, more common approach involves gradient descent along an "error surface" generated by a particular solution and its neighbors. The idea is to take the error of the current solution and compare it to the error of nearby (i.e., similar) solutions, moving to the neighbor with the lowest error. Gradient descent is a fundamental technique in neural network algorithms, especially back propagation. See my Neurocomputing 3 article in Extropy 6, and references therein, for more infommation.
Now consider how Nature handles such problems. You can viewthe traveling salesman's task as a kind of ecological niche, so that a good solution is an organism that can perform the task whhout dying of exhaustion or losing out to more successful competitors. Unlike most computer algorithms, though, the solutions developed by Nature are not the result of deliberate planning by a conscious agent. Instead, Nature relies on random mutation and natural selection, as described originally by Charles Darwin in The Origin of Species, and more recently by Richard Dawkins in The Blind Watchmaker (reviewed in this issue). Good solutions arise by mutation - accidental, minor changes in genes - and are favoredby the environment, allowing the organism bearing the mutant genes to survive, reproduce, and pass these genes onto its offspring. Sexual reproduction, in which an offspring gets half its genes from one organism and half from another, ensures that good solutions will get a chance to combine, producing offspring that may be better than either parent.
Genetic algorithms, first developed by John Holland at the University of Michigan, exploit this cycle of mutation, sex, and natural selection in an attempt to arrive at solutions to optimization problems. The general procedure is delight fully simple and can be described by the following steps:
(1) Start with a set of possible solutions to your problem. If you have no idea of how a good solution would look, just generate these first solutions randomly.
(2) Take each possible solution, apply it to the problem, and examine the results. If you're satisfied with some solution(s), quit here. Otherwise, go on to step (3).
(3) Based on some previously established criterion, reject solutions that fall below a certain level of success at solving the problem.
(4) Create new solutions by splicing together a parts of one successful solution with parts of another.
(5) Every now and then, "mutate" (randomize) some part of some solution.
(6) Go to (2).
As usual, I think it's a good idea to illustrate this procedure with an example. I don't have the paper (or the patience) necessary to fiddle with the traveling salesman problem, so I'll switch to something a little less complicated, namely, a neural network implementation of the Boolean XOR function, an example that l also used for Neurocomputing 3. This function takes two inputs, each of which may be zero or one, and outputs a one if the inputs are different. If the inputs are the same, the output is zero. In other words...
Input 1 0011
Input 2 0101
Output 0110
Now, an interesting problem is howto train a neural network to "become" this function. That is, given the following diagram, where l's are inputs, W's are multiplicative connection weights, sigmas are summations, and thetas are thresholds, we wish to obtain W's and thetas such that the input/output relations of the function are all satisfied.
As readers of Extropy 6 will recall, one scheme for getting succcessful network parameters is back-propagation, whereby the network's actual output is compared with the desired output, and the difference between actual and desired output is propagated back through the network, to adjust the weights and thetas.
Of course, our present concern is genetic algorithms, so the question arises as to whether we can "breed'' a network cell in an XORganism - to do the XOR problem. To examine this qu estion, I wrote a little C program that worked as follows. (As usual, this program is available to Extropy readers at no cost, by sending a letter to me care of the magazine.) Each organism was represented, naturally, by five weights and two thetas. These were the seven "genes" of the organism. "Mating" consisted of producing a new organism with half the genes of one parent and half the genes of another. Since organisms had seven genes each, the program flipped a coin to determine which parent the seventh gene came from. The program started with a collection of 100 organisms with random genes. For each generation, the program computed each organism's score on the XOR problem, and killed off any organism that got fewer that three out of four correct on the problem. Then, the program bred the resulting organisms according to the mating scheme just described. In addition, the program mutated (randomized) a randomly picked gene in some randomly picked organisms every few generations. Parameters of the program were (1 ) the number of matings per generation, (2) the minimum score necessary for survival, (3) the number of generations between mutations, and (4) the number of organisms to mutate in a mutating generation.
I didn't have time to explore possibilities that would result from fiddling with all the parameters, but some things about the XORganisms seemed pretty clear after several runs of 100 generations or so. First, the number of completely successful organisms (score = 4/4) was always much smaller than the number of organisms that got a score of 3/4. This result is what we expect, because the niche we defined was a score of 3/4, meaning that 4/4 scorers was over-achievers. Second, the emergence of these over-achieving "uber-organisms" seemed to depend a lot on initial conditions. If the initial random creation of 100 organisms produced one with a perfect score, that organism tended to reproduce itself, making lots of other over-achievers, but if no over-achiever existed from the start, chances were fairly good that none would emerge. Finally, adding mutations seemed to have a beneficial effect on the number of over-achievers produced. For example, I compared 10 runs of 100 generations, all with a single organism being mutated every generation, against 10 runs of 100 generations with no mutations at all. The average number of uber-organisms produced with mutations was around 32, whereas the number produced with no mutations was about 9.
Using genetic (or neural net) algorithms to model Boolean functions maybe overkill, like earning a Ph.D. in mathematics so that you can balance your checkbook. Nevertheless, the XORganism example is instructive because it shows how a simple neural net problem can be solved by the technique of genetic algorithms, thereby showing one way in which genetic algorithms fit into the rubric of neurocomputing. Genetic algorithms have also caught on for "real-world" applications. One usage, not surprisingly, is the solution of optimization problems similar to the traveling salesman problem described above. These include the design of aircraft and VLSI chips. Other applications include an image-recognizing program that looks at many sub-programs, linking together those that run best, and a program for studying the behavior of ants through simulation.
In the next installation of this column, I'll discuss a topic that seems to interest many members of the Extropians mailing list, namely, the idea of modeling economies computationally, or modeling computers economically.
SOURCES
Books
A.K. Dewdney. The Armchair Universe. NewYork: W.H. Freeman. 1988 (See especially the chapter entitled "The Evolution of Flibs")
J. Holland. Adaptation in Natural and Artificial Systems. Ann Arbor: University of Michigan Press. 1975.
Articles
M. Antonoff. "Genetic Algorithms: Software by Natural Selection." Popular Science, October 1991.
R. Morin. "A Look at Genetic Algorithms." SunExpert Magazine, November 1990.
P. Wayner. "Genetic Algorithms." BYTE, January 1991.