;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; lgp2.lisp ;; a Linear Genetic Programming system ;; c) Lee Spector, 1999 ;; lspector@hampshire.edu ;; ;; version 1.19991206 (n.yyyymmdd) #| This is a linear-program-based steady-state genetic programming system in Common Lisp. It was written with several goals in mind, among them: - to support work on the evolution of quantum algorithms - to support introductory instruction in genetic algorithms/programming Features: - small, simple, and clear (I hope) - linear programs - steady state, after a fashion - lexicographic fitness - demes - easy to add a stack for stack-based genetic programming - easy to add "local search" operators - easy to enhance immigration/emigration for clustered computing To use for your own application: - load random.cl - load this file (lgp2.lisp) - define any application-specific variables and utilities - define an application-specific fitness function - provide an appropriate value for *gp-params* - evaluate (evolve) to run the system Miscellaneous notes: - Individuals are represented as lists of: (fitness-list program) - Fitness lists are lists of numbers, with smaller always indicating "better." Fitness list F1 is "better" than fitness list F2 if its first number is lower than that of F2. If the first numbers are equivalent then the comparison moves on to the second elements, etc. The number of elements in fitness lists should be the same throughout an application. - Selection-tournament considers only pre-computed fitness values, with un-evaluated individuals always winning. - For replacement tournaments, un-evaluated individuals are handled randomly. - Program size limits are enforced after the execution of the genetic operators, so if if you write a new genetic operator that might increase the length of a program then it should not presume to provide a fitness value (it'll be computed later in replacement-tournament). - This uses Chris McConnell's random.cl random number generator to allow for replication of experiments based on integer seed values. To ensure replicability do not use the built-in Common Lisp RANDOM function -- use only RANDINT (defined below) or other functions from random.cl. Random.cl can be obtained from the CMU Lisp Repository, at http://www.mit.edu/afs/cs.cmu.edu/project/ ai-repository/ai/lang/lisp/code/math/random/ Version History - lgp2, Aug16 1999, added: - operator percentages - demes - immigration/emigration for demes - standardized REPLACEMENT-TOURNAMENT, which now performs a real tournament and replaces the worst in the tournament group. Note that the new regime for REPLACEMENT-TOURNAMENT is not friendly to the interactive fitness evaluation applications, e.g. for evolving drawings, that had been done with the earlier version of lgp. - comments above made consistent with changes - new parameters: - *demes*: the number of populations - *replacement-tournament-size*: the size of the replacement tournament - *fitness-function*: the name of the problem-specific fitness function - new global variables: - *current-deme* - *migrants* - changed parameters: - *genetic-operators* is now a list of (fraction op) pairs; fractions should sum to 1.0 - deleted parameters: - *Percent-Losers-Win* - Sept23 1999, made RANDOM-GENETIC-OPERATOR robust for sets of op fractions that sum to less than one (which could happen via roundoff errors). - Dec06 1999, added BEST-SO-FAR genetic operator. Use this with caution since it circumventes the deme boundaries. |# ;; compiler optimization settings ; for debugging ; (eval-when (compile) ; (declaim (optimize (speed 2) (safety 1) (space 1) (debug 3)))) ; for maximum reasonably safe speed (eval-when (compile) (declaim (optimize (speed 3) (safety 1) (space 0) (debug 0)))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; globals and parameters (defvar *instruction-pointer* 0 "The index of the currently executing instruction. Can be manipulated to implement control structures.") (defvar *populations* nil "A 2-dimensional array of size *demes* X *population-size* containing all individuals in the current populations.") (defvar *gp-params* nil "A list of (param value) which is used to configure the genetic programming system at the start of a run. See the demo at the bottom of lgp.lisp for an example of the use of *gp-params*.") (defparameter *lgp-output-path* "lgp-output" "The path (full directory/filename specification) for lgp output. Lgp sends output both to standard output (usually the terminal/listener) and to this file. Old versions of the output file are *not* deleted -- new output is appended to the contents of the old file. Use a value of nil if you don't want any file output.") (defvar *instruction-generators* nil "A list of forms which are EVALd to produce instruction/arg lists (program elements). List an item multiple times if you want it to have a higher probability of being used. This parameter and its value should be included in *gp-params*.") (defvar *genetic-operators* nil "A list of pairs of the form (fraction operator) where the fractions are numbers between 0 and 1 and the operators are genetic operator functions, each of which should return a new individual for possible insertion in the population. The fractions should sum to 1.0. This parameter and its value should be included in *gp-params*.") (defvar *population-size* 0 "The number of individuals in each population. This parameter and its value should be included in *gp-params*.") (defvar *demes* 1 "The number of populations. This parameter and its value should be included in *gp-params*.") (defvar *current-deme* 0 "The index of the deme being acted upon by the current genetic operator.") (defvar *migrants* nil "The pool of emigres who have not yet immigrated into their new demes.") (defvar *selection-tournament-size* 1 "The number of individuals that are considered when selecting an individual from the population (with random-individual) in genetic operators. This parameter and its value should be included in *gp-params*.") (defvar *replacement-tournament-size* 1 "The number of individuals that are considered when inserting an individual, produced by a genetic operator, back into the population. The new individual replaces the weakest element of the tournament group. This parameter and its value should be included in *gp-params*.") (defvar *best-individual* nil "The best individual seen so far in the current run. This parameter and its value should be included in *gp-params*. You probably want to use some arbitrary terrible individual as the initial value. Remember that individuals are lists of (fitness-list program) and that the fitness-list part of this is itself a list of fitness components (numbers), with smaller numbers being better.") (defvar *fitness-function* nil "The name of the problem-specific fitness function.") (defvar *halting-fitness* nil "A fitness list that is considered good enough to stop the genetic programming system run. This parameter and its value should be included in *gp-params*.") (defvar *report-every* 1 "A positive integer that indicates how frequently the lgp system should print a progress report. Set to the number of cycles (operator applications) that you want between reports. Set to 0 if you don't want any reports before termination. This parameter and its value should be included in *gp-params*.") (defvar *max-initial-program-length* 1 "The maximum length of a newly-generated program. This parameter and its value should be included in *gp-params*.") (defvar *max-program-length* 1 "The maximum length of any lgp-evolved program. This parameter and its value should be included in *gp-params*.") (defvar *random-seeds* (list 0 0) "A list of two integers, the first of which must be 0 <= n <= 31328, and the second of which must be 0 <= n <= 30081. These are used to seed the random number generator. This parameter and its value should be included in *gp-params*.") (defvar *initialization-forms* nil "A list of forms that will be evaluated prior to the start of a genetic programming run. Use this to set up additional globals for fitness evaluation functions, etc. This parameter and its value should be included in *gp-params*.") (defvar *lexicographic-fitness-epsilon* 0 "The 'margin of equivalence' for fitness component comparisons. If two fitness components differ by less than this value then they will be considered equivalent. This parameter and its value should be included in *gp-params*.") ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; utilities (defun randint (n) "Calls the random-integer function from random.cl to produce a random integer between 0 (inclusive) and n (exclusive)." (random::random-integer n)) (defun lgp-output (&rest format-args) "Interprets its arguments as FORMAT does (without a stream argument) and sends the output both to standard output (e.g. the listener) and, if *lgp-output-path* is non-nil, appends it to the file at that path." (apply #'format (cons t format-args)) (when *lgp-output-path* (with-open-file (out *lgp-output-path* :direction :output :if-exists :append :if-does-not-exist :create) (apply #'format (cons out format-args))))) (defun random-element (list) "Returns a random element of the given list." (nth (randint (length list)) list)) (defun initial-segment (list) "Returns a random initial segment of the given list." (butlast list (randint (1+ (length list))))) (defun tail-segment (list) "Returns a random tail segment of the given list." (nthcdr (randint (1+ (length list))) list)) (defun middle-segment (list) "Returns a random middle segment of the given list." (initial-segment (tail-segment list))) (defun report-best (cycle &optional (no-matter-what nil)) "Reports on the current best individual in the lgp run. Uses the provided cycle number and *report-every* to determine if reporting is really warranted, and reports only if these numbers indicate that a report should be made OR if the optional no-matter-what parameter is true. The best individual is printed in a format that allows for easy re-setting of *best-individual*." (labels ((report () (lgp-output "~%;; -*- Report at cycle ~A -*-" cycle) (lgp-output "~%(setq *best-individual* '~A)" *best-individual*))) (if no-matter-what (report) (unless (zerop *report-every*) (when (zerop (mod cycle *report-every*)) (report)))))) (defun set-gp-params () "Sets the lgp parameters as indicated in *gp-params*." (dolist (pair *gp-params*) (set (first pair) (second pair)))) (defun print-gp-params () "Prints a report of the current values of lgp parameters. The parameters are printed in a format that allows for easy re-setting of *gp-params*." (lgp-output "~%(setq *gp-params* '~A)~%" *gp-params*)) (defun random-individual () "Performs a selection tournament on randomly selected individuals and returns the winner. :no-computed-fitness beats any computed fitness value." (let ((candidates nil)) (dotimes (i *selection-tournament-size*) (push (aref *populations* *current-deme* (randint *population-size*)) candidates)) (let ((unevaluated (find :no-computed-fitness candidates :key #'first))) (if unevaluated unevaluated (let ((sorted-candidates (sort candidates #'lexicographically-better-fitness :key #'first))) (first sorted-candidates)))))) (defun compute-fitness-if-necessary (individual) (if (eq (first individual) :no-computed-fitness) (funcall *fitness-function* (second individual)) (first individual))) (defun replacement-tournament (new-individual) "Inserts the new individual in the population in place of the weakest member of the replacement tournament group. Unevaluated individuals are sorted randomly. Also sets *best-individual*." (setq new-individual (list (compute-fitness-if-necessary new-individual) (second new-individual))) ;; maintain *best-individual* (when (lexicographically-better-fitness (first new-individual) (first *best-individual*)) (setq *best-individual* new-individual)) ;; pick the tournament set (let ((tourn-indices nil)) (dotimes (i *replacement-tournament-size*) (push (randint *population-size*) tourn-indices)) ;; sort the indices by fitness of the indexed individuals, worst first (setq tourn-indices (sort tourn-indices #'(lambda (i1 i2) (let ((f1 (first (aref *populations* *current-deme* i1))) (f2 (first (aref *populations* *current-deme* i2)))) (if (or (eq f1 :no-computed-fitness) (eq f2 :no-computed-fitness)) (random-element '(t nil)) (lexicographically-better-fitness f2 f1)))))) ;; install the new individual in place of the worst (setf (aref *populations* *current-deme* (first tourn-indices)) new-individual))) (defun random-genetic-operator () "Retunrs a random genetic operator consistent with the fractions listed in *genetic-operators*." (let ((number (random::RANDOM-FLOAT 1.0)) (accumulator 0)) (do ((ops *genetic-operators* (let ((the-cdr (cdr ops))) (if (null the-cdr) ops the-cdr)))) ((progn (incf accumulator (first (first ops))) (>= accumulator number)) (second (first ops)))))) (defun lexicographically-better-fitness (fitness-list1 fitness-list2) "Returns t if fitness-list1 is better than fitness-list2 with the fitness components considered lexicographically. Assumes all fitness-lists are the same length." (cond ((null fitness-list1) nil) ((<= (abs (- (first fitness-list1) (first fitness-list2))) *lexicographic-fitness-epsilon*) (lexicographically-better-fitness (rest fitness-list1) (rest fitness-list2))) ((< (first fitness-list1) (first fitness-list2)) t) (t nil))) (defun noop () "Does nothing. Used for substitutions in MINIMIZATION genetic operators." nil) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; top level and associated functions (defun evolve () "Top-level function for lgp2 -- call it to start an lgp2 run." ;; set and print the paramters as specified in *gp-params* (set-gp-params) (print-gp-params) ;; execute any initialization forms and seed the random number generator (dolist (f *initialization-forms*) (eval f)) (random::seed-state (first *random-seeds*) (second *random-seeds*)) ;; generate the initial populations (generate-initial-populations) (setq *migrants* nil) ;; begin the operator-application loop (do ((cycle 0 (+ cycle 1))) ((not (lexicographically-better-fitness ;; check for halting condition *halting-fitness* (first *best-individual*))) (report-best cycle t) ;; always report before halting *Best-Individual*) ;; return the best individual ;; set the current deme (randomly) (setq *current-deme* (randint *demes*)) ;; apply an operator (let ((new-individual (keep-within-size-limit (funcall (random-genetic-operator))))) ;; possibly insert the new individual into the population (when new-individual (replacement-tournament new-individual)) (report-best cycle)))) ;; report if it's time to do so (defun execute-program (program) "Executes the given lgp program. Note that arguments are not evaluated prior to calling the functions in the program." (let* ((num-instructions (length program)) (program-array (make-array num-instructions :initial-contents program))) ;; we use an explicit instruction pointer so that control structures ;; can be implemented later (setq *instruction-pointer* 0) (loop (when (>= *instruction-pointer* num-instructions) (return)) ;; ternimate when instruction pointer out of range (let ((instruction (aref program-array *instruction-pointer*))) (apply (first instruction) (rest instruction))) (incf *instruction-pointer*)))) (defun generate-random-program () "Creates and returns a random lgp program according to *max-initial-program-length* and *max-initial-program-length*." (let ((program nil)) (dotimes (which-instruction (randint *max-initial-program-length*)) (push (eval (random-element *instruction-generators*)) program)) program)) (defun generate-initial-populations () "Initializes the lgp populations and fills them with new random individuals." (setq *populations* (make-array (list *demes* *population-size*))) (dotimes (which-deme *demes*) (dotimes (which-individual *population-size*) (setf (aref *populations* which-deme which-individual) (list :no-computed-fitness (generate-random-program)))))) (defun keep-within-size-limit (individual) "Returns a version individual that obeys the *max-program-length*, obtained via truncation if necessary." (when individual (let* ((program (second individual)) (program-length (length program))) (if (> program-length *max-program-length*) (list :no-computed-fitness (butlast program (- program-length *max-program-length*))) individual)))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; genetic operators (defun reproduction () "Returns a random individual from the population, produced via selection tournament." (random-individual)) (defun crossover () "Returns a new individual constructed from a random initial segment of one existing individual appended to a random tail segment from a possibly different existing individual." (let ((parent1 (random-individual)) (parent2 (random-individual))) (list :no-computed-fitness (append (initial-segment (second parent1)) (tail-segment (second parent2)))))) (defun mutation () "Returns a new individual constructed by substituting one randomly generated instruction for one instruction in an existing individual." (let* ((parent-program (copy-tree (second (random-individual)))) (instruction-index (randint (length parent-program))) (new-instruction (eval (random-element *instruction-generators*)))) (unless (null parent-program) (setf (nth instruction-index parent-program) new-instruction)) (list :no-computed-fitness parent-program))) (defun insertion () "Returns a new individual constructed by sandwiching a random middle segment of one individual between random initial and tail segments of a possibly different individual. Note that the initial and tail segments might overlap or fail to include all instructions from the 'outer' individual." (let* ((program1 (second (random-individual))) (prog1-length (length program1)) (program2 (second (random-individual)))) (list :no-computed-fitness (let ((split-point (randint (+ 1 prog1-length)))) (append (butlast program1 (- prog1-length split-point)) (middle-segment program2) (nthcdr split-point program1)))))) (defun mutant-insertion () "Returns a new individual constructed by inserting a new random program within the program of an existing individual. Unlike the case with the INSERTION operator the 'outer' individual will be included in its entirety, with no possibility of duplicated parts." (let* ((program (second (random-individual))) (program-length (length program))) (list :no-computed-fitness (let ((split-point (randint (+ 1 (length program))))) (append (butlast program (- program-length split-point)) (generate-random-program) (nthcdr split-point program)))))) (defun deletion () "Returns a new individual constructed by removing some middle segment of an existing individual." (let* ((parent (random-individual)) (parent-prog (second parent)) (first-part (initial-segment parent-prog)) (second-part (tail-segment (nthcdr (length first-part) parent-prog)))) (list :no-computed-fitness (append first-part second-part)))) (defun minimization () "Returns a new individual constructed by removing gates from an existing individual. Each gate is examined for removal, starting from the beginning, and the removal is accepted if it results in an equal or higher lexicographic fitness. NOTE THAT THIS IS VERY EXPENSIVE!" (let* ((individual (random-individual)) (program (second individual)) (program-length (length program)) (fitness (funcall *fitness-function* program))) (dotimes (instr-index program-length) (let ((new-program (copy-tree program))) (setf (nth instr-index new-program) (list 'noop)) (let ((new-fitness (funcall *fitness-function* new-program))) (unless (lexicographically-better-fitness fitness new-fitness) (setq program new-program fitness new-fitness))))) (list :no-computed-fitness (remove (list 'noop) program :test #'equalp)))) (defun pair-minimization () "Like minimization but examines every *pair* of instructions for possible removal. This is even more expensive -- n^2 in the number of instructions in the chosen program." (let* ((individual (random-individual)) (program (second individual)) (program-length (length program)) (fitness (funcall *fitness-function* program))) (dotimes (instr-index-1 program-length) (dotimes (instr-index-2 program-length) (when (not (= instr-index-1 instr-index-2)) (let ((new-program (copy-tree program))) (setf (nth instr-index-1 new-program) (list 'noop)) (setf (nth instr-index-2 new-program) (list 'noop)) (let ((new-fitness (funcall *fitness-function* new-program))) (unless (lexicographically-better-fitness fitness new-fitness) (setq program new-program fitness new-fitness))))))) (list :no-computed-fitness (remove (list 'noop) program :test #'equalp)))) (defun emigration () "Copies an individual from the current deme to *migrants* for later immigration. Returns nil." (push (copy-tree (random-individual)) *migrants*) nil) (defun immigration () "Attempts to return an individual from *migrants*, which is deleted from *migrants* upon success; if none is available this returns nil." (unless (null *migrants*) (pop *migrants*))) (defun best-so-far () "Returns a copy of the best individual seen so far (*Best-Individual*). Use this with caution since it circumventes the deme boundaries." *Best-Individual*) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; demo #| This is a fairly silly demo, included just to demonstrate lgp's features. The basic idea derives from the game of blackjack, in which one is dealt cards one at a time in an attempt to get a total of 21. In real blackjack you want to stay under 21, so 20 is pretty good but 22 is useless. Here we just want to get exactly the target value, and we'll use a higher value just to make it a little harder. And in the real game the cards are dealt one at a time and the player's task is to bet and decide when to stop requesting new cards (which are requested by saying "hit me"). Here, we are instead workin on the rather different task of trying to evolve sequences of cards that add up to the target value, and there's no dealing or stopping involved. So the analogy is a bit twisted. |# ;; First we define some variables and utilites for the demo application. (defvar *hand* 0 "Holds the total of the cards that have been dealt so far.") (defparameter *desired-total* 121 "The number that we want *hand* to contain after the cards are dealt.") (defun hit-me (n) "Adds n to *hand*." (incf *hand* n)) (defun deal-a-card () "Returns a form which, when evaluated, deals a random card." (list 'hit-me (+ 1 (randint 13)))) (defun demo-fitness (program) "Evaluates the fitness of a program for the demo. The first fitness component is the 'error' in the total value, and the second is the length of the program." (list (progn (setq *hand* 0) (execute-program program) (abs (- *hand* *desired-total*))) (length program))) ;; Now we give *gp-params* an appropriate value for the demo: (setq *gp-params* '((*instruction-generators* ((deal-a-card))) (*genetic-operators* ((1/20 reproduction) (1/10 crossover) (1/10 mutation) (1/10 insertion) (1/10 mutant-insertion) (1/10 deletion) (1/10 minimization) (1/10 pair-minimization) (1/20 best-so-far) (1/10 immigration) (1/10 emigration))) (*population-size* 200) (*demes* 5) (*selection-tournament-size* 5) (*replacement-tournament-size* 5) (*best-individual* ((999999999 999999999) nil)) (*fitness-function* demo-fitness) (*halting-fitness* (0 13)) (*report-every* 1000) (*max-initial-program-length* 10) (*max-program-length* 50) (*random-seeds* (0 0)) (*lexicographic-fitness-epsilon* 0) (*initialization-forms* nil) )) ;; Evaluate (evolve) to run the demo.