skip to main content
10.1145/1389095.1389341acmconferencesArticle/Chapter ViewAbstractPublication PagesgeccoConference Proceedingsconference-collections
research-article

The impact of population size on code growth in GP: analysis and empirical validation

Published: 12 July 2008 Publication History

Abstract

The crossover bias theory for bloat [18] is a recent result which predicts that bloat is caused by the sampling of short, unfit programs. This theory is clear and simple, but it has some weaknesses: (1) it implicitly assumes that the population is large enough to allow sampling of all relevant program sizes (although it does explain what to expect in the many practical cases where this is not true, e.g., because the population is small); (2) it does not explain what is meant by its assumption that short programs are unfit.
In this paper we discuss these weaknesses and propose a refined version of the crossover bias theory that clarifies the relationship between bloat and finite populations, and explains what features of the fitness landscape cause bloat to occur. The theory, in particular, predicts that smaller populations will bloat more slowly than larger ones. Additionally, the theory predicts that bloat will only be observed in problems where short programs are less fit than longer ones when looking at samples created by fitness-based importance sampling, i.e. samplings of the search space in which fitter programs have a higher probability of being sampled (e.g., the Metropolis-Hastings method). Experiments with two classical GP benchmarks fully corroborate the theory.

References

[1]
T. Blickle and L. Thiele. Genetic programming and redundancy. In J. Hopf, editor, Genetic Algorithms within the Framework of Evolutionary Computation (Workshop at KI-94, Saarbrcken), pages 33--38, Saarbrcken, Germany, 1994. Max-Planck-Institut fr Informatik (MPI-I-94-241).
[2]
S. Chib and E. Greenberg. Understanding the metropolis-hastings algorithm. The American Statistician, 49(4):327--335, Nov. 1995.
[3]
S. Dignum and R. Poli. Generalisation of the limiting distribution of program sizes in tree-based genetic programming and analysis of its effects on bloat. In D. Thierens, et al., editors, GECCO '07: Proceedings of the 9th annual conference on Genetic and evolutionary computation, volume 2, pages 1588--1595, London, 7-11 July 2007. ACM Press.
[4]
J. R. Koza. A genetic approach to the truck backer upper problem and the inter-twined spiral problem. In Proceedings of IJCNN International Joint Conference on Neural Networks, volume IV, pages 310--318. IEEE Press, 1992.
[5]
J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA, 1992.
[6]
W. B. Langdon. Convergence rates for the distribution of program outputs. In W. B. Langdon, et al., editors, GECCO 2002: Proceedings of the Genetic and Evolutionary Computation Conference, pages 812--819, New York, 9-13 July 2002. Morgan Kaufmann Publishers.
[7]
W. B. Langdon. How many good programs are there? How long are they? In K. A. De Jong, et al., editors, Foundations of Genetic Algorithms VII, pages 183--202, Torremolinos, Spain, 4-6 Sept. 2002. Morgan Kaufmann. Published 2003.
[8]
W. B. Langdon. Convergence of program fitness landscapes. In E. Cantú-Paz, et al., editors, Genetic and Evolutionary Computation -- GECCO-2003, volume 2724 of LNCS, pages 1702--1714, Chicago, 12-16 July 2003. Springer-Verlag.
[9]
W. B. Langdon. The distribution of reversible functions is Normal. In R. L. Riolo and B. Worzel, editors, Genetic Programming Theory and Practise, chapter 11, pages 173--188. Kluwer, 2003.
[10]
W. B. Langdon. The distribution of amorphous computer outputs. In S. Stepney and S. Emmott, editors, The Grand Challenge in Non-Classical Computation: International Workshop, York, UK, 18-19 Apr. 2005.
[11]
W. B. Langdon and R. Poli. Foundations of Genetic Programming. Springer-Verlag, 2002.
[12]
W. B. Langdon and R. Poli. The halting probability in von Neumann architectures. In P. Collet, et al., editors, Proceedings of the 9th European Conference on Genetic Programming, volume 3905 of Lecture Notes in Computer Science, pages 225--237, Budapest, Hungary, 10 - 12 Apr. 2006. Springer.
[13]
N. F. McPhee and R. Poli. A schema theory analysis of the evolution of size in genetic programming with linear representations. In J. Miller, et al., editors, Proceedings of the Fourth European Conference on Genetic Programming (EuroGP-2001), volume 2038 of LNCS, pages 108--125, Lake Como, Italy, 2001. Springer, Berlin, Heidelberg, New York. Lecture notes in Computer Science vol. 2038.
[14]
P. Nordin and W. Banzhaf. Complexity compression and evolution. In L. J. Eshelman, editor, Proceedings of the Sixth International Conference on Genetic Algorithms, pages 310--317, Pittsburgh, PA, USA, 1995. Morgan Kaufmann.
[15]
A. Piszcz and T. Soule. Dynamics of evolutionary robustness. In M. Keijzer et al., editor, GECCO 2006: Proceedings of the 8th annual conference on Genetic and evolutionary computation, volume 1, pages 871--878, Seattle, Washington, USA, 8-12 July 2006. ACM Press.
[16]
R. Poli. General schema theory for genetic programming with subtree-swapping crossover. In J. F. Miller, et al., editors, Genetic Programming, Proceedings of EuroGP 2001, LNCS, pages 143--159, Milan, 18-20 Apr. 2001. Springer-Verlag.
[17]
R. Poli and W. B. Langdon. Efficient markov chain model of machine code program execution and halting. In R. L. Riolo, et al., editors, Genetic Programming Theory and Practice IV, volume 5 of Genetic and Evolutionary Computation, chapter 13. Springer, Ann Arbor, 11-13 May 2006.
[18]
R. Poli, W. B. Langdon, and S. Dignum. On the limiting distribution of program sizes in tree-based genetic programming. In M. Ebner, et al., editors, Proceedings of the 10th European Conference on Genetic Programming, volume 4445 of Lecture Notes in Computer Science, pages 193--204, Valencia, Spain, 11 - 13 Apr. 2007. Springer.
[19]
R. Poli, W. B. Langdon, and N. F. McPhee. A field guide to genetic programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk, 2008. (With contributions by J. R. Koza).
[20]
R. Poli and N. F. McPhee. General schema theory for genetic programming with subtree-swapping crossover: Part II. Evolutionary Computation, 11(2):169--206, June 2003.
[21]
S. Silva and J. Almeida. Dynamic maximum tree depth. In E. Cantú-Paz, et al., editors, Genetic and Evolutionary Computation -- GECCO-2003, volume 2724 of LNCS, pages 1776--1787, Chicago, 12-16 July 2003. Springer-Verlag.
[22]
T. Soule and J. A. Foster. Effects of code growth and parsimony pressure on populations in genetic programming. Evolutionary Computation, 6(4):293--309, 1998.
[23]
L. Vanneschi. Theory and Practice for Efficient Genetic Programming. Ph.D. thesis, Faculty of Sciences, University of Lausanne, Switzerland, 2004.
[24]
L. Vanneschi, Y. Pirola, P. Collard, M. Tomassini, S. Verel, and G. Mauri. A quantitative study of neutrality in GP boolean landscapes. In M. Keijzer et al., editor, Proceedings of the Genetic and Evolutionary Computation Conference, GECCO'06, volume 1, pages 895--902. ACM Press, 2006.

Cited By

View all
  • (2019)Multi-objective genetic programming with partial sampling and its extension to many-objectiveSN Applied Sciences10.1007/s42452-019-0208-y1:3Online publication date: 4-Feb-2019
  • (2018)Partial Sampling Operator and Structural Distance Ranking for Multi-Objective GP2018 International Conference on Computer and Applications (ICCA)10.1109/COMAPP.2018.8460386(303-308)Online publication date: Aug-2018
  • (2018)A Multiple Expression Alignment Framework for Genetic ProgrammingGenetic Programming10.1007/978-3-319-77553-1_11(166-183)Online publication date: 2-Mar-2018
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
GECCO '08: Proceedings of the 10th annual conference on Genetic and evolutionary computation
July 2008
1814 pages
ISBN:9781605581309
DOI:10.1145/1389095
  • Conference Chair:
  • Conor Ryan,
  • Editor:
  • Maarten Keijzer
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 July 2008

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. bloat
  2. genetic programming
  3. population size

Qualifiers

  • Research-article

Conference

GECCO08
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)0
Reflects downloads up to 08 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2019)Multi-objective genetic programming with partial sampling and its extension to many-objectiveSN Applied Sciences10.1007/s42452-019-0208-y1:3Online publication date: 4-Feb-2019
  • (2018)Partial Sampling Operator and Structural Distance Ranking for Multi-Objective GP2018 International Conference on Computer and Applications (ICCA)10.1109/COMAPP.2018.8460386(303-308)Online publication date: Aug-2018
  • (2018)A Multiple Expression Alignment Framework for Genetic ProgrammingGenetic Programming10.1007/978-3-319-77553-1_11(166-183)Online publication date: 2-Mar-2018
  • (2017)The influence of population size in geometric semantic GPSwarm and Evolutionary Computation10.1016/j.swevo.2016.05.00432(110-120)Online publication date: Feb-2017
  • (2016)Parameter evaluation of geometric semantic genetic programming in pharmacokineticsInternational Journal of Bio-Inspired Computation10.1504/IJBIC.2016.0746348:1(42-50)Online publication date: 1-Feb-2016
  • (2015)Improving Maritime Awareness with Semantic Genetic Programming and Linear Scaling: Prediction of Vessels Position Based on AIS DataApplications of Evolutionary Computation10.1007/978-3-319-16549-3_59(732-744)Online publication date: 17-Mar-2015
  • (2012)Operator equalisation for bloat free genetic programming and a survey of bloat control methodsGenetic Programming and Evolvable Machines10.1007/s10710-011-9150-513:2(197-238)Online publication date: 1-Jun-2012
  • (2011)Reassembling operator equalisationACM SIGEVOlution10.1145/2043118.20431205:3(10-22)Online publication date: 1-Sep-2011
  • (2011)Reassembling operator equalisationProceedings of the 13th annual conference on Genetic and evolutionary computation10.1145/2001576.2001764(1395-1402)Online publication date: 12-Jul-2011
  • (2011)Genetic programming with one-point crossover and subtree mutation for effective problem solving and bloat controlSoft Computing - A Fusion of Foundations, Methodologies and Applications10.1007/s00500-010-0687-715:8(1551-1567)Online publication date: 1-Aug-2011
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media