skip to main content
10.1145/1569901.1570087acmconferencesArticle/Chapter ViewAbstractPublication PagesgeccoConference Proceedingsconference-collections
research-article

Data-intensive computing for competent genetic algorithms: a pilot study using meandre

Published:08 July 2009Publication History

ABSTRACT

Data-intensive computing has positioned itself as a valuable programming paradigm to efficiently approach problems requiring processing very large volumes of data. This paper presents a pilot study about how to apply the data-intensive computing paradigm to evolutionary computation algorithms. Two representative cases (selectorecombinative genetic algorithms and estimation of distribution algorithms) are presented, analyzed, and discussed. This study shows that equivalent data-intensive computing evolutionary computation algorithms can be easily developed, providing robust and scalable algorithms for the multicore-computing era. Experimental results show how such algorithms scale with the number of available cores without further modification.

References

  1. Alba, E., Ed. Parallel Metaheuristics. Wiley, 2007.Google ScholarGoogle Scholar
  2. Amdahl, G. Validity of the single processor approach to achieving large-scale computing capabilities. In AFIPS Conference Proceedings (1967), pp. 483--485. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Beckett, D. RDF/XM Syntax Specification (Revised). W3C Recommendation 10 February 2004, The World Wide Web Consortium, 2004.Google ScholarGoogle Scholar
  4. Beynon, M.D., Kurc, T., Sussman, A., and Saltz, J. Design of a framework for data-intensive wide-area applications. In HCW '00: Proceedings of the 9th Heterogeneous Computing Workshop (Washington, DC, USA, 2000), IEEE Computer Society, p. 116. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Brickley, D., and Guha, R. RDF Vocabulary Description Language 1.0: RDF Schema. W3C Recommendation 10 February 2004, The World Wide Web Consortium, 2004.Google ScholarGoogle Scholar
  6. Cantú-Paz, E. Efficient and Accurate Parallel Genetic Algorithms. Springer, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. De Jong, K., and Sarma, J. On decentralizing selection algorithms. In Proceedings of the Sixth International Conference on Genetic Algorithms (1995), Morgan Kaufmann, pp. 17--23. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Dean, J., and Ghemawat, S. MapReduce: Simplified Data Processing on Large Clusters. In OSDI'04: Sixth Symposium on Operating System Design and Implementation (2004). Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Foster, I. Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Addison Wesley, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Foster, I. The virtual data grid: A new model and architecture for data-intensive collaboration. In in the 15th International Conference on Scientific and Statistical Database Management (2003), pp. 11--. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Giacobini, M., Tomassini, M., and Tettamanzi, A. Takeover time curves in random and small-world structured populations. In GECCO '05: Proceedings of the 2005 conference on Genetic and evolutionary computation (New York, NY, USA, 2005), ACM, pp. 1333--1340. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Goldberg, D.E. Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading, MA, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Goldberg, D.E. The Design of Innovation: Lessons from and for Competent Genetic Algorithms. Kluwer Academic Publishers, Norwell, MA, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Goldberg, D.E., Deb, K., and Clark, J.H. Genetic algorithms, noise, and the sizing of populations. Complex Systems 6 (1992), 333--362.Google ScholarGoogle Scholar
  15. Goldberg, D.E., Korb, B., and Deb, K. Messy genetic algorithms: Motivation, analysis, and first results. Complex Systems 3, 5 (1989), 493--530.Google ScholarGoogle Scholar
  16. Harik, G.R., Lobo, F.G., and Sastry, K. Linkage learning via probabilistic modeling in the ECGA. In Scalable Optimization via Probabilistic Modeling: From Algorithms to Applications, M. Pelikan, K. Sastry, and E. Cantú-Paz, Eds. Springer, Berlin, in press, ch. 3.Google ScholarGoogle Scholar
  17. Larrañaga, P., and Lozano, J.A., Eds. Estimation of Distribution Algorithms. Kluwer Academic Publishers, Boston, MA, 2002.Google ScholarGoogle ScholarCross RefCross Ref
  18. Llorà, X. E2K: evolution to knowledge. SIGEVOlution 1, 3 (2006), 10--17. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Llorà, X. Genetic Based Machine Learning using Fine-grained Parallelism for Data Mining. PhD thesis, Enginyeria i Arquitectura La Salle. Ramon Llull University, Barcelona, February, 2002.Google ScholarGoogle Scholar
  20. Llorà, X., Ács, B., Auvil, L., Capitanu, B., Welge, M., and Goldberg, D. E. Meandre: Semantic-driven data-intensive flows in the clouds. In Proceedings of the 4th IEEE International Conference on e-Science (2008), IEEE press, pp. 238--245. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Mattmann, C.A., Crichton, D.J., Medvidovic, N., and Hughes, S. A software architecture-based framework for highly distributed and data intensive scientific applications. In ICSE '06: Proceedings of the 28th international conference on Software engineering (New York, NY, USA, 2006), ACM, pp. 721--730. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Morrison, J.P. Flow-Based Programming: A New Approach to Application Development. Van Nostrand Reinhold, 1994.Google ScholarGoogle Scholar
  23. Pelikan, M., Goldberg, D.E., and Cantú-Paz, E. Linkage learning, estimation distribution, and Bayesian networks. Evolutionary Computation 8, 3 (2000), 314--341. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Pelikan, M., Lobo, F., and Goldberg, D.E. A survey of optimization by building and using probabilistic models. Computational Optimization and Applications 21 (2002), 5--20. (Also IlliGAL Report No. 99018). Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Sarma, J., and De Jong, K. An analysis of local selection algorithms in a spatially structured evolutionary algorithm. In Proceedings of the Seventh International Conference on Genetic Algorithms (1997), Morgan Kaufmann, pp. 181--186.Google ScholarGoogle Scholar
  26. Sarma, J., and De Jong, K. Selection pressure and performance in spatially distributed evolutionary algorithms. In Proceedings of the World Congress on Computatinal Intelligence (1998), IEEE Press, pp. 553--557.Google ScholarGoogle ScholarCross RefCross Ref
  27. Sastry, K., and Goldberg, D.E. Designing competent mutation operators via probabilistic model building of neighborhoods. Proceedings of the Genetic and Evolutionary Computation Conference 2 (2004), 114--125.Google ScholarGoogle ScholarCross RefCross Ref
  28. Sywerda, G. Uniform crossover in genetic algorithms. In Proceedings of the third international conference on Genetic algorithms (San Francisco, CA, USA, 1989), Morgan Kaufmann Publishers Inc., pp. 2--9. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Uysal, M., Kurc, T.M., Sussman, A., and Saltz, J. A performance prediction framework for data intensive applications on large scale parallel machines. In 4th Wkshp. on Languages, Compilers and Run-time Systems for Scalable Computers, Lecture Notes in Computer Science No 1511 (1998), Springer-Verlag, pp. 243--258. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Weibel, S., Kunze, J., Lagoze, C., and Wolf, M. Dublin Core Metadata for Resource Discovery. Tech. Rep. RFC2413, The Dublin Core Metadata Initiative, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Welge, M., Auvil, L., Shirk, A., Bushell, C., Bajcsy, P., Cai, D., Redman, T., Clutter, D., Aydt, R., and Tcheng, D. Data to Knowledge (D2K). Tech. rep., Technical Report Automated Learning Group, National Center for Supercomputing Applications, UIUC, 2003.Google ScholarGoogle Scholar

Index Terms

  1. Data-intensive computing for competent genetic algorithms: a pilot study using meandre

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        GECCO '09: Proceedings of the 11th Annual conference on Genetic and evolutionary computation
        July 2009
        2036 pages
        ISBN:9781605583259
        DOI:10.1145/1569901

        Copyright © 2009 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 8 July 2009

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate1,669of4,410submissions,38%

        Upcoming Conference

        GECCO '24
        Genetic and Evolutionary Computation Conference
        July 14 - 18, 2024
        Melbourne , VIC , Australia

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader