ABSTRACT
Data-intensive computing has positioned itself as a valuable programming paradigm to efficiently approach problems requiring processing very large volumes of data. This paper presents a pilot study about how to apply the data-intensive computing paradigm to evolutionary computation algorithms. Two representative cases (selectorecombinative genetic algorithms and estimation of distribution algorithms) are presented, analyzed, and discussed. This study shows that equivalent data-intensive computing evolutionary computation algorithms can be easily developed, providing robust and scalable algorithms for the multicore-computing era. Experimental results show how such algorithms scale with the number of available cores without further modification.
- Alba, E., Ed. Parallel Metaheuristics. Wiley, 2007.Google Scholar
- Amdahl, G. Validity of the single processor approach to achieving large-scale computing capabilities. In AFIPS Conference Proceedings (1967), pp. 483--485. Google ScholarDigital Library
- Beckett, D. RDF/XM Syntax Specification (Revised). W3C Recommendation 10 February 2004, The World Wide Web Consortium, 2004.Google Scholar
- Beynon, M.D., Kurc, T., Sussman, A., and Saltz, J. Design of a framework for data-intensive wide-area applications. In HCW '00: Proceedings of the 9th Heterogeneous Computing Workshop (Washington, DC, USA, 2000), IEEE Computer Society, p. 116. Google ScholarDigital Library
- Brickley, D., and Guha, R. RDF Vocabulary Description Language 1.0: RDF Schema. W3C Recommendation 10 February 2004, The World Wide Web Consortium, 2004.Google Scholar
- Cantú-Paz, E. Efficient and Accurate Parallel Genetic Algorithms. Springer, 2000. Google ScholarDigital Library
- De Jong, K., and Sarma, J. On decentralizing selection algorithms. In Proceedings of the Sixth International Conference on Genetic Algorithms (1995), Morgan Kaufmann, pp. 17--23. Google ScholarDigital Library
- Dean, J., and Ghemawat, S. MapReduce: Simplified Data Processing on Large Clusters. In OSDI'04: Sixth Symposium on Operating System Design and Implementation (2004). Google ScholarDigital Library
- Foster, I. Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Addison Wesley, 1995. Google ScholarDigital Library
- Foster, I. The virtual data grid: A new model and architecture for data-intensive collaboration. In in the 15th International Conference on Scientific and Statistical Database Management (2003), pp. 11--. Google ScholarDigital Library
- Giacobini, M., Tomassini, M., and Tettamanzi, A. Takeover time curves in random and small-world structured populations. In GECCO '05: Proceedings of the 2005 conference on Genetic and evolutionary computation (New York, NY, USA, 2005), ACM, pp. 1333--1340. Google ScholarDigital Library
- Goldberg, D.E. Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading, MA, 1989. Google ScholarDigital Library
- Goldberg, D.E. The Design of Innovation: Lessons from and for Competent Genetic Algorithms. Kluwer Academic Publishers, Norwell, MA, 2002. Google ScholarDigital Library
- Goldberg, D.E., Deb, K., and Clark, J.H. Genetic algorithms, noise, and the sizing of populations. Complex Systems 6 (1992), 333--362.Google Scholar
- Goldberg, D.E., Korb, B., and Deb, K. Messy genetic algorithms: Motivation, analysis, and first results. Complex Systems 3, 5 (1989), 493--530.Google Scholar
- Harik, G.R., Lobo, F.G., and Sastry, K. Linkage learning via probabilistic modeling in the ECGA. In Scalable Optimization via Probabilistic Modeling: From Algorithms to Applications, M. Pelikan, K. Sastry, and E. Cantú-Paz, Eds. Springer, Berlin, in press, ch. 3.Google Scholar
- Larrañaga, P., and Lozano, J.A., Eds. Estimation of Distribution Algorithms. Kluwer Academic Publishers, Boston, MA, 2002.Google ScholarCross Ref
- Llorà, X. E2K: evolution to knowledge. SIGEVOlution 1, 3 (2006), 10--17. Google ScholarDigital Library
- Llorà, X. Genetic Based Machine Learning using Fine-grained Parallelism for Data Mining. PhD thesis, Enginyeria i Arquitectura La Salle. Ramon Llull University, Barcelona, February, 2002.Google Scholar
- Llorà, X., Ács, B., Auvil, L., Capitanu, B., Welge, M., and Goldberg, D. E. Meandre: Semantic-driven data-intensive flows in the clouds. In Proceedings of the 4th IEEE International Conference on e-Science (2008), IEEE press, pp. 238--245. Google ScholarDigital Library
- Mattmann, C.A., Crichton, D.J., Medvidovic, N., and Hughes, S. A software architecture-based framework for highly distributed and data intensive scientific applications. In ICSE '06: Proceedings of the 28th international conference on Software engineering (New York, NY, USA, 2006), ACM, pp. 721--730. Google ScholarDigital Library
- Morrison, J.P. Flow-Based Programming: A New Approach to Application Development. Van Nostrand Reinhold, 1994.Google Scholar
- Pelikan, M., Goldberg, D.E., and Cantú-Paz, E. Linkage learning, estimation distribution, and Bayesian networks. Evolutionary Computation 8, 3 (2000), 314--341. Google ScholarDigital Library
- Pelikan, M., Lobo, F., and Goldberg, D.E. A survey of optimization by building and using probabilistic models. Computational Optimization and Applications 21 (2002), 5--20. (Also IlliGAL Report No. 99018). Google ScholarDigital Library
- Sarma, J., and De Jong, K. An analysis of local selection algorithms in a spatially structured evolutionary algorithm. In Proceedings of the Seventh International Conference on Genetic Algorithms (1997), Morgan Kaufmann, pp. 181--186.Google Scholar
- Sarma, J., and De Jong, K. Selection pressure and performance in spatially distributed evolutionary algorithms. In Proceedings of the World Congress on Computatinal Intelligence (1998), IEEE Press, pp. 553--557.Google ScholarCross Ref
- Sastry, K., and Goldberg, D.E. Designing competent mutation operators via probabilistic model building of neighborhoods. Proceedings of the Genetic and Evolutionary Computation Conference 2 (2004), 114--125.Google ScholarCross Ref
- Sywerda, G. Uniform crossover in genetic algorithms. In Proceedings of the third international conference on Genetic algorithms (San Francisco, CA, USA, 1989), Morgan Kaufmann Publishers Inc., pp. 2--9. Google ScholarDigital Library
- Uysal, M., Kurc, T.M., Sussman, A., and Saltz, J. A performance prediction framework for data intensive applications on large scale parallel machines. In 4th Wkshp. on Languages, Compilers and Run-time Systems for Scalable Computers, Lecture Notes in Computer Science No 1511 (1998), Springer-Verlag, pp. 243--258. Google ScholarDigital Library
- Weibel, S., Kunze, J., Lagoze, C., and Wolf, M. Dublin Core Metadata for Resource Discovery. Tech. Rep. RFC2413, The Dublin Core Metadata Initiative, 2008. Google ScholarDigital Library
- Welge, M., Auvil, L., Shirk, A., Bushell, C., Bajcsy, P., Cai, D., Redman, T., Clutter, D., Aydt, R., and Tcheng, D. Data to Knowledge (D2K). Tech. rep., Technical Report Automated Learning Group, National Center for Supercomputing Applications, UIUC, 2003.Google Scholar
Index Terms
- Data-intensive computing for competent genetic algorithms: a pilot study using meandre
Recommendations
Neural network crossover in genetic algorithms using genetic programming
AbstractThe use of genetic algorithms (GAs) to evolve neural network (NN) weights has risen in popularity in recent years, particularly when used together with gradient descent as a mutation operator. However, crossover operators are often omitted from ...
Using messy genetic algorithms for solving the winner determination problem
GECCO '10: Proceedings of the 12th annual conference companion on Genetic and evolutionary computationThe paper presents a study on the application of messy genetic algorithms for the winner determination problem in the combinatorial auction realm. Messy genetic algorithms operate explicitly on building blocks in order to obtain good solutions. Since ...
Genetic algorithms for task scheduling problem
The scheduling and mapping of the precedence-constrained task graph to processors is considered to be the most crucial NP-complete problem in parallel and distributed computing systems. Several genetic algorithms have been developed to solve this ...
Comments