DOI: 10.1145/2831425.2831429
research-article

REU site: bio-grid initiatives for interdisciplinary research and education

Published: 15 November 2015

ABSTRACT

The Bio-Grid REU (Research Experience for Undergraduates) Site offers undergraduate students the opportunity to participate in research activities associated with the Bio-Grid Initiatives conducted at UConn. The initiatives aim to advance the application of modern computing infrastructures and information technology to research and practice in various life-science disciplines. Training seminars equip students with preliminary background knowledge, such as basic parallel programming skills, large-scale data analytics, and middleware support, and introduce ongoing life-science research projects that use these computing methods. Students then participate in research activities associated with several collaborative projects supported by a campus-wide computational and data grid. The Site was supported by the National Science Foundation from 2008 to 2010 and from 2012 to 2014.

The REU project introduces this interdisciplinary research to students early in their academic careers to spark their interest. The project aims to prepare future software engineers to formalize and solve emerging life-science problems, and to prepare future life-science researchers with a strong background in high-performance computing.

Published in

EduHPC '15: Proceedings of the Workshop on Education for High-Performance Computing
November 2015
52 pages
ISBN: 9781450339612
DOI: 10.1145/2831425

Copyright © 2015 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

EduHPC '15 paper acceptance rate: 6 of 15 submissions, 40%. Overall acceptance rate: 6 of 15 submissions, 40%.
