DOI: 10.1145/1370175.1370226
research-article

Developing natural language-based program analyses and tools to expedite software maintenance

Published: 10 May 2008

ABSTRACT

With as much as 60-90% of software life cycle resources spent on program maintenance, there is a critical need for automated software tools to help explore and understand today's large and complex software. One important source of information software maintenance tools can draw from is lexical information in comments and identifiers. Identifier names often communicate a programmer's intent when writing code, and help developers map real-world concepts to code during comprehension. My dissertation will develop specialized information retrieval techniques and natural language analyses for software so that software maintenance tools can take full advantage of the wealth of information in program identifiers, and integrate these techniques into software tools to expedite the maintenance activities of program exploration, concern location, and fault localization.

Published in

ICSE Companion '08: Companion of the 30th international conference on Software engineering
May 2008
214 pages
ISBN: 9781605580791
DOI: 10.1145/1370175

Copyright © 2008 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 276 of 1,856 submissions (15%)
