Developing natural language-based program analyses and tools to expedite software maintenance

Author:
Emily Hill

University of Delaware, Newark, DE, USA

ICSE Companion '08: Companion of the 30th international conference on Software engineering, May 2008, Pages 1015–1018, https://doi.org/10.1145/1370175.1370226

Published: 10 May 2008

ABSTRACT

With as much as 60-90% of software life cycle resources spent on program maintenance, there is a critical need for automated software tools to help explore and understand today's large and complex software. One important source of information software maintenance tools can draw from is lexical information in comments and identifiers. Identifier names often communicate a programmer's intent when writing code, and help developers map real-world concepts to code during comprehension. My dissertation will develop specialized information retrieval techniques and natural language analyses for software so that software maintenance tools can take full advantage of the wealth of information in program identifiers, and integrate these techniques into software tools to expedite the maintenance activities of program exploration, concern location, and fault localization.

Index Terms

- Software and its engineering
  - Software creation and management
    - Software post-development issues
      - Software reverse engineering

Published in
ICSE Companion '08: Companion of the 30th international conference on Software engineering
May 2008
214 pages
ISBN: 9781605580791
DOI: 10.1145/1370175
General Chair: Wilhelm Schäfer, University of Paderborn
Program Chairs: Matthew B. Dwyer, University of Nebraska; Volker Gruhn, University of Leipzig
Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
natural language program analysis
program exploration
software maintenance
software tools
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 276 of 1,856 submissions, 15%
