DOI:10.1145/1370750.1370771
research-article

AMAP: automatically mining abbreviation expansions in programs to enhance software maintenance tools

Published: 10 May 2008

Abstract

When writing software, developers often employ abbreviations in identifier names. In fact, some abbreviations may never occur alongside their expanded word, or may occur more often in the code than the expansion itself. However, most existing program comprehension and search tools do little to address the problem of abbreviations, and therefore may miss meaningful pieces of code or relationships between software artifacts. In this paper, we present an automated approach to mining abbreviation expansions from source code to enhance software maintenance tools that utilize natural language information. Our scoped approach uses contextual information at the method, program, and general software level to automatically select the most appropriate expansion for a given abbreviation. We evaluated our approach on a set of 250 potential abbreviations and found that our scoped approach provides a 57% improvement in accuracy over the current state of the art.
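
The scoped selection described above can be illustrated with a minimal sketch. It is not the AMAP algorithm itself, whose matching rules the abstract does not give. The function names, the letters-in-order matching test, and the most-frequent-candidate choice are all assumptions made for illustration. The idea shown is only the ordering of scopes: look for an expansion in the narrowest context first (the method), and widen to the program and then to general software text only when nothing is found.

```python
import re
from collections import Counter


def split_words(text):
    """Split text into lowercase words, breaking camelCase identifiers."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", text)
    return [p.lower() for p in parts]


def is_candidate(abbr, word):
    """Assumed matching rule: the word is longer than the abbreviation,
    starts with the same letter, and contains the abbreviation's letters
    in order (not necessarily adjacent)."""
    if len(word) <= len(abbr) or word[0] != abbr[0]:
        return False
    remaining = iter(word)
    return all(ch in remaining for ch in abbr)


def expand(abbr, scopes):
    """Return the most frequent candidate from the first (narrowest)
    scope that contains one, or None if no scope has a candidate."""
    abbr = abbr.lower()
    for text in scopes:
        counts = Counter(w for w in split_words(text) if is_candidate(abbr, w))
        if counts:
            return counts.most_common(1)[0][0]
    return None


# Hypothetical scopes, ordered from narrowest to broadest.
method_text = (
    "void exportButtonPressed() { Button btn = getExportButton(); "
    "btn.setEnabled(false); }"
)
program_text = "class BatchExporter { int batchNumber; }"
general_text = "string index number button message"
scopes = [method_text, program_text, general_text]

btn_expansion = expand("btn", scopes)  # resolved at the method level
num_expansion = expand("num", scopes)  # falls through to the program level
```

Here `btn` is resolved inside the method, while `num` has no candidate there and is resolved from the surrounding program text. A real tool would need stronger matching rules and a way to break ties between equally frequent candidates.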

Published In

MSR '08: Proceedings of the 2008 international working conference on Mining software repositories
May 2008
162 pages
ISBN:9781605580241
DOI:10.1145/1370750
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. automatic abbreviation expansion
  2. program comprehension
  3. software maintenance
  4. software tools

Qualifiers

  • Research-article

Conference

ICSE '08
Cited By

  • (2024) Shortening Overlong Method Names with Abbreviations. ACM Transactions on Software Engineering and Methodology 33:8 (1-24). DOI:10.1145/3676959. Online publication date: 8-Jul-2024
  • (2024) On Using GUI Interaction Data to Improve Text Retrieval-based Bug Localization. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). DOI:10.1145/3597503.3608139. Online publication date: 20-May-2024
  • (2023) Beyond Literal Meaning: Uncover and Explain Implicit Knowledge in Code Through Wikipedia-Based Concept Linking. IEEE Transactions on Software Engineering 49:5 (3226-3240). DOI:10.1109/TSE.2023.3250029. Online publication date: 1-May-2023
  • (2023) BEQAIN: An Effective and Efficient Identifier Normalization Approach With BERT and the Question Answering System. IEEE Transactions on Software Engineering 49:4 (2597-2620). DOI:10.1109/TSE.2022.3227559. Online publication date: 1-Apr-2023
  • (2023) Generating Variable Explanations via Zero-shot Prompt Learning. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE) (748-760). DOI:10.1109/ASE56229.2023.00130. Online publication date: 11-Sep-2023
  • (2023) Automated variable renaming: are we there yet? Empirical Software Engineering 28:2. DOI:10.1007/s10664-022-10274-8. Online publication date: 14-Feb-2023
  • (2022) Retrieving data constraint implementations using fine-grained code patterns. Proceedings of the 44th International Conference on Software Engineering (1893-1905). DOI:10.1145/3510003.3510167. Online publication date: 21-May-2022
  • (2022) An Ensemble Approach for Annotating Source Code Identifiers With Part-of-Speech Tags. IEEE Transactions on Software Engineering 48:9 (3506-3522). DOI:10.1109/TSE.2021.3098242. Online publication date: 1-Sep-2022
  • (2022) Automated Expansion of Abbreviations Based on Semantic Relation and Transfer Expansion. IEEE Transactions on Software Engineering 48:2 (519-537). DOI:10.1109/TSE.2020.2995736. Online publication date: 1-Feb-2022
  • (2022) Empirical Study of Co-Renamed Identifiers. 2022 29th Asia-Pacific Software Engineering Conference (APSEC) (71-80). DOI:10.1109/APSEC57359.2022.00019. Online publication date: Dec-2022
