AMAP: automatically mining abbreviation expansions in programs to enhance software maintenance tools
Source
International Conference on Software Engineering
Proceedings of the 2008 international working conference on Mining software repositories
Leipzig, Germany
SESSION: Mining 2
Pages 79-88  
Year of Publication: 2008
ISBN: 978-1-60558-024-1
Authors
Emily Hill, University of Delaware, Newark, DE, USA
Zachary P. Fry, University of Delaware, Newark, DE, USA
Haley Boyd, University of Delaware, Newark, DE, USA
Giriprasad Sridhara, University of Delaware, Newark, DE, USA
Yana Novikova, University of Delaware, Newark, DE, USA
Lori Pollock, University of Delaware, Newark, DE, USA
K. Vijay-Shanker, University of Delaware, Newark, DE, USA
Sponsors
SIGSOFT: ACM Special Interest Group on Software Engineering
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1370750.1370771

ABSTRACT

When writing software, developers often employ abbreviations in identifier names. In fact, some abbreviations may never occur alongside the expanded word, or may occur in the code more often than the expanded word does. However, most existing program comprehension and search tools do little to address the problem of abbreviations, and therefore may miss meaningful pieces of code or relationships between software artifacts. In this paper, we present an automated approach to mining abbreviation expansions from source code to enhance software maintenance tools that utilize natural language information. Our scoped approach uses contextual information at the method, program, and general software level to automatically select the most appropriate expansion for a given abbreviation. We evaluated our approach on a set of 250 potential abbreviations and found that our scoped approach provides a 57% improvement in accuracy over the current state of the art.
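
To make the idea of a scoped lookup concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes that each scope (method, program, general software) is available as a plain list of words, that the narrowest scope containing any candidate wins, and that ties within a scope are broken by frequency. The matching rule in `is_candidate` (same first letter, abbreviation letters appearing in order) and the names `is_candidate` and `expand` are hypothetical choices made for illustration only.

```python
from collections import Counter


def is_candidate(abbr, word):
    """Hypothetical matching rule (an assumption, not the paper's patterns):
    the word is longer than the abbreviation, starts with the same letter,
    and contains the abbreviation's letters in order."""
    abbr, word = abbr.lower(), word.lower()
    if len(word) <= len(abbr) or word[0] != abbr[0]:
        return False
    remaining = iter(word)
    # `ch in remaining` consumes the iterator up to the match,
    # so this checks that abbr is a subsequence of word.
    return all(ch in remaining for ch in abbr)


def expand(abbr, method_words, program_words, general_words):
    """Search scopes from narrowest to widest and return
    (expansion, scope_name), or (None, None) if nothing matches."""
    scopes = (
        ("method", method_words),
        ("program", program_words),
        ("general", general_words),
    )
    for scope_name, words in scopes:
        counts = Counter(w.lower() for w in words if is_candidate(abbr, w))
        if counts:
            # Pick the most frequent candidate within this scope.
            best, _ = counts.most_common(1)[0]
            return best, scope_name
    return None, None


if __name__ == "__main__":
    method_words = ["click", "handler"]
    program_words = ["button", "button", "batten"]
    general_words = ["between", "configuration"]
    print(expand("btn", method_words, program_words, general_words))
    print(expand("cfg", method_words, program_words, general_words))
```

In this sketch, a word found in the enclosing method is preferred over a more frequent word found only at the program level, which is one simple way to let local context override global frequency.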

