Detecting similar Java classes using tree algorithms
Source
International Conference on Software Engineering
Proceedings of the 2006 international workshop on Mining software repositories
Shanghai, China
SESSION: Matching
Pages: 65-71
Year of Publication: 2006
ISBN: 1-59593-397-2
ABSTRACT
Similarity analysis of source code is helpful during development, for instance to provide better support for code reuse. Consider a development environment that analyzes code as it is typed and suggests similar code examples or existing implementations from a source code repository. Mining software repositories by means of similarity measures enables and enforces the reuse of existing code and reduces development effort by creating a shared knowledge base of code fragments. In information retrieval, similarity measures are often used to find documents similar to a given query document. This paper extends this idea to source code repositories. It introduces our approach to detecting similar Java classes in software projects using tree similarity algorithms. We show how our approach finds similar Java classes, based on an evaluation of three tree-based similarity measures in the context of five user-defined test cases as well as a preliminary software evolution analysis of a medium-sized Java project. Initial results of our technique indicate that it (1) is indeed useful for identifying similar Java classes, (2) successfully identifies the ex ante and ex post versions of refactored classes, and (3) provides some interesting insights into within-version and between-version dependencies of classes within a Java project.
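The abstract does not name the three tree-based measures, so the following is only an illustrative sketch of what a tree similarity measure over class structure can look like, not the authors' method. It assumes a hypothetical `Node` type in which a Java class is modeled as an ordered, labeled tree (class, then fields and methods, then parameters), and it scores two trees by the size of a top-down common subtree, normalized by the combined tree sizes. The labels such as `field:int` and `method:deposit` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical ordered, labeled tree node standing in for a class element."""
    label: str
    children: tuple = ()


def size(node):
    """Number of nodes in the tree rooted at `node`."""
    return 1 + sum(size(child) for child in node.children)


def common_nodes(a, b):
    """Node count of the largest top-down common subtree of ordered trees a and b.

    The roots must carry the same label; children are then aligned in order
    with a longest-common-subsequence style dynamic program.
    """
    if a.label != b.label:
        return 0
    m, n = len(a.children), len(b.children)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = max(
                table[i - 1][j],  # skip a child of a
                table[i][j - 1],  # skip a child of b
                table[i - 1][j - 1]
                + common_nodes(a.children[i - 1], b.children[j - 1]),
            )
    return 1 + table[m][n]


def similarity(a, b):
    """Normalized similarity in [0, 1]; 1.0 means the trees are identical."""
    return 2 * common_nodes(a, b) / (size(a) + size(b))


# Two invented class outlines that differ in one method.
account = Node("class", (
    Node("field:int"),
    Node("method:deposit", (Node("param:int"),)),
    Node("method:getBalance"),
))
wallet = Node("class", (
    Node("field:int"),
    Node("method:deposit", (Node("param:int"),)),
    Node("method:close"),
))

print(similarity(account, wallet))
```

In this sketch the two outlines share four of their five nodes each, so structurally close classes score near 1 while classes with different root labels score 0. A real tool would derive the trees from parsed source rather than hand-written labels.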