Summarizing developer work history using time series segmentation: challenge report
Source
International Conference on Software Engineering
Proceedings of the 2008 International Working Conference on Mining Software Repositories
Leipzig, Germany
SESSION: Mining challenge results
Pages 137-140
Year of Publication: 2008
ISBN: 978-1-60558-024-1
Authors
Harvey Siy  University of Nebraska at Omaha, Omaha, NE, USA
Parvathi Chundi  University of Nebraska at Omaha, Omaha, NE, USA
Mahadevan Subramaniam  University of Nebraska at Omaha, Omaha, NE, USA
Sponsors
SIGSOFT: ACM Special Interest Group on Software Engineering
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1370750.1370784

ABSTRACT

Temporal segmentation partitions time series data with the intent of producing more homogeneous segments. It is a technique used to preprocess data so that subsequent time series analysis on individual segments can detect trends that may not be evident when performing time series analysis on the entire dataset.

This technique allows data miners to partition a large dataset without making any assumption of periodicity or any other a priori knowledge of the dataset's features.
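
One common way to produce such a partition is optimal k-segmentation by dynamic programming, which chooses boundaries that minimize the squared deviation of each segment from its own mean. The sketch below is illustrative only: the abstract does not say which segmentation algorithm or homogeneity measure the authors use, so the squared-error cost, the fixed segment count `k`, and the names `segment_cost` and `optimal_segmentation` are all assumptions.

```python
from typing import List, Tuple


def segment_cost(prefix: List[float], prefix_sq: List[float], i: int, j: int) -> float:
    """Sum of squared deviations from the mean for values[i:j] (assumed cost)."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def optimal_segmentation(values: List[float], k: int) -> List[Tuple[int, int]]:
    """Split values into k contiguous half-open ranges with minimal total cost."""
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums let us evaluate any segment's cost in constant time.
    prefix, prefix_sq = [0.0], [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting values[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # back[m][j]: start index of the last segment in that optimal split.
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                cost = best[m - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                if cost < best[m][j]:
                    best[m][j] = cost
                    back[m][j] = i

    # Walk the back-pointers from the end to recover the boundaries.
    bounds = []
    j = n
    for m in range(k, 0, -1):
        i = back[m][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return bounds
```

For instance, `optimal_segmentation([1, 1, 1, 10, 10, 10, 3, 3], 3)` returns `[(0, 3), (3, 6), (6, 8)]`, recovering the three flat runs without being told where they begin or how long they last. This version runs in O(k n^2) time, so very long series would need a greedy or approximate variant.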

We investigate the insights that can be gained from the application of time series segmentation to software version repositories. Software version repositories from large projects contain on the order of hundreds of thousands of timestamped entries or more. It is a continuing challenge to aggregate such data so that noise is reduced and important characteristics are brought out.

In this paper, we present a way to summarize developer work history in terms of the files they have modified over time by segmenting the CVS change data of individual Eclipse developers. We show that the files they modify tend to change significantly over time, though most of them tend to work within the same directories.
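
To make the file-versus-directory contrast concrete, the sketch below compares the sets of files and of directories a developer touches in consecutive segments. It is not the authors' procedure: the Jaccard measure, the `(period, path)` record shape, the function names, and the sample paths are all hypothetical, chosen only to illustrate how file sets can drift while directory sets stay stable.

```python
from posixpath import dirname
from typing import List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two sets: 1.0 means identical, 0.0 means disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def summarize(changes: List[Tuple[int, str]],
              segments: List[Tuple[int, int]]) -> List[Tuple[Set[str], Set[str]]]:
    """For each half-open period range, collect the files and directories touched."""
    summary = []
    for start, end in segments:
        files = {path for period, path in changes if start <= period < end}
        dirs = {dirname(path) for path in files}
        summary.append((files, dirs))
    return summary


def drift(summary: List[Tuple[Set[str], Set[str]]]) -> List[Tuple[float, float]]:
    """(file similarity, directory similarity) for each pair of adjacent segments."""
    return [
        (jaccard(f1, f2), jaccard(d1, d2))
        for (f1, d1), (f2, d2) in zip(summary, summary[1:])
    ]


# Hypothetical change records for one developer: (period index, file path).
changes = [
    (0, "ui/a.java"),
    (1, "ui/b.java"),
    (2, "ui/c.java"),
    (3, "ui/d.java"),
]
# Hypothetical segment boundaries, e.g. as produced by a segmentation step.
segments = [(0, 2), (2, 4)]

print(drift(summarize(changes, segments)))
```

With this toy input the file similarity between the two segments is 0.0 and the directory similarity is 1.0: the developer moved to entirely different files while staying in the same directory.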


