DOI: 10.1145/2901739.2901741
research-article

Interactive exploration of developer interaction traces using a hidden Markov model

Published: 14 May 2016

ABSTRACT

Using IDE usage data to analyze the behavior of software developers in the field, during the course of their daily work, can lend support to (or dispute) laboratory studies of developers. This paper describes a technique that leverages Hidden Markov Models (HMMs) as a means of mining high-level developer behavior from low-level IDE interaction traces of many developers in the field. HMMs use dual stochastic processes to model higher-level hidden behavior using observable input sequences of events. We propose an interactive approach of mining interpretable HMMs, based on guiding a human expert in building a high quality HMM in an iterative, one state at a time, manner. The final result is a model that is both representative of the field data and captures the field phenomena of interest. We apply our HMM construction approach to study debugging behavior, using a large IDE interaction dataset collected from nearly 200 developers at ABB, Inc. Our results highlight the different modes and constituent actions in debugging, exhibited by the developers in our dataset.
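To make the "dual stochastic processes" idea concrete, here is a minimal sketch of a two-state HMM decoded with the standard Viterbi algorithm. It is not the interactive construction approach the paper proposes, only an illustration of how hidden modes can be inferred from observable events. The state names, event names, and all probabilities are hypothetical and are not taken from the paper or its dataset.

```python
import math

# Hypothetical hidden states (developer "modes"); illustrative only.
STATES = ["Stepping", "Editing"]

# Assumed initial distribution over hidden modes.
START = {"Stepping": 0.5, "Editing": 0.5}

# First stochastic process: transitions between hidden modes (assumed values).
TRANS = {
    "Stepping": {"Stepping": 0.8, "Editing": 0.2},
    "Editing": {"Stepping": 0.3, "Editing": 0.7},
}

# Second stochastic process: each mode emits observable IDE events.
# Event names are made up for this sketch.
EMIT = {
    "Stepping": {"Debug.StepOver": 0.6, "Debug.ToggleBreakpoint": 0.3, "Edit.Type": 0.1},
    "Editing": {"Debug.StepOver": 0.1, "Debug.ToggleBreakpoint": 0.2, "Edit.Type": 0.7},
}


def viterbi(events):
    """Return the most likely hidden-state sequence for a non-empty event list."""
    # best[s] = (log-probability, path) of the best path that ends in state s.
    best = {
        s: (math.log(START[s]) + math.log(EMIT[s][events[0]]), [s])
        for s in STATES
    }
    for event in events[1:]:
        step = {}
        for s in STATES:
            # Choose the predecessor state that maximizes the score into s.
            score, path = max(
                (best[p][0] + math.log(TRANS[p][s]), best[p][1]) for p in STATES
            )
            step[s] = (score + math.log(EMIT[s][event]), path + [s])
        best = step
    # The overall best path is the highest-scoring entry at the final step.
    return max(best.values())[1]


trace = ["Debug.StepOver", "Debug.StepOver", "Edit.Type", "Edit.Type"]
print(viterbi(trace))  # ['Stepping', 'Stepping', 'Editing', 'Editing']
```

With these assumed parameters, the decoder labels the first two events as one mode and the last two as another, which is the kind of higher-level reading of a low-level trace that the abstract describes.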

Published in

MSR '16: Proceedings of the 13th International Conference on Mining Software Repositories
May 2016
544 pages
ISBN: 9781450341868
DOI: 10.1145/2901739

Copyright © 2016 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
