DOI: 10.1145/1390630.1390647
research-article

A metric for software readability

Published: 20 July 2008

Abstract

In this paper, we explore the concept of code readability and investigate its relation to software quality. With data collected from human annotators, we derive associations between a simple set of local code features and human notions of readability. Using those features, we construct an automated readability measure and show that it can be 80% effective, and better than a human on average, at predicting readability judgments. Furthermore, we show that this metric correlates strongly with two traditional measures of software quality, code changes and defect reports. Finally, we discuss the implications of this study for programming language design and engineering practice. For example, our data suggest that comments, in and of themselves, are less important than simple blank lines to local judgments of readability.
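
To make the idea of "local code features" concrete, here is a minimal Python sketch that computes a few surface features of a code snippet. It is an illustration only: the function name `local_features`, the four features chosen, and the crude comment heuristic are assumptions made for this example, not the feature set or the model used in the paper.

```python
def local_features(snippet: str) -> dict:
    """Compute a few hypothetical surface features of a code snippet."""
    lines = snippet.splitlines()
    total = len(lines) or 1  # avoid division by zero on empty input
    # A line counts as blank if it holds only whitespace.
    blank = sum(1 for line in lines if not line.strip())
    # Crude Java-style heuristic (an assumption): comment lines start
    # with '//', '/*' or '*' after leading whitespace is removed.
    comment = sum(
        1 for line in lines if line.strip().startswith(("//", "/*", "*"))
    )
    lengths = [len(line) for line in lines] or [0]
    return {
        "avg_line_length": sum(lengths) / total,
        "max_line_length": max(lengths),
        "blank_line_ratio": blank / total,
        "comment_line_ratio": comment / total,
    }


example = (
    "// add two numbers\n"
    "int sum(int a, int b) {\n"
    "\n"
    "    return a + b;\n"
    "}\n"
)
features = local_features(example)
print(features)
```

Feature vectors of this kind could then be paired with human readability ratings to train a classifier; that step is not shown here because the abstract does not specify the learner or the training setup.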

Information & Contributors

Information

Published In

ISSTA '08: Proceedings of the 2008 international symposium on Software testing and analysis
July 2008
324 pages
ISBN:9781605580500
DOI:10.1145/1390630
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. FindBugs
  2. code metrics
  3. machine learning
  4. program understanding
  5. software maintenance
  6. software readability

Qualifiers

  • Research-article

Conference

ISSTA '08
Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Bibliometrics

  • Downloads (last 12 months): 129
  • Downloads (last 6 weeks): 18
Reflects downloads up to 20 Jan 2025

Cited By

  • (2024) Reassessing Java Code Readability Models with a Human-Centered Approach. Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension, pages 225-235. DOI: 10.1145/3643916.3644435. Online publication date: 15-Apr-2024
  • (2024) Enhancing Identifier Naming Through Multi-Mask Fine-Tuning of Language Models of Code. 2024 IEEE International Conference on Source Code Analysis and Manipulation (SCAM), pages 71-82. DOI: 10.1109/SCAM63643.2024.00017. Online publication date: 7-Oct-2024
  • (2024) Replication of a Study about the Impact of Method Chaining and Comments on Readability and Comprehension. 2024 4th International Conference on Code Quality (ICCQ), pages 35-52. DOI: 10.1109/ICCQ60895.2024.10576941. Online publication date: 22-Jun-2024
  • (2024) Reproducibility of issues reported in stack overflow questions: Challenges, impact & estimation. Journal of Systems and Software, 217 (112158). DOI: 10.1016/j.jss.2024.112158. Online publication date: Nov-2024
  • (2024) An eye tracking study assessing source code readability rules for program comprehension. Empirical Software Engineering, 29:6. DOI: 10.1007/s10664-024-10532-x. Online publication date: 5-Oct-2024
  • (2024) Investigating the readability of test code. Empirical Software Engineering, 29:2. DOI: 10.1007/s10664-023-10390-z. Online publication date: 26-Feb-2024
  • (2024) An Interactive Tool to Improve Program Readability for Novice Students. New Technology in Education and Training, pages 145-157. DOI: 10.1007/978-981-97-3883-0_13. Online publication date: 15-Aug-2024
  • (2023) User-Customizable Transpilation of Scripting Languages. Proceedings of the ACM on Programming Languages, 7:OOPSLA1, pages 201-229. DOI: 10.1145/3586034. Online publication date: 6-Apr-2023
  • (2023) Cross-Project Transfer Learning on Lightweight Code Semantic Graphs for Defect Prediction. International Journal of Software Engineering and Knowledge Engineering, 33:07, pages 1095-1117. DOI: 10.1142/S0218194023500262. Online publication date: 6-Jul-2023
  • (2023) Spork: Structured Merge for Java With Formatting Preservation. IEEE Transactions on Software Engineering, 49:1, pages 64-83. DOI: 10.1109/TSE.2022.3143766. Online publication date: 1-Jan-2023
