research-article

A metric for software readability

Authors:

Raymond P.L. Buse,

Westley R. Weimer

ISSTA '08: Proceedings of the 2008 international symposium on Software testing and analysis

Pages 121 - 130

https://doi.org/10.1145/1390630.1390647

Published: 20 July 2008

Abstract

In this paper, we explore the concept of code readability and investigate its relation to software quality. With data collected from human annotators, we derive associations between a simple set of local code features and human notions of readability. Using those features, we construct an automated readability measure and show that it can be 80% effective, and better than a human on average, at predicting readability judgments. Furthermore, we show that this metric correlates strongly with two traditional measures of software quality, code changes and defect reports. Finally, we discuss the implications of this study for programming language design and engineering practice. For example, our data suggests that comments, in and of themselves, are less important than simple blank lines to local judgments of readability.
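The abstract describes building a readability measure from simple local code features and names two of them, comments and blank lines. As a rough illustration only, the sketch below computes a few surface features of that kind for a snippet. The function name `local_features`, the choice of features, and the Java-style comment detection are all assumptions made for this example; they are not the authors' feature set or model.

```python
def local_features(snippet: str) -> dict:
    """Compute a few simple surface features of a code snippet.

    Hypothetical illustration: the feature names and definitions here
    are assumptions, not the feature set used in the paper.
    """
    lines = snippet.splitlines()
    total = len(lines) or 1  # avoid division by zero on empty input

    # Lines containing only whitespace count as blank.
    blank = sum(1 for ln in lines if not ln.strip())

    # Assumption: Java-style comments, detected by the line's leading token.
    comment = sum(
        1 for ln in lines if ln.strip().startswith(("//", "/*", "*"))
    )

    avg_len = sum(len(ln) for ln in lines) / total

    return {
        "blank_line_ratio": blank / total,
        "comment_line_ratio": comment / total,
        "avg_line_length": avg_len,
    }


if __name__ == "__main__":
    example = "int x = 1;\n\n// add one\nx = x + 1;"
    print(local_features(example))
```

In a setup like the one the abstract outlines, feature vectors of this kind would be paired with human readability ratings to train a classifier; that training step is not shown here.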

Index Terms

A metric for software readability
1. General and reference
  1. Cross-computing tools and techniques
    1. Metrics
2. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems
      1. System management
        Quality assurance

Published In

ISSTA '08: Proceedings of the 2008 international symposium on Software testing and analysis

July 2008

324 pages

ISBN:9781605580500

DOI:10.1145/1390630

General Chair: Barbara G. Ryder, Virginia Tech, USA

Program Chair: Andreas Zeller, Saarland University, Germany

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ISSTA '08

ISSTA '08: International Symposium on Software Testing and Analysis

July 20 - 24, 2008

Seattle, WA, USA

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Bibliometrics

Total Citations: 115
Total Downloads: 2,497
Downloads (Last 12 months): 129
Downloads (Last 6 weeks): 18

Reflects downloads up to 20 Jan 2025

Cited By

Sergeyuk A, Lvova O, Titov S, Serova A, Bagirov F, Kirillova E, Bryksin T, Baysal O, Linares-Vasquez M, Moran K, Steinmacher I (2024). Reassessing Java Code Readability Models with a Human-Centered Approach. Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension, pp. 225-235. DOI: 10.1145/3643916.3644435. Online publication date: 15-Apr-2024.
https://dl.acm.org/doi/10.1145/3643916.3644435
Vijayvargiya S, Saad M, Sharma T (2024). Enhancing Identifier Naming Through Multi-Mask Fine-Tuning of Language Models of Code. 2024 IEEE International Conference on Source Code Analysis and Manipulation (SCAM), pp. 71-82. DOI: 10.1109/SCAM63643.2024.00017. Online publication date: 7-Oct-2024.
https://doi.org/10.1109/SCAM63643.2024.00017
Sampaio I, Sampaio A (2024). Replication of a Study about the Impact of Method Chaining and Comments on Readability and Comprehension. 2024 4th International Conference on Code Quality (ICCQ), pp. 35-52. DOI: 10.1109/ICCQ60895.2024.10576941. Online publication date: 22-Jun-2024.
https://doi.org/10.1109/ICCQ60895.2024.10576941
Mondal S, Roy B (2024). Reproducibility of issues reported in stack overflow questions: Challenges, impact & estimation. Journal of Systems and Software, 217 (112158). DOI: 10.1016/j.jss.2024.112158. Online publication date: Nov-2024.
https://doi.org/10.1016/j.jss.2024.112158
Park K, Johnson J, Peterson C, Yedla N, Baysinger I, Aponte J, Sharif B (2024). An eye tracking study assessing source code readability rules for program comprehension. Empirical Software Engineering, 29:6. DOI: 10.1007/s10664-024-10532-x. Online publication date: 5-Oct-2024.
https://doi.org/10.1007/s10664-024-10532-x
Winkler D, Urbanke P, Ramler R (2024). Investigating the readability of test code. Empirical Software Engineering, 29:2. DOI: 10.1007/s10664-023-10390-z. Online publication date: 26-Feb-2024.
https://doi.org/10.1007/s10664-023-10390-z
Takahashi K (2024). An Interactive Tool to Improve Program Readability for Novice Students. New Technology in Education and Training, pp. 145-157. DOI: 10.1007/978-981-97-3883-0_13. Online publication date: 15-Aug-2024.
https://doi.org/10.1007/978-981-97-3883-0_13
Wang B, Kolluri A, Nikolić I, Baluta T, Saxena P (2023). User-Customizable Transpilation of Scripting Languages. Proceedings of the ACM on Programming Languages, 7:OOPSLA1, pp. 201-229. DOI: 10.1145/3586034. Online publication date: 6-Apr-2023.
https://dl.acm.org/doi/10.1145/3586034
Fang D, Liu S, Li Y (2023). Cross-Project Transfer Learning on Lightweight Code Semantic Graphs for Defect Prediction. International Journal of Software Engineering and Knowledge Engineering, 33:07, pp. 1095-1117. DOI: 10.1142/S0218194023500262. Online publication date: 6-Jul-2023.
https://doi.org/10.1142/S0218194023500262
Larsen S, Falleri J, Baudry B, Monperrus M (2023). Spork: Structured Merge for Java With Formatting Preservation. IEEE Transactions on Software Engineering, 49:1, pp. 64-83. DOI: 10.1109/TSE.2022.3143766. Online publication date: 1-Jan-2023.
https://doi.org/10.1109/TSE.2022.3143766
