skip to main content
research-article

STADS: Software Testing as Species Discovery

Published:27 June 2018Publication History
Skip Abstract Section

Abstract

A fundamental challenge of software testing is the statistically well-grounded extrapolation from program behaviors observed during testing. For instance, a security researcher who has run the fuzzer for a week has currently no means (1) to estimate the total number of feasible program branches, given that only a fraction has been covered so far; (2) to estimate the additional time required to cover 10% more branches (or to estimate the coverage achieved in one more day, respectively); or (3) to assess the residual risk that a vulnerability exists when no vulnerability has been discovered. Failing to discover a vulnerability does not mean that none exists—even if the fuzzer was run for a week (or a year). Hence, testing provides no formal correctness guarantees.

In this article, I establish an unexpected connection with the otherwise unrelated scientific field of ecology and introduce a statistical framework that models Software Testing and Analysis as Discovery of Species (STADS). For instance, in order to study the species diversity of arthropods in a tropical rain forest, ecologists would first sample a large number of individuals from that forest, determine their species, and extrapolate from the properties observed in the sample to properties of the whole forest. The estimations (1) of the total number of species, (2) of the additional sampling effort required to discover 10% more species, or (3) of the probability to discover a new species are classical problems in ecology. The STADS framework draws from over three decades of research in ecological biostatistics to address the fundamental extrapolation challenge for automated test generation. Our preliminary empirical study demonstrates a good estimator performance even for a fuzzer with adaptive sampling bias—AFL, a state-of-the-art vulnerability detection tool. The STADS framework provides statistical correctness guarantees with quantifiable accuracy.

References

  1. Gail E. Austen, Markus Bindemann, Richard A. Griffiths, and David L. Roberts. 2016. Species identification by experts and non-experts: Comparing images from field guides. Nature - Scientific Reports 6 (2016), 1--7.Google ScholarGoogle Scholar
  2. Greg Banks, Marco Cova, Viktoria Felmetsger, Kevin Almeroth, Richard Kemmerer, and Giovanni Vigna. 2006. SNOOZE: Toward a stateful network protocol fuzZEr. In Proceedings of the 9th International Conference on Information Security (ISC’06). 343--358. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. E. T. Barr, M. Harman, P. McMinn, M. Shahbaz, and S. Yoo. 2015. The oracle problem in software testing: A survey. IEEE Transactions on Software Engineering 41, 5 (May 2015), 507--525.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Yves Basset, Lukas Cizek, Philippe Cuénoud, Raphael K. Didham, François Guilhaumon, Olivier Missa, Vojtech Novotny, Frode Ødegaard, Tomas Roslin, Jürgen Schmidl, Alexey K. Tishechkin, Neville N. Winchester, David W. Roubik, Henri-Pierre Aberlenc, Johannes Bail, Héctor Barrios, Jon R. Bridle, Gabriela Castaño-Meneses, Bruno Corbara, Gianfranco Curletti, Wesley Duarte da Rocha, Domir De Bakker, Jacques H. C. Delabie, Alain Dejean, Laura L. Fagan, Andreas Floren, Roger L.Kitching, Enrique Medianero, Scott E. Miller, Evandro Gama de Oliveira, Jérôme Orivel, Marc Pollet, Mathieu Rapp, Sérvio P. Ribeiro, Yves Roisin, Jesper B. Schmidt, Line Sørensen, and Maurice Leponce. 2012. Arthropod diversity in a tropical forest. Science 338, 6113 (2012), 1481--1484.Google ScholarGoogle Scholar
  5. A. Bertolino. 2007. Software testing research: Achievements, challenges, dreams. In Future of Software Engineering (FOSE’07). 85--103. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Marcel Böhme, Bruno C. d. S. Oliveira, and Abhik Roychoudhury. 2013. Partition-based regression verification. In Proceedings of the 2013 International Conference on Software Engineering (ICSE’13). 302--311. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Marcel Böhme, Bruno C. d. S. Oliveira, and Abhik Roychoudhury. 2013. Regression tests to expose change interaction errors. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering (ESEC/FSE’13). 334--344. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Marcel Böhme and Soumya Paul. 2016. A probabilistic analysis of the efficiency of automated software testing. IEEE Transactions on Software Engineering 42, 4 (April 2016), 345--360. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. 2017. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS’17). 2329--2344. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2016. Coverage-based greybox fuzzing as Markov chain. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS’16). 1032--1043. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Zdravko I. Botev and Dirk P. Kroese. 2008. An efficient algorithm for rare-event probability estimation, combinatorial optimization, and counting. Methodology and Computing in Applied Probability 10, 4 (2008), 471--505.Google ScholarGoogle ScholarCross RefCross Ref
  12. Ulrich Brose, Neo D. Martinez, and Richard J. Williams. 2003. Estimating species richness: Sensitivity to sample coverage and insensitivity to spatial patterns. Ecology 84, 9 (2003), 2364--2377.Google ScholarGoogle ScholarCross RefCross Ref
  13. J. Bunge and M. Fitzpatrick. 1993. Estimating the number of species: A review. Journal of the American Statistical Association 88, 421 (1993), 364--373.Google ScholarGoogle ScholarCross RefCross Ref
  14. K. P. Burnham and W. S. Overton. 1979. Robust estimation of population size when capture probabilities vary among animals. Ecology 60, 5 (1979), 927--936.Google ScholarGoogle ScholarCross RefCross Ref
  15. Cristian Cadar, Daniel Dunbar, and Dawson Engler. 2008. KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI’08). 209--224. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Anne Chao. 1984. Nonparametric estimation of the number of classes in a population. Scandinavian Journal of Statistics 11, 4 (1984), 265--270.Google ScholarGoogle Scholar
  17. Anne Chao. 1987. Estimating the population size for capture-recapture data with unequal catchability. Biometrics 43, 4 (1987), 783--791.Google ScholarGoogle ScholarCross RefCross Ref
  18. Anne Chao and Chun-Huo Chiu. 2014. Species Richness: Estimation and Comparison. John Wiley 8 Sons.Google ScholarGoogle Scholar
  19. Anne Chao and Chun-Huo Chiu. 2016. Nonparametric estimation and comparison of species richness. In Encyclopedia of Life Sciences (eLS).Google ScholarGoogle Scholar
  20. Anne Chao, Chun-Huo Chiu, Robert K. Colwell, Luiz Fernando S. Magnago, Robin L. Chazdon, and Nicholas J. Gotelli. 2017. Deciphering the enigma of undetected species, phylogenetic, and functional diversity based on good-turing theory. Ecology 98, 11 (2017), 2914--2929.Google ScholarGoogle ScholarCross RefCross Ref
  21. Anne Chao and Robert K. Colwell. 2017. Thirty years of progeny from Chao’s inequality: Estimating and comparing richness with incidence data and incomplete sampling. Statistics and Operations Research Transactions 41, 1 (2017), 3--54.Google ScholarGoogle Scholar
  22. Anne Chao, Robert K. Colwell, Chih-Wei Lin, and Nicholas J. Gotelli. 2009. Sufficient sampling for asymptotic minimum species richness estimators. Ecology 90, 4 (2009), 1125--1133.Google ScholarGoogle ScholarCross RefCross Ref
  23. Anne Chao, T. C. Hsieh, Robin L. Chazdon, Robert K. Colwell, and Nicholas J. Gotelli. 2015. Unveiling the species-rank abundance distribution by generalizing the good-turing sample coverage theory. Ecology 96, 5 (2015), 1189--1201.Google ScholarGoogle ScholarCross RefCross Ref
  24. Anne Chao and Lou Jost. 2012. Coverage-based rarefaction and extrapolation: Standardizing samples by completeness rather than size. Ecology 93, 12 (2012), 2533--2547.Google ScholarGoogle ScholarCross RefCross Ref
  25. Anne Chao and Shen-Ming Lee. 1992. Estimating the number of classes via sample coverage. Journal of the American Statistical Association 87, 417 (1992), 210--217.Google ScholarGoogle ScholarCross RefCross Ref
  26. A. Chao, K. H. Ma, T. C. Hsieh, and C. H. Chiu. 2015. Online Program SpadeR (Species-richness Prediction And Diversity Estimation in R). Program and User’s Guide. Retrieved from http://chao.stat.nthu.edu.tw/wordpress/software_download.Google ScholarGoogle Scholar
  27. Anne Chao and Tsung-Jen Shen. 2004. Nonparametric prediction in species sampling. Journal of Agricultural, Biological, and Environmental Statistics 9, 3 (2004), 253--269.Google ScholarGoogle ScholarCross RefCross Ref
  28. R. L. Chazdon, R. K. Colwell, J. S. Denslow, and M. R. Guariguata. 1998. Statistical methods for estimating species richness of woody regeneration in primary and secondary rain forests of Northeastern Costa Rica. In Forest Biodiversity Research, Monitoring and Modeling: Conceptual Background and Old World Case Studies. Vol. 20. Man and the Biosphere Series. 285--309.Google ScholarGoogle Scholar
  29. Thierry Titcheu Chekam, Mike Papadakis, Yves Le Traon, and Mark Harman. 2017. An empirical study on mutation, statement and branch coverage fault revelation that avoids the unreliable clean program assumption. In Proceedings of the 39th International Conference on Software Engineering (ICSE’17). 597--608. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Tsong Yueh Chen and Yuen-Tak Yu. 1996. On the expected number of failures detected by subdomain testing and random testing. IEEE Transactions on Software Engineering 22, 2 (1996), 109--119. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Vitaly Chipounov, Volodymyr Kuznetsov, and George Candea. 2011. S2E: A platform for in-vivo multi-path analysis of software systems. In Proceedings of the 2011 ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’11). 265--278. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. C. H. Chiu, Y. T. Wang, B. A. Walther, and A. Chao. 2014. An improved nonparametric lower bound of species richness via a modified Good-Turing frequency formula. Biometrics 70, 3 (2014), 671--682.Google ScholarGoogle ScholarCross RefCross Ref
  33. Bernard D. Coleman. 1981. On random placement and species-area relations. Mathematical Biosciences 54, 3 (1981), 191--215.Google ScholarGoogle ScholarCross RefCross Ref
  34. Robert K. Colwell, Anne Chao, Nicholas J. Gotelli, Shang-Yi Lin, Chang Xuan Mao, Robin L. Chazdon, and John T. Longino. 2012. Models and estimators linking individual-based and sample-based rarefaction, extrapolation and comparison of assemblages. Journal of Plant Ecology 5, 1 (2012), 3.Google ScholarGoogle ScholarCross RefCross Ref
  35. Robert K. Colwell and Jonathan A. Coddington. 1994. Estimating terrestrial biodiversity through extrapolation. Philosophical Transactions of the Royal Society of London B: Biological Sciences 345, 1311 (1994), 101--118.Google ScholarGoogle Scholar
  36. Robert K. Colwell and Johanna E. Elsensohn. 2014. EstimateS turns 20: Statistical estimation of species richness and shared species from samples, with non-parametric extrapolation. Ecography 37, 6 (2014), 609--613.Google ScholarGoogle ScholarCross RefCross Ref
  37. Robert K. Colwell, Chang Xuan Mao, and Jing Chang. 2004. Interpolating, extrapolating, and comparing incidence-based species accumulation curves. Ecology 85, 10 (2004), 2717--2727.Google ScholarGoogle ScholarCross RefCross Ref
  38. Edsger W. Dijkstra. 1970. Notes on Structured Programming.Google ScholarGoogle Scholar
  39. Joe W. Duran and Simeon C. Ntafos. 1984. An evaluation of random testing. IEEE Transactions on Software Engineering 10, 4 (July 1984), 438--444. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. S. M. Durant, T. Wacher, S. Bashir, R. Woodroffe, P. De Ornellas, C. Ransom, J. Newby, T. Abáigar, M. Abdelgadir, H. El Alqamy, J. Baillie, M. Beddiaf, F. Belbachir, A. Belbachir-Bazi, A. A. Berbash, N. E. Bemadjim, R. Beudels-Jamar, L. Boitani, C. Breitenmoser, M. Cano, P. Chardonnet, B. Collen, W. A. Cornforth, F. Cuzin, P. Gerngross, B. Haddane, M. Hadjeloum, A. Jacobson, A. Jebali, F. Lamarque, D. Mallon, K. Minkowski, S. Monfort, B. Ndoassal, B. Niagate, G. Purchase, S. Samala, A. K. Samna, C. Sillero-Zubiri, A. E. Soultan, M. R. Stanley Price, and N. Pettorelli. 2014. Fiddling in biodiversity hotspots while deserts burn? Collapse of the Sahara’s megafauna. Diversity and Distributions 20, 1 (2014), 114--122.Google ScholarGoogle ScholarCross RefCross Ref
  41. Antonio Filieri, Corina S. Păsăreanu, and Willem Visser. 2013. Reliability analysis in symbolic pathfinder. In Proceedings of the 2013 International Conference on Software Engineering (ICSE’13). 622--631. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. R. A. Fisher, A. S. Corbet, and C. B. Williams. 1943. The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology 12 (1943), 42--58.Google ScholarGoogle ScholarCross RefCross Ref
  43. William A. Gale and Geoffrey Sampson. 1995. Good-Turing smoothing without tears. Journal of Quantitative Linguistics 2 (1995), 217--237.Google ScholarGoogle ScholarCross RefCross Ref
  44. G. Gay, M. Staats, M. Whalen, and M. P. E. Heimdahl. 2015. The risks of coverage-directed test case generation. IEEE Transactions on Software Engineering 41, 8 (Aug. 2015), 803--819.Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Jaco Geldenhuys, Matthew B. Dwyer, and Willem Visser. 2012. Probabilistic symbolic execution. In Proceedings of the 2012 International Symposium on Software Testing and Analysis (ISSTA’12). 166--176. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Patrice Godefroid, Adam Kiezun, and Michael Y. Levin. 2008. Grammar-based whitebox fuzzing. In Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’08). 206--215. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Patrice Godefroid, Nils Klarlund, and Koushik Sen. 2005. DART: Directed automated random testing. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’05). 213--223. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. I. J. Good. 2000. Turing’s anticipation of empirical Bayes in connection with the cryptanalysis of the naval enigma. Journal of Statistical Computation and Simulation 66, 2 (2000), 101--111.Google ScholarGoogle ScholarCross RefCross Ref
  49. Irving John Good. 1953. The population frequencies of species and the estimation of population parameters. Biometrika 40 (1953), 237--264.Google ScholarGoogle ScholarCross RefCross Ref
  50. I. J. Good and G. H. Toulmin. 1956. The number of new species, and the increase in population coverage, when a sample is increased. Biometrika 43, 1/2 (1956), 45--63.Google ScholarGoogle ScholarCross RefCross Ref
  51. Steven M. Goodman and Jonathan P. Benstead. 2005. Updated estimates of biotic diversity and endemism for Madagascar. Oryx 39, 1 (2005), 73--77.Google ScholarGoogle ScholarCross RefCross Ref
  52. Nicholas J. Gotelli and Anne Chao. 2013. Measuring and estimating species richness, species diversity, and biotic similarity from sampling data. In Encyclopedia of Biodiversity (2nd ed.). Vol. 5. Academic Press.195--211.Google ScholarGoogle Scholar
  53. Walter J. Gutjahr. 1999. Partition testing vs. random testing: The influence of uncertainty. IEEE Transactions on Software Engineering 25, 5 (Sept. 1999), 661--674. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Andrew J. Hamilton, Yves Basset, Kurt K. Benke, Peter S. Grimbacher, Scott E. Miller, Vojtech Novotnỳ, G. Allan Samuelson, Nigel E. Stork, George D. Weiblen, and Jian D.L. Yen. 2010. Quantifying uncertainty in estimation of tropical arthropod species richness. American Naturalist 176, 1 (2010), 90--95.Google ScholarGoogle ScholarCross RefCross Ref
  55. Andrew J. Hamilton, Yves Basset, Kurt K. Benke, Peter S. Grimbacher, Scott E. Miller, Vojtech Novotnỳ, G. Allan Samuelson, Nigel E. Stork, George D. Weiblen, and Jian D. L. Yen. 2011. Correction. American Naturalist 177, 4 (2011), 544--545.Google ScholarGoogle ScholarCross RefCross Ref
  56. D. Hamlet and R. Taylor. 1990. Partition testing does not inspire confidence {program testing}. IEEE Transactions on Software Engineering 16, 12 (Dec. 1990), 1402--1411. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. Dick Hamlet and Jeff Voas. 1993. Faults on its sleeve: Amplifying software reliability testing. In Proceedings of the 1993 ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’93). 89--98. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. Mary Jean Harrold. 2000. Testing: A roadmap. In Proceedings of the Conference on the Future of Software Engineering (ICSE’00). 61--72.Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. Frank J. Hernandez and David G. Lindquist. 1999. A comparison of two light-trap designs for sampling larval and presettlement juvenile fish above a reef in Onslow Bay, North Carolina. Bulletin of Marine Science 64, 1 (1999), 173--184.Google ScholarGoogle Scholar
  60. Joaquin Hortal, Paulo A. V. Borges, and Clara Gaspar. 2006. Evaluating the performance of species richness estimators: Sensitivity to sample grain size. Journal of Animal Ecology 75, 1 (2006), 274--287.Google ScholarGoogle ScholarCross RefCross Ref
  61. Matthias Höschele and Andreas Zeller. 2016. Mining input grammars from dynamic taints. In Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering (ASE’16). 720--725. Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. T. C. Hsieh, K. H. Ma, and Anne Chao. 2016. iNEXT: An R package for rarefaction and extrapolation of species diversity (Hill numbers). Methods in Ecology and Evolution 7, 12 (2016), 1451--1456.Google ScholarGoogle ScholarCross RefCross Ref
  63. Stuart H. Hurlbert. 1971. The nonconcept of species diversity: A critique and alternative parameters. Ecology 52, 4 (1971), 577--586.Google ScholarGoogle ScholarCross RefCross Ref
  64. Laura Inozemtseva and Reid Holmes. 2014. Coverage is not strongly correlated with test suite effectiveness. In Proceedings of the 36th International Conference on Software Engineering (ICSE’14). 435--445. Google ScholarGoogle ScholarDigital LibraryDigital Library
  65. E. T. Jaynes. 2003. Probability Theory: The Logic of Science. Cambridge University Press.Google ScholarGoogle ScholarCross RefCross Ref
  66. Y. Jia and M. Harman. 2011. An analysis and survey of the development of mutation testing. IEEE Transactions on Software Engineering 37, 5 (Sept. 2011), 649--678. Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. Shen-Ming Lee and Anne Chao. 1994. Estimating population size via sample coverage for closed capture-recapture models. Biometrics 50, 1 (1994), 88--97.Google ScholarGoogle ScholarCross RefCross Ref
  68. B. Littlewood and D. Wright. 1997. Some conservative stopping rules for the operational testing of safety critical software. IEEE Transactions on Software Engineering 23, 11 (Nov. 1997), 673--683. Google ScholarGoogle ScholarDigital LibraryDigital Library
  69. John T. Longino and Robert K. Colwell. 1997. Biodiversity assessment using structured inventory: Capturing the ant fauna of a tropical rain forest. Ecological Applications 7, 4 (1997), 1263--1277.Google ScholarGoogle ScholarCross RefCross Ref
  70. Anne E. Magurran and Brian J. McGill. 2011. Biological Diversity: Frontiers in Measurement and Assessment. Oxford University Press.Google ScholarGoogle Scholar
  71. Chang Xuan Mao and Robert K. Colwell. 2005. Estimation of species richness: Mixture models, the role of rare species, and inferential challenges. Ecology 86, 5 (2005), 1143--1153.Google ScholarGoogle ScholarCross RefCross Ref
  72. Ke Mao, Mark Harman, and Yue Jia. 2016. Sapienz: Multi-objective automated testing for android applications. In Proceedings of the 25th International Symposium on Software Testing and Analysis (ISSTA’16). 94--105. Google ScholarGoogle ScholarDigital LibraryDigital Library
  73. Björn Matthis, Vitalii Avdiienko, Ezekiel Soremekun, Marcel Böhme, and Andreas Zeller. 2017. Detecting information flow by mutating input data. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE’17). 1--11.Google ScholarGoogle ScholarCross RefCross Ref
  74. Brian A. Maurer, James H. Brown, and Renee D. Rusler. 1992. The micro and macro in body size evolution. Evolution 46, 4 (1992), 939--953.Google ScholarGoogle ScholarCross RefCross Ref
  75. Phil McMinn. 2004. Search-based software test data generation: A survey: Research articles. Journal of Software Testing, Verification and Reliability 14, 2 (June 2004), 105--156. Google ScholarGoogle ScholarDigital LibraryDigital Library
  76. P. McMinn. 2011. Search-based software testing: Past, present and future. In Proceedings of the 4th IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW’11). 153--163. Google ScholarGoogle ScholarDigital LibraryDigital Library
  77. Barton P. Miller, Louis Fredriksen, and Bryan So. 1990. An empirical study of the reliability of UNIX utilities. Communications of the ACM 33, 12 (Dec. 1990), 32--44. Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. K. W. Miller, L. J. Morell, R. E. Noonan, S. K. Park, D. M. Nicol, B. W. Murrill, and M. Voas. 1992. Estimating the probability of failure when testing reveals no failures. IEEE Transactions on Software Engineering 18, 1 (Jan. 1992), 33--43. Google ScholarGoogle ScholarDigital LibraryDigital Library
  79. Camilo Mora, Derek P. Tittensor, Sina Adl, Alastair G. B. Simpson, and Boris Worm. 2011. How many species are there on earth and in the ocean? PLOS Biology 9, 8 (2011), 1--8.Google ScholarGoogle ScholarCross RefCross Ref
  80. Mesrob I. Ohannessian. 2012. On Inference about Rare Events. PhD dissertation. Massachusetts Institute of Technology. Google ScholarGoogle ScholarDigital LibraryDigital Library
  81. Alon Orlitsky and Ananda Theertha Suresh. 2015. Competitive distribution estimation: Why is Good-Turing good. In Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS’15). 2143--2151. Google ScholarGoogle ScholarDigital LibraryDigital Library
  82. Alon Orlitsky, Ananda Theertha Suresh, and Yihong Wu. 2016. Optimal prediction of the number of unseen species. Proceedings of the National Academy of Sciences 113, 47 (2016), 13283--13288.Google ScholarGoogle ScholarCross RefCross Ref
  83. Michael W. Palmer. 1991. Estimating species richness: The second-order jackknife reconsidered. Ecology 72, 4 (1991), 1512--1513.Google ScholarGoogle ScholarCross RefCross Ref
  84. Van-Thuan Pham, Marcel Böhme, and Abhik Roychoudhury. 2016. Model-based whitebox fuzzing for program binaries. In Proceedings of the 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE’16). 543--553. Google ScholarGoogle ScholarDigital LibraryDigital Library
  85. E. C. Pielou. 1966. Species-diversity and pattern-diversity in the study of ecological succession. Journal of Theoretical Biology 10, 2 (1966), 370--383.Google ScholarGoogle ScholarCross RefCross Ref
  86. F. W. Preston. 1948. The commonness, and rarity, of species. Ecology 29, 3 (1948), 254--283.Google ScholarGoogle ScholarCross RefCross Ref
  87. Dawei Qi, Hoang D. T. Nguyen, and Abhik Roychoudhury. 2013. Path exploration based on symbolic output. ACM Transactions on Software Engineering and Methodology 22, 4 (Oct. 2013), 32:1--32:41. Google ScholarGoogle ScholarDigital LibraryDigital Library
  88. Sushma Reddy and Liliana M. Dávalos. 2003. Geographical sampling bias and its implications for conservation priorities in Africa. Journal of Biogeography 30, 11 (2003), 1719--1727.Google ScholarGoogle ScholarCross RefCross Ref
  89. Herbert E. Robbins. 1968. Estimating the total probability of the unobserved outcomes of an experiment. Annals of Mathematical Statistics 39, 1 (1968), 256--257.Google ScholarGoogle ScholarCross RefCross Ref
  90. Mark D. Robinson, Davis J. McCarthy, and Gordon K. Smyth. 2010. edgeR: A bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 1 (2010), 139--140. Google ScholarGoogle ScholarDigital LibraryDigital Library
  91. Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitry Vyukov. 2012. AddressSanitizer: A fast address sanity checker. In Proceedings of the 2012 USENIX Conference on Annual Technical Conference (USENIX ATC’12). 28--28. Google ScholarGoogle ScholarDigital LibraryDigital Library
  92. Tsung-Jen Shen, Anne Chao, and Chih-Feng Lin. 2003. Predicting the number of new species in further taxonomic sampling. Ecology 84, 3 (2003), 798--804.Google ScholarGoogle ScholarCross RefCross Ref
  93. Andrew R. Solow and Stephen Polasky. 1999. A quick estimator for taxonomic surveys. Ecology 80, 8 (1999), 2799--2803.Google ScholarGoogle ScholarCross RefCross Ref
  94. Evgeniy Stepanov and Konstantin Serebryany. 2015. MemorySanitizer: Fast detector of uninitialized memory use in C++. In Proceedings of the 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO’15). 46--55. Google ScholarGoogle ScholarCross RefCross Ref
  95. A. B. Wagner, P. Viswanath, and S. R. Kulkarni. 2006. Strong consistency of the good-turing estimator. In Proceedings of the 2006 IEEE International Symposium on Information Theory. 2526--2530.Google ScholarGoogle Scholar
  96. Bruno A. Walther and Joslin L. Moore. 2005. The concepts of bias, precision and accuracy, and their use in testing the performance of species richness estimators, with a literature review of estimator performance. Ecography 28, 6 (2005), 815--829.Google ScholarGoogle ScholarCross RefCross Ref
  97. Bruno A. Walther and Joslin L. Moore. 2005. The concepts of bias, precision and accuracy, and their use in testing the performance of species richness estimators, with a literature review of estimator performance. Ecography 28, 6 (2005), 815--829.Google ScholarGoogle ScholarCross RefCross Ref
  98. Ponemon Institute. 2017. Ponemon Cost of Cyber Crime Study. Retrieved from https://www.accenture.com/us-en/insight-cost-of-cybercrime-2017. Accessed 11-13-2017.Google ScholarGoogle Scholar
  99. IEEE Computer Society. 2013. Lockheed Martin Webinar Series: Michael Whalen on the Future of Verification and Validation. Retrieved from https://www.computer.org/cms/Computer.org/webinars/lmco/012413Slides-Whalen.pdf. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  100. Michał Zalewski. 2017. AFL: American Fuzzy Lop Fuzzer. Retrieved from http://lcamtuf.coredump.cx/afl/technical_details.txt. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  101. Michał Zalewski. 2017. AFL: Pulling Jpegs Out of Thin Air, Michael Zalewski. Retrieved from https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  102. CERT Division. 2017. CERT Triage Tools. Retrieved from https://www.cert.org/vulnerability-analysis/tools/triage.cfm?. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  103. The LLVM Compiler Infrastructure Project. 2017. Clang Compiler Documentation. Retrieved from https://clang.llvm.org/docs/index.html. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  104. US Defense Advanced Research Projects Agency. 2017. DARPA Cyber Grand Challenge. Retrieved from http://www.darpa.mil/news-events/2016-08-04. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  105. Facebook. 2017. Facebook: Mark Harman on Software Engineering at Facebook Scale. Retrieved from https://research.fb.com/mark-harmon-on-software-engineering-at-facebook-scale/. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  106. FFMPEG. 2017. FFMPEG: A Complete, Cross-Platform Solution to Record, Convert and Stream Audio and Video. Retrieved from https://www.ffmpeg.org/. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  107. GNU. 2017. GCov: Coverage Testing Tool. Retrieved from https://linux.die.net/man/1/gcov. Accessed: 11-13-2017.Google ScholarGoogle Scholar
  108. GNU. 2017. GNU GCC Sanitizer Options. Retrieved from https://gcc.gnu.org/onlinedocs/gcc-6.3.0/gcc/Instrumentation-Options.html#index-fsanitize_003daddress-947. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  109. Anne Chao, K. H. Ma, and T. C. Hsieh. 2017. iNext Online: Species iNterpolation and EXTrapolation. Retrieved from https://chao.shinyapps.io/iNEXTOnline/. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  110. Niels Lohmann. 2017. JSON for Modern C++. Retrieved from https://github.com/nlohmann/json. Accessed: 11-13-2017.Google ScholarGoogle Scholar
  111. The LLVM Compiler Infrastructure Project. 2017. LibFuzzer: A Library for Coverage-Guided Fuzz Testing. Retrieved from http://llvm.org/docs/LibFuzzer.html. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  112. The Libjpeg-turbo Project. 2017. libjpeg-turbo Is a JPEG Image Codec to Accelerate Baseline JPEG Compression and Decompression. Retrieved from http://libjpeg-turbo.virtualgl.org/. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  113. The GNOME Project. 2017. LibXML2: The XML C Parser and Toolkit of Gnome. Retrieved from http://xmlsoft.org/. Accessed: 11-13-2017.Google ScholarGoogle Scholar
  114. Microsoft. 2017. Microsoft: Project Springfield. Retrieved from https://www.microsoft.com/Springfield/. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  115. Google. 2017. Monkey: Android Random Testing. Retrieved from http://developer.android.com/tools/help/monkey.html. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  116. Mozilla. 2017. Mozilla: Fuzzing Firefox with Peach.Retrieved from https://wiki.mozilla.org/Security/Fuzzing/Peach. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  117. The OpenSSL Project. 2017. OpenSSL: A Toolkit for the Transport Layer Security (TLS) and Secure Sockets Layer (SSL) Protocols. Retrieved from https://www.openssl.org/. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  118. Google. 2017. OSS-Fuzz: Five Months Later. Retrieved from https://testing.googleblog.com/2017/05/oss-fuzz-five-months-later-and.html. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  119. Peach Fuzzer. 2017. Peach Fuzzer Platform. Retieved from http://www.peachfuzzer.com/products/peach-platform/. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  120. Anne Chao, K. H. Ma, T. C. Hsieh, and Chun-Huo Chiu. 2017. SpadeR Online: Species-Richness Prediction and Diversity Estimation in R. Retrieved from https://chao.shinyapps.io/SpadeR/. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  121. Eric Archer. 2017. sprex: Calculate Species Richness and Extrapolation Metrics. Retrieved from https://cran.r-project.org/web/packages/sprex/. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  122. Google. 2017. Syzkaller: Coverage-Guided Kernel Fuzzing. Retrieved from https://github.com/google/syzkaller. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  123. The Wireshark team. 2017. Wireshark Is the World’s Foremost and Widely-Used Network Protocol Analyzer. Retrieved from https://www.wireshark.org/. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  124. Sean Eron Anderson. 2017. Bithacks: Implementing Logarithm Efficiently. Retrieved from https://graphics.stanford.edu/ seander/bithacks.html#IntegerLogFloat. Accessed: 11-13-2017.Google ScholarGoogle Scholar
  125. Cloudflare. 2017. Incident Report on Memory Leak Caused by Cloudflare Parser Bug. Retrieved from https://blog.cloudflare.com/incident-report-on-memory-leak-caused-by-cloudflare-parser-bug/. Accessed: 11-13-2017.Google ScholarGoogle Scholar
  126. Haseeb Qureshi. 2017. Medium: A Hacker Stole USD 31M of Ether. Retrieved from https://medium.freecodecamp.org/a-hacker-stole-31m-of-ether-how-it-happened-and-what-it-means-for-ethereum-9e5dc29e33ce. Accessed: 05-13-2017.Google ScholarGoogle Scholar
  127. Damien Gayle, Alexandra Topping, Ian Sample, Sarah Marsh, and Vikram Dodd. 2017. NHS Seeks to Recover from Global Cyber-Attack as Security Concerns Resurface. Retrieved from https://www.theguardian.com/society/2017/may/12/hospitals-across-england-hit-by-large-scale-cyber-attack. Accessed: 11-13-2017.Google ScholarGoogle Scholar
  128. Elaine J. Weyuker. 1982. On testing non-testable programs. Computer Journal 25, 4 (1982), 465--470.Google ScholarGoogle ScholarCross RefCross Ref
  129. E. J. Weyuker and B. Jeng. 1991. Analyzing partition testing strategies. IEEE Transactions on Software Engineering 17, 7 (July 1991), 703--711. Google ScholarGoogle ScholarDigital LibraryDigital Library
  130. B. Yang and M. Xie. 2000. A study of operational and testing reliability in software reliability analysis. Reliability Engineering 8 System Safety 70, 3 (2000), 323--329.Google ScholarGoogle Scholar
  131. Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and understanding bugs in C compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’11). 283--294. Google ScholarGoogle ScholarDigital LibraryDigital Library
  132. Cun-Hui Zhang and Zhiyi Zhang. 2009. Asymptotic normality of a nonparametric estimator of sample coverage. Annals of Statistics 37, 5A (10 2009), 2582--2595.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. STADS: Software Testing as Species Discovery

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Transactions on Software Engineering and Methodology
        ACM Transactions on Software Engineering and Methodology  Volume 27, Issue 2
        April 2018
        186 pages
        ISSN:1049-331X
        EISSN:1557-7392
        DOI:10.1145/3234930
        Issue’s Table of Contents

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 27 June 2018
        • Revised: 1 April 2018
        • Accepted: 1 April 2018
        • Received: 1 August 2017
        Published in tosem Volume 27, Issue 2

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article
        • Research
        • Refereed

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader