
Truth in SPEC benchmarks

Published: 15 December 1995

Abstract

The System Performance Evaluation Cooperative (SPEC) benchmarks are a set of integer and floating-point programs that are intended to be “effective and fair in comparing the performance of high performance computing systems”. SPEC ratings are often quoted in company advertising and have been trusted as the de facto measure of comparison for computer systems. Recently, there has been some concern regarding the fairness and the value of these benchmarks for comparing computer systems.

In this paper we investigate the following two questions regarding the SPEC92 benchmark suite: 1) How sensitive are the SPEC ratings to various tunings? 2) How reproducible are the published results? For six vendors, we compare the published SPECpeak and SPECbase ratings, and observe an 11% average improvement in the SPECpeak ratings due to changes in the compiler flags alone. In our own attempt to reproduce the published SPEC ratings, we came across various “explicit” and “hidden” tuning parameters that we consider unrealistic. We suggest a new unit called SPECsimple that requires using only the -O compiler optimization flag, shared libraries, and a standard system configuration. SPECsimple is designed to better match the performance experienced by a typical user. Our measured SPECsimple ratings are 65-86% of the advertised SPECpeak performance. We conclude this paper by citing cases of compiler optimizations specifically designed for SPEC programs, in which performance decreases drastically or the computed results are incorrect if the compiled program does not exactly match the SPEC benchmark program. These findings show that the fairness and value of the popular SPEC benchmarks are questionable.
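The abstract does not spell out how the aggregate ratings are computed. As a rough, hypothetical sketch (assuming the SPEC92 convention of reporting the geometric mean of per-benchmark runtime ratios against a reference machine), the arithmetic behind a 65-86% SPECsimple-to-SPECpeak gap could look like the Python below; the benchmark count and all runtimes are invented for illustration and are not measurements from the paper.

# Illustrative sketch only (not from the paper): a SPEC92-style rating is
# taken here as the geometric mean of per-benchmark SPECratios, where each
# SPECratio is reference-machine runtime divided by measured runtime.
from math import prod

def spec_rating(ref_times, measured_times):
    """Geometric mean of reference-time / measured-time ratios."""
    ratios = [r / m for r, m in zip(ref_times, measured_times)]
    return prod(ratios) ** (1.0 / len(ratios))

# Hypothetical reference-machine runtimes (seconds) for three benchmarks.
ref = [100.0, 200.0, 150.0]

# Hypothetical runtimes: vendor-tuned "peak" binaries vs. a plain -O build
# with shared libraries and a stock system configuration.
peak_times   = [2.0, 3.8, 2.9]
simple_times = [2.6, 4.9, 3.6]

peak   = spec_rating(ref, peak_times)
simple = spec_rating(ref, simple_times)

print(f"SPECpeak-style rating:   {peak:.1f}")
print(f"SPECsimple-style rating: {simple:.1f}")
print(f"SPECsimple / SPECpeak:   {simple / peak:.0%}")

The last line is the quantity the abstract reports as 65-86%; with these invented runtimes it comes out to roughly 78%.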

