BenchPrime: Effective Building of a Hybrid Benchmark Suite

Authors:
Qingrui Liu, Virginia Tech, Kraft Drive, Blacksburg, VA
Xiaolong Wu, Virginia Tech, Kraft Drive, Blacksburg, VA
Larry Kittinger, Virginia Tech, Kraft Drive, Blacksburg, VA
Markus Levy, EEMBC
Changhee Jung, Virginia Tech, Kraft Drive, Blacksburg, VA

ACM Transactions on Embedded Computing Systems, Volume 16, Issue 5s, Article No. 179, pp. 1–22. https://doi.org/10.1145/3126499

Published: 27 September 2017

Abstract

This paper presents BenchPrime, an automated, systematic, and extensible toolset for analyzing the similarity and diversity of benchmark suites. BenchPrime takes multiple benchmark suites and their evaluation metrics as inputs and generates a hybrid benchmark suite comprising only essential applications. Unlike prior work, BenchPrime uses linear discriminant analysis rather than principal component analysis, and it selects the best clustering algorithm and the optimal number of clusters in an automated, metric-tailored way, thereby achieving high accuracy. In addition, BenchPrime ranks the benchmark suites by the diversity of their application sets and estimates how unique each suite is relative to the others.
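The subsetting idea can be illustrated with a minimal sketch. It is not BenchPrime's implementation: it assumes each application has already been projected to a low-dimensional feature vector (for example by linear discriminant analysis), uses a simple k-means as a stand-in clustering algorithm, and uses the mean silhouette score as a stand-in criterion for choosing the number of clusters. The application names, feature values, and function names are all hypothetical.

```python
import math

def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point seeding (stand-in algorithm)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            # Keep the old center if a cluster happens to become empty.
            new_centers.append(
                tuple(sum(x) / len(members) for x in zip(*members)) if members else centers[j]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def silhouette(points, labels):
    """Mean silhouette score; singleton clusters contribute 0."""
    clusters = set(labels)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, p in enumerate(points):
        same = [math.dist(p, q) for j, q in enumerate(points) if j != i and labels[j] == labels[i]]
        if not same:
            continue
        a = sum(same) / len(same)
        b = min(
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / len(points)

def build_hybrid_suite(apps, k_range=range(2, 6)):
    """Pick k by silhouette score, then keep the app nearest each cluster center."""
    names = list(apps)
    points = [apps[n] for n in names]
    best_score, best_k, best_fit = -2.0, None, None
    for k in k_range:
        labels, centers = kmeans(points, k)
        score = silhouette(points, labels)
        if score > best_score:
            best_score, best_k, best_fit = score, k, (labels, centers)
    labels, centers = best_fit
    suite = []
    for j, center in enumerate(centers):
        members = [n for n, l in zip(names, labels) if l == j]
        if members:
            suite.append(min(members, key=lambda n: math.dist(apps[n], center)))
    return suite, best_k

# Hypothetical 2-D features for apps drawn from three suites A, B, and C.
APPS = {
    "A/a1": (0.0, 0.0), "A/a2": (0.4, 0.0), "B/b1": (0.1, 0.1),
    "B/b2": (5.0, 5.0), "C/c1": (5.4, 5.0), "C/c2": (5.1, 5.1),
    "A/a3": (9.0, 0.0), "C/c3": (9.3, 0.1),
}

if __name__ == "__main__":
    suite, k = build_hybrid_suite(APPS)
    print(k, suite)
```

Keeping one representative per cluster is one simple way to retain "only essential applications"; a real toolset would compare several clustering algorithms and validation measures rather than fix one of each.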

As a case study, this work compares DenBench with MediaBench and MiBench for the first time, using four different metrics to provide a multi-dimensional understanding of the benchmark suites. For each metric, BenchPrime measures the degree to which DenBench applications cannot be replaced by those in MediaBench and MiBench. This provides a means of identifying an essential subset of the three benchmark suites without compromising the application balance of the full set. The experimental results show that the necessity of including DenBench applications varies across the target metrics and that significant redundancy exists among the three benchmark suites.
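One way to make "irreplaceable" concrete is sketched below. The abstract does not define the measure, so this uses an assumed definition: an application counts as irreplaceable under a metric if its cluster contains no application from any other suite. The application names, cluster assignments, and metric names are hypothetical placeholders, not data from the paper.

```python
from collections import defaultdict

def irreplaceable_fraction(cluster_of, suite_of, target):
    """Fraction of `target` suite apps whose cluster holds only `target` apps."""
    suites_in_cluster = defaultdict(set)
    for app, cluster in cluster_of.items():
        suites_in_cluster[cluster].add(suite_of[app])
    target_apps = [a for a in cluster_of if suite_of[a] == target]
    if not target_apps:
        return 0.0
    unique = [a for a in target_apps if suites_in_cluster[cluster_of[a]] == {target}]
    return len(unique) / len(target_apps)

# Hypothetical apps and their suites.
SUITE_OF = {
    "den1": "DenBench", "den2": "DenBench", "den3": "DenBench", "den4": "DenBench",
    "media1": "MediaBench", "media2": "MediaBench",
    "mi1": "MiBench", "mi2": "MiBench",
}

# Hypothetical cluster assignments, one clustering per evaluation metric.
CLUSTERS_BY_METRIC = {
    "metric_A": {"den1": 0, "den2": 0, "den3": 1, "den4": 2,
                 "media1": 1, "media2": 2, "mi1": 3, "mi2": 2},
    "metric_B": {"den1": 0, "den2": 1, "den3": 1, "den4": 1,
                 "media1": 1, "media2": 2, "mi1": 2, "mi2": 0},
}

if __name__ == "__main__":
    for metric, cluster_of in CLUSTERS_BY_METRIC.items():
        print(metric, irreplaceable_fraction(cluster_of, SUITE_OF, "DenBench"))
```

In this toy data the fraction differs between the two metrics, which mirrors the reported finding that the need for DenBench applications depends on the target metric.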

Index Terms

1. Computer systems organization
  1. Embedded and cyber-physical systems
    1. Embedded systems
      1. Embedded software
2. General and reference
  1. Cross-computing tools and techniques

Published in
ACM Transactions on Embedded Computing Systems Volume 16, Issue 5s
Special Issue ESWEEK 2017, CASES 2017, CODES + ISSS 2017 and EMSOFT 2017
October 2017
1448 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/3145508
Editor:
Sandeep K. Shukla
Indian Institute of Technology, India
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States

Publication History
- Published: 27 September 2017
- Accepted: 1 July 2017
- Revised: 1 June 2017
- Received: 1 April 2017

Author Tags
Benchmark
linear discriminant analysis
principal component analysis
Qualifiers
- research-article
- Research
- Refereed