DOI: 10.1145/2968455.2968517
research-article

Runtime management of adaptive MPSoCs for graceful degradation

Published: 01 October 2016

Abstract

In this paper we propose optimization algorithms for the runtime management of gracefully degradable adaptive MPSoCs. Assuring the reliability of all hardware components in a system is becoming increasingly difficult. In addition to growing defect densities and the rising complexity of conventional testing, wear-out effects may reduce the availability of on-chip resources during the system lifetime. However, the adaptability of modern MPSoCs can provide the means for permanent-fault tolerance and graceful degradation via runtime system management. We have developed custom heuristics and tailored existing optimization techniques (simulated annealing and a genetic algorithm) to deliver a fast and efficient response to the unpredictable loss of system resources. We have emulated the resulting runtime manager on the Intel Single-Chip Cloud Computer (SCC), an experimental chip multiprocessor developed by Intel Labs. A comparison of the algorithms in terms of solution quality, response time, and the scaling of response time with problem input size indicates that our custom heuristics are faster by at least one order of magnitude, whereas simulated annealing and the genetic algorithm deal more consistently with constraints on the allowed solutions, e.g. limited system reconfiguration time. All algorithms scale well: in almost every case, their response time grows sub-linearly with the input size.
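To make the idea of remapping after a resource loss more concrete, the following is a minimal, generic simulated-annealing sketch. It is not the authors' runtime manager: the cost function (load of the busiest surviving core), the neighbour move, the cooling schedule, and all names (`max_core_load`, `anneal_remap`, `task_load`, `cores`) are assumptions chosen purely for illustration, and constraints such as reconfiguration time are not modelled.

```python
import math
import random


def max_core_load(mapping, task_load, cores):
    """Illustrative cost: load of the busiest surviving core (lower is better)."""
    load = {c: 0.0 for c in cores}
    for task, core in mapping.items():
        load[core] += task_load[task]
    return max(load.values())


def anneal_remap(task_load, cores, steps=2000, t0=1.0, alpha=0.995, seed=0):
    """Generic simulated annealing over task-to-core mappings (hypothetical setup)."""
    rng = random.Random(seed)
    tasks = sorted(task_load)
    # Start from a random mapping onto the surviving cores.
    current = {t: rng.choice(cores) for t in tasks}
    cost = max_core_load(current, task_load, cores)
    best, best_cost = dict(current), cost
    temp = t0
    for _ in range(steps):
        # Neighbour move: reassign one random task to a random surviving core.
        task = rng.choice(tasks)
        old_core = current[task]
        current[task] = rng.choice(cores)
        new_cost = max_core_load(current, task_load, cores)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(current), cost
        else:
            current[task] = old_core  # reject: undo the move
        temp *= alpha  # geometric cooling
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical scenario: core 1 has failed, cores 0, 2 and 3 survive.
    loads = {"a": 4.0, "b": 3.0, "c": 3.0, "d": 2.0}
    mapping, cost = anneal_remap(loads, cores=[0, 2, 3])
    print(mapping, cost)
```

The fixed `steps` budget bounds the response time of this sketch, which is one simple way to trade solution quality for speed; a real runtime manager would also need to account for the cost of migrating tasks.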

Cited By

  • (2017) A Survey and Comparative Study of Hard and Soft Real-Time Dynamic Resource Allocation Strategies for Multi-/Many-Core Systems. ACM Computing Surveys 50:2 (1-40). DOI: 10.1145/3057267. Online publication date: 11-Apr-2017
  • (2017) Literature Survey on System-Level Optimizations Techniques. Reliable and Energy Efficient Streaming Multiprocessor Systems (33-44). DOI: 10.1007/978-3-319-69374-3_3. Online publication date: 4-Nov-2017

Published In

CASES '16: Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems
October 2016
187 pages
ISBN:9781450344821
DOI:10.1145/2968455

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

ESWEEK'16: Twelfth Embedded System Week
October 1 - 7, 2016
Pittsburgh, Pennsylvania

Acceptance Rates

Overall Acceptance Rate 52 of 230 submissions, 23%


