DOI: 10.1145/1366230.1366258
research-article

Credit-based dynamic reliability management using online wearout detection

Published: 05 May 2008

ABSTRACT

As circuit geometries continue to shrink while supply voltages remain relatively constant, circuit wearout becomes a concern. We propose that the relative reliability of a processor's circuits be exposed to the operating system and managed by a credit-based wearout monitor. This wearout monitor receives dynamic updates on circuit reliability from stability detector circuits that are small enough to be widely deployed. We find that, through the combined use of the wearout monitor and stability detectors, we can efficiently and accurately manage the reliability of a processor and recoup the performance that would otherwise be lost when processors are over-provisioned to meet an expected lifetime. We simulate a 16-core DSP with a wearout monitor and stability detectors on a mix of four different media algorithms. Using the wearout monitor and stability detectors, we find that by reducing average performance by only 5%, we can increase the lifetime of the processor by 46%.
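
The abstract does not specify how credits are allocated or spent, so the following is only a toy sketch of the general idea of a credit-based wearout monitor, not the authors' design. It assumes, purely for illustration, that each core holds a fixed credit budget, that a high-performance operating point costs more credits than a low one, and that a stability-detector reading in the range 0 to 1 scales the cost upward for more worn cores. The class name, method names, and cost values are all hypothetical.

```python
class CreditWearoutMonitor:
    """Toy credit-based wearout monitor (illustrative assumptions only)."""

    def __init__(self, num_cores, budget, high_cost=2.0, low_cost=1.0):
        # Hypothetical policy: every core starts with the same credit budget.
        self.credits = [float(budget)] * num_cores
        # Latest wear estimate per core, as a stability detector might report it.
        self.wear = [0.0] * num_cores
        self.high_cost = high_cost
        self.low_cost = low_cost

    def report_wear(self, core, wear):
        """Record a detector reading in [0, 1] for one core (assumed scale)."""
        if not 0.0 <= wear <= 1.0:
            raise ValueError("wear must be between 0 and 1")
        self.wear[core] = wear

    def request(self, core, want_high):
        """Return the operating point granted to a core: 'high', 'low', or 'idle'."""
        # Assumed rule: a more worn core pays proportionally more credits.
        scale = 1.0 + self.wear[core]
        if want_high and self.credits[core] >= self.high_cost * scale:
            self.credits[core] -= self.high_cost * scale
            return "high"
        if self.credits[core] >= self.low_cost * scale:
            self.credits[core] -= self.low_cost * scale
            return "low"
        # Not enough credits left for any work in this interval.
        return "idle"


monitor = CreditWearoutMonitor(num_cores=2, budget=5.0)
first = monitor.request(0, want_high=True)    # spends 2.0 credits
monitor.report_wear(0, 0.5)                   # detector reports increased wear
second = monitor.request(0, want_high=True)   # now costs 3.0 credits
third = monitor.request(0, want_high=True)    # budget exhausted
```

In this sketch an operating system scheduler would call `request` before each scheduling interval and could move work away from cores that return "idle"; a real policy would also need to replenish credits over time, which is omitted here.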

Published in

CF '08: Proceedings of the 5th conference on Computing frontiers
May 2008
334 pages
ISBN: 9781605580777
DOI: 10.1145/1366230

Copyright © 2008 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 240 of 680 submissions, 35%
