ABSTRACT
As silicon technology scales, modern processors and embedded systems are rapidly shifting towards complex chip multi-processor (CMP) and system-on-chip (SoC) designs, comprising several processor cores and IP components communicating via a network-on-chip (NoC). As a side-effect of this trend, ensuring their correctness has become increasingly problematic. In particular, the network-on-chip often includes complex features and components to support the required communication bandwidth among the nodes in the system. In this landscape, it is no wonder that design errors in the NoC may go undetected and escape into the final silicon, with potential detrimental impact on the overall system.
In this work, we propose ForEVeR, a solution that complements the use of formal methods and runtime verification to ensure functional correctness in NoCs. Formal verification, due to its scalability limitations, is used to verify the smaller modules, such as individual router components. We complete the protection against escaped design errors with a runtime technique, a network-level error detection and recovery solution, which monitors the traffic in the NoC and protects it against escaped functional bugs that affect the communication paths in the network. To this end, ForEVeR augments the baseline NoC with a lightweight checker network that alerts destination nodes of incoming packets ahead of time. If a bug is detected, flagged by missed packet arrivals, a recovery mechanism delivers the in-flight data safely to the intended destination via the checker network. ForEVeR's experimental evaluation shows that it can recover from NoC design errors at only 4.8% area cost for an 8x8 mesh interconnect, with a recovery performance cost of less than 30K cycles per functional bug manifestation. Additionally, it incurs no performance overhead in the absence of errors.
- R. Abdel-Khalek, R. Parikh, A. DeOrio, and V. Bertacco. Functional correctness for CMP interconnects. In Proc. ICCD, 2011. Google ScholarDigital Library
- K. V. Anjan and T. M. Pinkston. An efficient, fully adaptive deadlock recovery scheme: DISHA. In Proc. ISCA, 1995. Google ScholarDigital Library
- T. M. Austin. DIVA: A reliable substrate for deep submicron microarchitecture design. In Proc. MICRO, 1999. Google ScholarDigital Library
- A. A. Bayazit and S. Malik. Complementary use of runtime validation and model checking. In Proc. ICCAD, 2005. Google ScholarDigital Library
- C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: Characterization and architectural implications. In Proc. PACT, October 2008. Google ScholarDigital Library
- D. Borrione, A. Helmy, L. Pierre, and J. Schmaltz. A generic model for formally verifying noc communication architectures: A case study. In Proc. NoCs, 2007. Google ScholarDigital Library
- M. Boule, J.-S. Chenard, and Z. Zilic. Assertion checkers in verification, silicon debug and in-field diagnosis. In Proc. ISQED, 2007. Google ScholarDigital Library
- W. Dally and B. Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann Publishers Inc., 2003. Google ScholarDigital Library
- H. Foster, L. Loh, B. Rabii, and V. Singhal. Guidelines for creating a formal verification testplan. In Proc. DVCon, 2006.Google Scholar
- J. Kim and H. Kim. Router microarchitecture and scalability of ring topology in on-chip networks. In Proc. NoCArc, 2009. Google ScholarDigital Library
- P. Lopez, J. M. Martinez, and J. Duato. A very efficient distributed deadlock detection mechanism for wormhole networks. In Proc. HPCA, 1998. Google ScholarDigital Library
- J. M. Martínez, et al. Software-based deadlock recovery technique for true fully adaptive routing in wormhole networks. In Proc. ICPP, 1997. Google ScholarDigital Library
- A. Meixner, M. E. Bauer, and D. Sorin. Argus: Low-cost, comprehensive error detection in simple cores. In Proc. MICRO, 2007. Google ScholarDigital Library
- S. Murali, T. Theocharides, N. Vijaykrishnan, M. Irwin, L. Benini, and G. De Micheli. Analysis of error recovery schemes for networks on chips. IEEE Design & Test, 22(5), 2005. Google ScholarDigital Library
- G. Nychis, C. Fallin, T. Moscibroda, and O. Mutlu. Next generation on-chip networks: what kind of congestion control do we need? In Proc. Hotnets, 2010. Google ScholarDigital Library
- D. Park, C. Nicopoulos, J. Kim, N. Vijaykrishnan, and C. R. Das. Exploring fault-tolerant network-on-chip architectures. In Proc. DSN, 2006. Google ScholarDigital Library
- D. Starobinski, M. Karpovsky, and L. A. Zakrevski. Application of network calculus to general topologies using turn-prohibition. IEEE/ACM Trans. Networks, 11(3), 2003. Google ScholarDigital Library
- Synopsys. Synopsys Magellan. http://www.synopsys.com.Google Scholar
- I. Wagner and V. Bertacco. Engineering trust with semantic guardians. In Proc. DATE, 2007. Google ScholarDigital Library
- Formally enhanced runtime verification to ensure NoC functional correctness
Recommendations
ForEVeR: A complementary formal and runtime verification approach to correct NoC functionality
Special Issue on Design Challenges for Many-Core Processors, Special Section on ESTIMedia'13 and Regular PapersAs silicon technology scales, modern processor and embedded systems are rapidly shifting towards complex chip multi-processor (CMP) and system-on-chip (SoC) designs. As a side effect of complexity of these designs, ensuring their correctness has become ...
Using NoC routers as processing elements
SBCCI '09: Proceedings of the 22nd Annual Symposium on Integrated Circuits and System Design: Chip on the DunesThe integration technology has increased at the point where the development of Multi-Core processor architectures is a market reality nowadays. In this scenario, the interconnection network has a critical function when the number of cores increases, ...
MIRA: A Multi-layered On-Chip Interconnect Router Architecture
Recently, Network-on-Chip (NoC) architectures have gained popularity to address the interconnect delay problem for designing CMP / multi-core / SoC systems in deep sub-micron technology. However, almost all prior studies have focused on 2D NoC designs. ...
Comments