Abstract
The growing number of units in today's systems-on-chip and multicore processors has led to complex intra-chip communication solutions. In particular, networks-on-chip (NoCs) have emerged as a favorable fabric for connecting many units on the same chip with high bandwidth and low latency. To achieve these goals, a NoC often includes complex components and advanced features, resulting in large and highly complex interconnect subsystems. One of the biggest challenges in these designs is ensuring that the communication infrastructure functions correctly. To this end, an increasing fraction of the validation effort has shifted to post-silicon validation, which can exercise network activity that is too complex to validate pre-silicon. However, post-silicon validation is hindered by the limited observability of the network's internal operations, which makes functional errors very difficult to diagnose during this phase.
In this work, we propose a post-silicon validation platform that improves the observability of network operations by taking periodic snapshots of the traffic traversing the network. Each node's local cache is configured to temporarily store the snapshot logs in a designated area that is reserved for post-silicon validation and relinquished after product release. A software algorithm running on each node's core analyzes that node's snapshot log locally to detect functional errors. When an error is detected, all snapshot logs are aggregated at a central location to extract additional debug data, including an overview of the network traffic surrounding the error event and a partial reconstruction of the routes followed by the packets in flight at that time. In our experiments, this approach detected several types of functional errors; on average, it observed over 50% of the network's traffic and reconstructed at least half of the route of each observed packet.
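The flow described above (local analysis of each snapshot log, followed by central aggregation and partial route reconstruction) can be illustrated with a minimal sketch. The abstract does not specify the log format or the checks, so everything here is an assumption made for illustration: the `Observation` record, the `local_check` heuristic (flagging a packet that stays visible at one node for longer than `max_age` cycles), and the use of cycle counts that are comparable across nodes.

```python
from collections import defaultdict
from typing import NamedTuple


class Observation(NamedTuple):
    """One sampled packet sighting in a node's snapshot log (hypothetical format)."""
    packet_id: int
    node: int    # router that recorded the snapshot
    cycle: int   # cycle at which the packet was seen (assumed globally comparable)


def local_check(log, max_age):
    """Hypothetical per-node check: return IDs of packets that remain visible
    at this node for more than max_age cycles (possible starvation or deadlock)."""
    first_seen = {}
    suspects = set()
    for obs in log:
        # Remember the first sighting of each packet at this node.
        first = first_seen.setdefault(obs.packet_id, obs.cycle)
        if obs.cycle - first > max_age:
            suspects.add(obs.packet_id)
    return suspects


def reconstruct_routes(logs):
    """Central aggregation: merge all per-node logs and order each packet's
    sightings by cycle. Routes are partial because snapshots are periodic
    and may miss some hops."""
    sightings = defaultdict(list)
    for log in logs:
        for obs in log:
            sightings[obs.packet_id].append((obs.cycle, obs.node))
    routes = {}
    for packet_id, seen in sightings.items():
        route = []
        for _, node in sorted(seen):
            # Collapse repeated sightings at the same node into one hop.
            if not route or route[-1] != node:
                route.append(node)
        routes[packet_id] = route
    return routes


# Example logs for three nodes; packet 7 was not sampled at node 1.
log_node0 = [Observation(7, 0, 10)]
log_node1 = [Observation(9, 1, 5), Observation(9, 1, 500)]
log_node2 = [Observation(7, 2, 30)]

suspects = local_check(log_node1, max_age=100)
routes = reconstruct_routes([log_node0, log_node1, log_node2])
```

In this sketch, packet 9 is flagged locally at node 1, and the aggregated logs recover only the sampled hops of packet 7, which mirrors the partial nature of the reconstruction described in the abstract.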
Index Terms
- Post-silicon platform for the functional diagnosis and debug of networks-on-chip