ABSTRACT
Separable allocators in on-chip routers perform switch allocation in two stages. The stages often make uncoordinated decisions, which results in sub-optimal switch allocation. We propose Virtual Input Crossbars (VIX), in which more than one virtual channel (VC) of an input port is connected to the crossbar. VIX improves switch allocation by allowing more than one input VC of an input port to transmit flits in the same cycle. In addition, more input VCs can participate in output arbitration, which reduces the chance of uncoordinated decisions. VIX improves network throughput by more than 15% for the topologies studied without affecting the router critical path.
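The effect described above can be illustrated with a minimal Python sketch of an input-first separable allocator. It is not the paper's implementation: the function name `separable_allocate`, the interleaved grouping of VCs onto crossbar inputs, and the fixed-priority arbiters (lowest index wins) are all assumptions chosen to keep the example deterministic; real routers typically use round-robin or similar arbiters.

```python
from typing import Dict, List, Optional, Tuple

# requests[port][vc] is the output port that VC wants, or None if it is idle.
Requests = List[List[Optional[int]]]


def separable_allocate(requests: Requests,
                       virtual_inputs: int = 1) -> Dict[int, Tuple[int, int]]:
    """Hypothetical input-first separable switch allocator.

    virtual_inputs is the number of crossbar inputs per input port
    (1 models a conventional router, more than 1 models a VIX-style router).
    Returns a map from output port to the granted (input port, VC).
    """
    # Stage 1: input arbitration. The VCs of a port are split into
    # `virtual_inputs` groups (assumed interleaved: VC i goes to group
    # i % virtual_inputs). Each group forwards at most one request.
    forwarded: List[Tuple[int, int, int]] = []
    for port, vcs in enumerate(requests):
        for group in range(virtual_inputs):
            for vc in range(group, len(vcs), virtual_inputs):
                if vcs[vc] is not None:
                    forwarded.append((port, vc, vcs[vc]))
                    break  # fixed priority: lowest VC index in the group wins

    # Stage 2: output arbitration. Each output grants the first request
    # it sees, which is the one from the lowest-numbered input port.
    grants: Dict[int, Tuple[int, int]] = {}
    for port, vc, out in forwarded:
        grants.setdefault(out, (port, vc))
    return grants


# Two input ports with two VCs each. VC0 of both ports wants output 0,
# while the second VCs want outputs 1 and 2, which nobody else requests.
example: Requests = [[0, 1], [0, 2]]

baseline = separable_allocate(example, virtual_inputs=1)
vix = separable_allocate(example, virtual_inputs=2)
print(len(baseline), len(vix))
```

With one crossbar input per port, both input arbiters pick a VC bound for output 0, so only one flit crosses the switch even though outputs 1 and 2 are free. With two crossbar inputs per port, all four VCs reach output arbitration and three flits are granted in the same cycle.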