Abstract
Nanophotonics is a promising solution for on-chip interconnection thanks to its intrinsically low latency and low power, which can improve both performance and energy efficiency in future Chip Multi-Processors (CMPs).
This article proposes a novel arbitrated all-optical path-setup scheme for tiled CMPs that adopt circuit-switched optical networks. The scheme aims to significantly reduce path-setup latency and overall energy consumption. It configures multiple photonic switches simultaneously rather than sequentially, as is done in state-of-the-art proposals. This fast optical path setup reduces the overhead of each transmission and, most importantly, allows optical circuit-switched networks to effectively serve cache coherence traffic, which is mainly composed of relatively small messages.
Specifically, we first propose a single-arbiter scheme in which a central module manages the whole topology and handles all path-setup procedures. Then, to tackle scalability, we propose a logically clustered architecture (multi-arbiter) in which each logical core cluster has its own arbiter, and an ad hoc distributed reservation protocol coordinates the arbiters to manage inter-cluster path reservations.
We show that our proposed single-arbiter architecture outperforms a state-of-the-art optical network with sequential path setup (optical baseline) in 8- and 16-core tiled CMP setups. However, due to serialization issues, the single-arbiter solution cannot compete with a reference electronic baseline in the larger 32- and 64-core setups, although it still performs much better than the optical baseline. Conversely, our hierarchical multi-arbiter solution improves performance by up to almost 20% and 40% for 32- and 64-core setups, respectively, demonstrating the wide applicability of the proposed technique.
In terms of energy, the analyzed solutions enable significant savings compared to both the optical baseline with sequential path setup and its electronic counterpart. Specifically, results show an average improvement of more than 25% for the single-arbiter in the 8- and 16-core cases, and savings of more than 40% and 15% for the multi-arbiter in the 32- and 64-core cases, respectively.
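As a rough illustration of the single-arbiter idea, the following minimal sketch models a central arbiter that reserves every link of a path in one step, instead of hop by hop. It is not the article's implementation: the 2D-mesh coordinates, the dimension-ordered (X then Y) routing, and the names `xy_path` and `SingleArbiter` are all assumptions made only for this example.

```python
from typing import Dict, List, Optional, Tuple

Node = Tuple[int, int]    # (x, y) tile coordinate; a mesh layout is assumed
Link = Tuple[Node, Node]  # directed link between two adjacent tiles


def xy_path(src: Node, dst: Node) -> List[Link]:
    """Return the directed links of an X-then-Y route (assumed routing policy)."""
    links: List[Link] = []
    x, y = src
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


class SingleArbiter:
    """Hypothetical central arbiter: grants a whole path at once or not at all."""

    def __init__(self) -> None:
        self.owner: Dict[Link, int] = {}  # link -> id of the circuit holding it
        self.next_id = 0

    def request(self, src: Node, dst: Node) -> Optional[int]:
        links = xy_path(src, dst)
        # Global view: check every link of the path before touching any state.
        if any(link in self.owner for link in links):
            return None  # conflict; the requester would have to retry later
        circuit_id = self.next_id
        self.next_id += 1
        # All links (i.e. all switches on the path) are reserved together.
        for link in links:
            self.owner[link] = circuit_id
        return circuit_id

    def release(self, circuit_id: int) -> None:
        """Tear down a circuit and free all of its links."""
        self.owner = {l: o for l, o in self.owner.items() if o != circuit_id}


arbiter = SingleArbiter()
first = arbiter.request((0, 0), (2, 1))   # granted
second = arbiter.request((1, 0), (2, 0))  # shares link (1,0)->(2,0): rejected
third = arbiter.request((0, 1), (2, 1))   # disjoint links: granted
```

Because the arbiter sees the state of the whole network, a request is resolved with a single decision. The same property also means every request passes through one module, which is consistent with the serialization issue the abstract reports for larger core counts.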
Index Terms
- Scalable Path-Setup Scheme for All-Optical Dynamic Circuit Switched NoCs in Cache Coherent CMPs