DOI: 10.1145/3120895.3120916
research-article

FPGA Accelerated NoC-Simulation: A Case Study on the Intel Xeon Phi Ringbus Topology

Published: 07 June 2017

ABSTRACT

Complex signal processing algorithms targeting architectures with increasingly high numbers of parallel processing units require high-performance core interconnections (i.e., low latency, high throughput, and no pinch-offs or bottlenecks). Developing chips with massively parallel processing cores therefore calls for supporting techniques that explore the characteristics of diverse topologies of both common and innovative Networks-on-Chip (NoCs). In contrast to analytic NoC models, event-driven NoC simulations can handle even complex task graphs, but they suffer from long simulation times. To make the simulation of such complex task graphs feasible, this work proposes FPGA-accelerated simulation. We extend such a simulator to imitate cache-coherence communication behavior, and we present a translation of real measured profiles into task graphs for in-depth simulation of the communication behavior of an existing NoC-based manycore. The approach can thus not only handle synthetic scenarios but also analyze the communication behavior of real-world applications. A simulation of the Histograms of Oriented Gradients algorithm running on the Intel Xeon Phi manycore, which features a 70-stop ring bus, exemplifies the approach.
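
To give a feel for the scale of a 70-stop ring, the following minimal sketch computes shortest-path hop counts between ring stops. It is an illustration only, not the simulator described in the abstract: it assumes a bidirectional ring in which a message takes the shorter direction, and the function names `ring_hops` and `mean_hops` are hypothetical. It ignores contention, cache-coherence traffic, and all timing.

```python
def ring_hops(src: int, dst: int, stops: int = 70) -> int:
    """Shortest hop count between two stops on a ring.

    Assumption: the ring is bidirectional, so a message may travel
    in whichever direction is shorter.
    """
    clockwise = (dst - src) % stops
    return min(clockwise, stops - clockwise)


def mean_hops(stops: int = 70) -> float:
    """Average shortest hop count from one stop to every other stop.

    By symmetry of the ring, the result is the same for any source stop,
    so stop 0 is used as the reference.
    """
    total = sum(ring_hops(0, dst, stops) for dst in range(1, stops))
    return total / (stops - 1)


if __name__ == "__main__":
    print(ring_hops(0, 69))  # neighbouring stops across the wrap-around: 1 hop
    print(ring_hops(0, 35))  # the opposite stop is the worst case: 35 hops
    print(mean_hops())
```

Under these assumptions the worst-case distance is half the ring (35 hops). A hop-count estimate of this kind says nothing about congestion, which is the sort of effect a task-graph-driven simulation is meant to capture.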


  • Published in

    HEART '17: Proceedings of the 8th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies
    June 2017
    172 pages
    ISBN: 9781450353168
    DOI: 10.1145/3120895

    Copyright © 2017 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 22 of 50 submissions, 44%