research-article
Open Access

Tomahawk: Parallelism and heterogeneity in communications signal processing MPSoCs

Published: 28 March 2014

Abstract

Heterogeneity and parallelism in MPSoCs for 4G (and beyond) communications signal processing are inevitable in order to meet stringent power constraints and performance requirements. The question arises of how to cope with the problems of system programmability and runtime management posed by a number and type of processing elements that vary statically or even dynamically. This work addresses this challenge by proposing the concept of a heterogeneous many-core platform called Tomahawk. In addition to defining the system architecture, the approach proposes a unified framework comprising a model of computation, a programming interface, and a dedicated runtime management unit called CoreManager. Growing system complexity, in terms of application parallelism and number of resources, may dramatically increase management costs and thereby degrade performance. For this reason, the efficient implementation of the CoreManager becomes a major issue in system design. This work compares the performance and capabilities of various CoreManager HW/SW solutions based on the ASIC, RISC, and ASIP paradigms. The results demonstrate that the proposed ASIP-based solution approaches the performance of the ASIC realization while preserving the full flexibility of the software (RISC-based) implementation.

