SoCDAL: System-on-chip design AcceLerator

Published: 06 February 2008

Abstract

Time-to-market pressure and the ever-growing design complexity of multiprocessor systems-on-chip demand an efficient design environment that enables fast exploration of a large design space. In this article, we introduce a new design environment, called SoCDAL, that accelerates multiprocessor system-on-chip design through fast design-space exploration targeting real-time multimedia systems. SoCDAL is a set of mostly automated tools covering system specification, hardware/software estimation, application-to-architecture mapping, simulation model generation, and system verification through simulation. For system specification, the process network model has been widely used because of its modeling capability. However, it is hard to use for real-time system design, since its behavior cannot be estimated statically. We introduce a new approach that enables a process network model to be analyzed statically under some restrictions. For hardware/software estimation, we analyze code statically. The application-to-architecture mapping process implements a novel algorithm that supports an arbitrary number of processors, with performance evaluated by static scheduling that takes communication behavior into account. Mapping results are used to automatically generate simulation models at several transaction levels, which are then passed to a commercial tool. We show the effectiveness of our approach with experimental results on multimedia applications such as JPEG, H.263, and H.264 encoders, as well as an H.264 decoder.
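
The abstract does not say how SoCDAL restricts a process network so that it can be analyzed statically. As an illustration only, and not the article's algorithm, the sketch below shows the textbook consistency check for a synchronous dataflow graph, a restricted model in which every process produces and consumes a fixed number of tokens per firing. Solving the balance equations gives the number of firings of each process in one iteration of a static schedule. The function name, the edge format, and the three-actor example are assumptions made for this sketch.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm needs Python 3.9+


def repetition_vector(actors, edges):
    """Solve the balance equations of a synchronous dataflow graph.

    actors: list of actor names (hypothetical, for illustration).
    edges:  list of (src, produced, dst, consumed) tuples, meaning src
            writes `produced` tokens per firing and dst reads `consumed`.
    Returns a dict mapping each actor to its firing count per iteration.
    """
    # Fix the first actor's relative rate to 1 and propagate along edges.
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, prod, dst, cons in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
                changed = True
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Every edge must balance: tokens produced == tokens consumed.
    for src, prod, dst, cons in edges:
        if rate[src] * prod != rate[dst] * cons:
            raise ValueError("inconsistent rates: no bounded static schedule")
    # Scale the rational rates to integers.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(rate[a] * scale) for a in actors}


# Hypothetical three-stage pipeline with mismatched token rates.
pipeline = [("src", 2, "dct", 3), ("dct", 1, "vlc", 2)]
print(repetition_vector(["src", "dct", "vlc"], pipeline))
```

When the balance equations have a solution, the firing counts are known before execution, so buffer sizes and worst-case timing can in principle be bounded at design time. A general process network offers no such guarantee, which is the difficulty the abstract refers to.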

Published In

ACM Transactions on Design Automation of Electronic Systems, Volume 13, Issue 1
January 2008
496 pages
ISSN:1084-4309
EISSN:1557-7309
DOI:10.1145/1297666
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 February 2008
Accepted: 01 July 2007
Received: 01 April 2007
Published in TODAES Volume 13, Issue 1

Author Tags

  1. Codesign
  2. application-to-architecture mapping
  3. design-space exploration
  4. multiprocessor system-on-chip
  5. process networks
  6. scheduling
  7. simulation
  8. specification
  9. static hardware/software estimation
  10. synchronous dataflow
  11. transaction-level model
  12. worst-case execution time

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 3
Reflects downloads up to 15 Feb 2025

Citations

Cited By

  • (2017) Autonomic Diffusive Load Balancing on Many-Core Architecture Using Simulated Annealing. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E100.A:8 (1640-1649). DOI: 10.1587/transfun.E100.A.1640. Online publication date: 2017
  • (2014) Design Exploration Methodology for Microprocessor and HW Accelerators. Scalable and Near-Optimal Design Space Exploration for Embedded Systems (231-260). DOI: 10.1007/978-3-319-04942-7_9. Online publication date: 21-Feb-2014
  • (2014) Introduction and Motivation. Scalable and Near-Optimal Design Space Exploration for Embedded Systems (1-11). DOI: 10.1007/978-3-319-04942-7_1. Online publication date: 21-Feb-2014
  • (2013) Estado comparativo de las masas de Pinus uncinata Ram. potencialmente protectoras frente a aludes de una zona de Andorra y Cataluña. Pirineos 168 (39-57). DOI: 10.3989/Pirineos.2013.168003. Online publication date: 30-May-2013
  • (2013) Mapping on multi/many-core systems. Proceedings of the 50th Annual Design Automation Conference (1-10). DOI: 10.1145/2463209.2488734. Online publication date: 29-May-2013
  • (2013) Near-Optimal Microprocessor and Accelerators Codesign with Latency and Throughput Constraints. ACM Transactions on Architecture and Code Optimization 10:2 (1-25). DOI: 10.1145/2459316.2459317. Online publication date: 1-May-2013
  • (2013) MultiMaKe. ACM Transactions on Embedded Computing Systems 12:1s (1-25). DOI: 10.1145/2435227.2435255. Online publication date: 29-Mar-2013
  • (2013) Accelerating throughput-aware runtime mapping for heterogeneous MPSoCs. ACM Transactions on Design Automation of Electronic Systems 18:1 (1-29). DOI: 10.1145/2390191.2390200. Online publication date: 16-Jan-2013
  • (2013) Mapping and Scheduling of Tasks and Communications on Many-Core SoC Under Local Memory Constraint. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 32:11 (1748-1761). DOI: 10.1109/TCAD.2013.2266405. Online publication date: 1-Nov-2013
  • (2013) Incorporating Energy and Throughput Awareness in Design Space Exploration and Run-Time Mapping for Heterogeneous MPSoCs. Proceedings of the 2013 Euromicro Conference on Digital System Design (513-521). DOI: 10.1109/DSD.2013.61. Online publication date: 4-Sep-2013
