Retargetable automatic generation of compound instructions for CGRA based reconfigurable processor applications

Authors: Narasinga Rao Miniskar, Donghoon Yoo

CASES '14: Proceedings of the 2014 International Conference on Compilers, Architecture and Synthesis for Embedded Systems

Article No.: 4, Pages 1 - 9

https://doi.org/10.1145/2656106.2656125

Published: 12 October 2014

Abstract

Reconfigurable processors such as the Samsung Reconfigurable Processor (SRP) have become increasingly important because they offer just enough flexibility to accept software solutions while providing application-specific hardware configurability, yielding faster time-to-market, lower development cost, and higher performance at low energy consumption and area. The reconfigurable processor compilation framework supports a wide range of architectures, described through an architecture description template, for application domains such as image processing, multimedia, video, and graphics. These architectures support several domain-specific compound instructions (also called intrinsics), which are computationally more efficient than the set of general instructions in the processor. Application developers must use these intrinsics in their programs according to the architecture, which is tedious and error-prone and can result in very inefficient usage. Moreover, developers need to refer constantly to the architecture's intrinsics file during development. In this paper, we propose a novel retargetable methodology for the automatic generation of compound instructions at compile time, given an architecture and the application source code. Our approach covers ~75% of the total intrinsics in the architectures, with a success rate of > 90% in identifying the intrinsics in benchmarks such as AVC, OpenGL Full Engine, and OpenGL Vector.

Cited By

N. Paulino, J. Ferreira, and J. Cardoso (2017). Generation of Customized Accelerators for Loop Pipelining of Binary Instruction Traces. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 25(1):21-34. DOI: 10.1109/TVLSI.2016.2573640. Online publication date: 1-Jan-2017.
https://dl.acm.org/doi/10.1109/TVLSI.2016.2573640

Published In

CASES '14: Proceedings of the 2014 International Conference on Compilers, Architecture and Synthesis for Embedded Systems

October 2014

241 pages

ISBN:9781450330503

DOI:10.1145/2656106

General Chairs: Karam S. Chatha (Qualcomm Research), Rolf Ernst (TU Braunschweig, Germany)

Program Chairs: Anand Raghunathan (Purdue University), Ravishankar Iyer (Intel)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGBED: ACM Special Interest Group on Embedded Systems
SIGDA: ACM Special Interest Group on Design Automation
IEEE CAS
IEEE Council on Electronic Design Automation (CEDA)
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ESWEEK'14: Tenth Embedded System Week

October 12 - 17, 2014

New Delhi, India

Acceptance Rates

Overall Acceptance Rate 52 of 230 submissions, 23%

Bibliometrics

Total Citations: 1
Total Downloads: 129
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 0

Reflects downloads up to 18 Feb 2025
