Abstract
This article describes several multiplexer-based interconnection strategies designed to reduce the energy consumption of stripe-based coarse-grain reconfigurable fabrics. Application requirements for the architecture, as well as two dense subgraphs, are extracted from a suite of signal and image processing benchmarks. These statistics drive the composition of the multiplexer-based interconnect. The article compares interconnects that are fully connected between stripes, those with cardinalities from 8:1 down to 4:1, and extensions that provide a 5:1 cardinality, a limited 6:1 cardinality, and hybrids between 5:1 and 3:1 cardinalities. Additionally, dedicated vertical routes are considered, in which some computational units are replaced with dedicated pass-gates. Using a fabric interconnect model (FIM) written in XML, we demonstrate that fabric instances and mappers can be automatically generated using a Web-based design flow. Upon testing these instances, we found that an 8:1 cardinality interconnect with 33% of the computational units replaced with dedicated pass-gates provided the best energy versus mappability tradeoff, resulting in a 50% energy improvement over fully connected rows and a 20% energy improvement over an 8:1 cardinality interconnect without dedicated vertical routes.
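To make the notion of interconnect cardinality concrete, the following is a minimal sketch of which units in the previous stripe a k:1 multiplexer could select from. It is an illustration under stated assumptions, not the article's fabric interconnect model: the function names, the stripe width of 16, and the choice of a contiguous window shifted inward at the stripe edges are all hypothetical.

```python
def reachable_sources(column, width, cardinality):
    """Columns of the previous stripe that the unit at `column` can read.

    Assumption (not from the article): a k:1 multiplexer is wired to a
    contiguous window of k neighbouring columns, roughly centred on the
    unit and shifted inward at the stripe edges (no wrap-around).
    """
    if cardinality >= width:
        # Fully connected: every unit in the previous stripe is selectable.
        return list(range(width))
    left = cardinality // 2
    start = min(max(column - left, 0), width - cardinality)
    return list(range(start, start + cardinality))


def mux_inputs_per_stripe(width, cardinality):
    """Total multiplexer inputs feeding one stripe (one mux per unit)."""
    return width * min(cardinality, width)


if __name__ == "__main__":
    WIDTH = 16  # hypothetical stripe width, chosen only for illustration
    for k in (WIDTH, 8, 4):
        print(k, reachable_sources(8, WIDTH, k), mux_inputs_per_stripe(WIDTH, k))
```

In this toy model, lowering the cardinality shrinks the number of multiplexer inputs per stripe, but it also narrows the set of producers each unit can reach, which is the energy versus mappability tension the abstract refers to.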
Index Terms
- Interconnect customization for a hardware fabric