Spatial computation

Authors:

Girish Venkataramani,

Tiberiu Chelcea,

Seth Copen Goldstein

ACM SIGARCH Computer Architecture News, Volume 32, Issue 5

Pages 14 - 26

https://doi.org/10.1145/1037947.1024396

Published: 07 October 2004

Abstract

This paper describes a computer architecture, Spatial Computation (SC), which is based on the translation of high-level language programs directly into hardware structures. SC program implementations are completely distributed, with no centralized control. SC circuits are optimized for wires at the expense of computation units.

In this paper we investigate a particular implementation of SC: ASH (Application-Specific Hardware). Under the assumption that computation is cheaper than communication, ASH replicates computation units to simplify interconnect, building a system which uses very simple, completely dedicated communication channels. As a consequence, communication on the datapath never requires arbitration; the only arbitration required is for accessing memory. ASH relies on very simple hardware primitives, using no associative structures, no multiported register files, no scheduling logic, no broadcast, and no clocks. As a consequence, ASH hardware is fast and extremely power efficient.

In this work we demonstrate three features of ASH: (1) that such architectures can be built by automatic compilation of C programs; (2) that distributed computation is in some respects fundamentally different from monolithic superscalar processors; and (3) that ASIC implementations of ASH use three orders of magnitude less energy compared to high-end superscalar processors, while being on average only 33% slower in performance (3.5x worst-case).
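The distributed execution model in the abstract can be illustrated with a small software simulation. The sketch below is written in Python purely for illustration; it is not the authors' compiler or an ASH circuit, and every name in it (`Channel`, `Node`, `run`, `build_and_run`) is hypothetical. It assumes each operation gets its own unit, each producer-consumer pair gets its own dedicated channel, and a unit fires as soon as all of its inputs hold a value.

```python
from collections import deque
import operator


class Channel:
    """Dedicated point-to-point link: one producer, one consumer (hypothetical)."""

    def __init__(self):
        self.tokens = deque()

    def put(self, value):
        self.tokens.append(value)

    def ready(self):
        return bool(self.tokens)

    def get(self):
        return self.tokens.popleft()


class Node:
    """One program operation mapped to its own computation unit (hypothetical)."""

    def __init__(self, fn, inputs, outputs):
        self.fn, self.inputs, self.outputs = fn, inputs, outputs

    def try_fire(self):
        # Local firing rule: fire only when every input channel holds a token.
        if not all(ch.ready() for ch in self.inputs):
            return False
        result = self.fn(*(ch.get() for ch in self.inputs))
        # One private channel per consumer; nothing is broadcast or arbitrated.
        for ch in self.outputs:
            ch.put(result)
        return True


def run(nodes):
    # This loop only stands in for hardware concurrency in the simulation;
    # it makes no scheduling decisions beyond polling each node's local rule.
    progress = True
    while progress:
        progress = any([node.try_fire() for node in nodes])


def build_and_run(a, b, c):
    """Evaluate (a + b) * (a - c) as a small dataflow graph."""
    # The value `a` has two consumers, so it travels on two separate channels.
    a1, a2, b1, c1 = Channel(), Channel(), Channel(), Channel()
    s, d, out = Channel(), Channel(), Channel()
    nodes = [
        Node(operator.mul, [s, d], [out]),
        Node(operator.add, [a1, b1], [s]),
        Node(operator.sub, [a2, c1], [d]),
    ]
    a1.put(a)
    a2.put(a)
    b1.put(b)
    c1.put(c)
    run(nodes)
    return out.get()
```

Calling `build_and_run(2, 3, 1)` evaluates `(2 + 3) * (2 - 1)`. The order in which nodes appear in the list does not matter, because each node decides locally whether it can fire.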


Published In

ACM SIGARCH Computer Architecture News Volume 32, Issue 5

ASPLOS 2004

December 2004

283 pages

ISSN:0163-5964

DOI:10.1145/1037947

ASPLOS XI: Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
October 2004
296 pages
ISBN:1581138040
DOI:10.1145/1024393
General Chair: Shubu Mukherjee (Intel Corporation)
Program Chair: Kathryn S. McKinley (University of Texas at Austin)

Copyright © 2004 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States
