ABSTRACT
This paper describes an FPGA-based implementation of a real-time compression algorithm used in transactions between financial institutions such as exchanges and trading houses. FIX is a protocol that has gained widespread popularity for exchanging financial information, such as stock prices and purchases, over the Internet. A financial trader who can speed up the processing of these protocols can profit significantly by buying or selling stocks when share prices are highly variable. Our methodology seeks to recognize and exploit streaming characteristics of the software design in order to implement a pipelined parallel processing system in reconfigurable hardware. It introduces caches to keep the stream pipelines filled more often. The system, implemented on a Xilinx Virtex5 LX110T FPGA, shows a 17x speedup in throughput over a software implementation running on a dual-core Intel Pentium workstation. These techniques are being developed as part of a commercial compiler project to automatically translate software binaries to streaming RTL VHDL systems.
Index Terms
- Streaming implementation of a sequential decompression algorithm on an FPGA