ABSTRACT
Despite the popularity and success of neural networks in research, the number of resulting commercial or industrial applications has been limited. A primary cause of this lack of adoption is that neural networks are usually implemented as software running on general-purpose processors. Software implementations of a neural network typically scale as O(n²) in the number of nodes; as a result, they cannot provide the performance and scalability required in non-academic settings.
In this paper, we investigate how FPGAs can be used to exploit the inherent parallelism in neural networks to provide a better implementation in terms of scalability and performance. We focus on the Restricted Boltzmann Machine, a popular type of neural network, because its architecture is particularly well suited to hardware design. The proposed multi-purpose hardware framework reduces the O(n²) computation to an O(n) implementation while requiring only O(n) resources. The framework is tested on a Xilinx Virtex II-Pro XC2VP70 FPGA running at 100 MHz. The available resources support a Restricted Boltzmann Machine of 128x128 nodes, which results in a computational speed of 1.02 billion connection-updates-per-second and a speed-up of 35 fold over an optimized C program running on a 2.8 GHz Intel processor.
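To make the O(n²) cost concrete, the sketch below shows one half-step of Gibbs sampling in a Restricted Boltzmann Machine: every hidden unit sums contributions from every visible unit, so a sequential software implementation touches all n×n connections per update. This is an illustrative NumPy sketch, not the paper's implementation; the 128x128 size matches the network in the abstract, but the weight values and random seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 128, 128                   # 128x128 RBM, as in the abstract
W = rng.normal(0.0, 0.01, (n_visible, n_hidden)) # connection weights (illustrative values)
v = rng.integers(0, 2, n_visible).astype(float)  # binary visible-layer state

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One half-step of Gibbs sampling: the matrix-vector product below visits all
# n_visible * n_hidden connections -- the O(n^2) work that a sequential CPU
# performs per update, and that the FPGA framework parallelizes to O(n) time.
p_h = sigmoid(v @ W)                             # hidden activation probabilities
h = (rng.random(n_hidden) < p_h).astype(float)   # stochastic binary hidden state
```

Each such half-step updates n_visible * n_hidden = 16,384 connections; the paper's reported 1.02 billion connection-updates-per-second refers to this per-connection unit of work.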