ABSTRACT
We present a survey of approximation techniques and discuss concepts for building power- and energy-efficient computing components, ranging from approximate accelerators to arithmetic blocks (such as adders and multipliers). We provide a systematic understanding of how to generate and explore the design space of approximate components, which enables a wide range of power/energy, performance, area, and output-quality tradeoffs, and a high degree of design flexibility. To enable cross-layer approximate computing, it is crucial to bridge the gap between the logic layer (i.e., arithmetic blocks) and the architecture layer, and even the software layers. Toward this end, this paper introduces open-source libraries of low-power and high-performance approximate components. The elementary approximate arithmetic blocks (adder and multiplier) are used to develop multi-bit approximate arithmetic blocks and accelerators. We also discuss an analysis of data-driven resilience and error propagation. The approximate computing components are a first step toward a systematic approach to introducing approximate computing paradigms at all levels of abstraction.
- Invited - Cross-layer approximate computing: from logic to architectures