ABSTRACT
In this paper, a novel Greybox design methodology is proposed to establish a design and co-optimization flow across the boundary of conventional software and hardware design. The dynamic timing of each software instruction is simulated and associated with the processor hardware design, which provides the basis for ultra-dynamic clock management. The proposed scheme implements instruction-based clock management and achieves a 21.71% frequency speedup. In addition, a novel program-driven hardware optimization flow is proposed, in which software operations are mapped to the hardware gate-level netlist and sorted by usage frequency. Experiments on an ARM-based pipeline design in a commercial 65 nm CMOS process show that an extra 10% frequency speedup is obtained with high optimization efficiency. Overall, the proposed Greybox design method achieves a 31.56% frequency speedup compared with the conventional design method.
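To make the idea of instruction-based clock management concrete, the following minimal sketch compares a fixed worst-case clock period with a period chosen per instruction. The delay table, the instruction trace, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper, and the model ignores pipeline overlap, where the slowest in-flight stage would set the period.

```python
# Hypothetical per-instruction critical-path delays in nanoseconds.
# Illustrative assumptions only, not measurements from the paper.
DYNAMIC_DELAY_NS = {"add": 0.80, "ldr": 0.90, "mul": 1.00, "b": 0.75}


def static_time(trace, delays):
    """Execution time when every cycle uses the worst-case clock period."""
    worst = max(delays.values())
    return worst * len(trace)


def dynamic_time(trace, delays):
    """Execution time when each cycle's period matches the issued instruction."""
    return sum(delays[op] for op in trace)


def speedup_percent(trace, delays):
    """Relative gain of per-instruction clocking over the worst-case clock."""
    return (static_time(trace, delays) / dynamic_time(trace, delays) - 1.0) * 100.0


if __name__ == "__main__":
    trace = ["add", "ldr", "add", "mul", "b", "add"]  # hypothetical program trace
    print(f"static:  {static_time(trace, DYNAMIC_DELAY_NS):.2f} ns")
    print(f"dynamic: {dynamic_time(trace, DYNAMIC_DELAY_NS):.2f} ns")
    print(f"speedup: {speedup_percent(trace, DYNAMIC_DELAY_NS):.1f} %")
```

In this toy model the gain depends entirely on how often the program issues instructions that exercise shorter paths than the worst case, which is why the instruction mix of the workload matters.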