- 1.J. Hennessy and D. A. Patterson, "Computer Architecture: A Quantitative Approach," Morgan Kaufmann Publishers, San Mateo, CA, Second Edition, 1996. Google ScholarDigital Library
- 2.A. Abnous and N. Bagherzadeh, "Pipelining and Bypassing in a VLIW Processor," IEEE Trans. on Parallel and Distributed Systems, Vol. 5, No. 6, June 1994, pp. 658-663. Google ScholarDigital Library
- 3.H. Corporaal, "Microprocessor Architectures: from VLIW to TTA," John Wiley and Sons, England, 1997. Google ScholarDigital Library
- 4.R. Yung and N. C. Wilhelm, "Caching Processor General Registers," ICCD '95. Proceedings of IEEE International Conference on Computer Design, 1995, pp. 307-312. Google ScholarDigital Library
- 5.M. M. Martin, A. Roth, C. N. Fischer, "Exploiting Dead Value Information," MICRO-30, Proceedings of 30th Annual IEEE/ACM International Symposium on Microarchitecture, 1997, pp. 125-135. Google ScholarDigital Library
- 6.A. Chandrakasan and R. Brodersen, "Minimizing Power Consumption in Digital CMOS Circuits," Proc. of IEEE, 83(4), pp. 498-523, 1995.Google ScholarCross Ref
- 7.K. Roy, S. C. Prasad "Low-Power CMOS VLSI Circuit Design," John Wiley and Sons, Inc., Wiley-Interscience, 2000.Google Scholar
- 8.V. Zyuban and P. Kogge, "The Energy Complexity of Register Files," ISLPED98, Proceedings of International Symposium on Low-Power Electronic Design, Monterey, CA-USA, 1998, pp. 305-310. Google ScholarDigital Library
- 9.A. V. Aho, R. Sethi, J. D. Ullman, "Compilers: Principles, Techniques, and Tools," Addison-Wesley, 1986. Google ScholarDigital Library
- 10.P. Faraboschi, G. Brown, J. A. Fisher, G. Desoli, and F. Homehood, "Lx: A Technology Platform for Customizable VLIW Embedded Processing," ISCA00, Proceedings of International Symposium on Computer Architecture, Vancouver, BC, Canada, 2000, pp. 203-213. Google ScholarDigital Library
- 11.J. Fisher, "Trace Scheduling: A Technique for Global Microcode Compaction" IEEE Trans. on Computers, C-30(7):478- 490. 1981.Google ScholarDigital Library
- 12.W. H. Chen, C. H. Smith and S. C. Fralick "A Fast Computational Algorithm For The Discrete Cosine Transform" IEEE Trans. Commun. vol. COM-25, pp. 1004-1009, Sept 1977.Google Scholar
- 13.W. H. Press, S. A. Teukolsky,W. T. Vetterling, B. P. Flannery "Numerical Recipesin C :TheArt of Scientific Computing " Cambridge Univ Press. Jan 1993. Google ScholarDigital Library
Index Terms
- Exploiting data forwarding to reduce the power budget of VLIW embedded processors
Recommendations
Low-power data forwarding for VLIW embedded architectures
Proposes a low-power approach to the design of embedded very long instruction word (VLIW) processor architectures based on the forwarding (or bypassing) hardware, which provides operands from interstage pipeline registers directly to the inputs of the ...
Enabling compiler flow for embedded VLIW DSP processors with distributed register files
LCTES '07: Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systemsHigh-performance and low-power VLIW DSP processors are increasingly deployed on embedded devices to process video and multimedia applications. For reducing power and cost in designs of VLIW DSP processors, distributed register files and multi-bank ...
Hybrid multithreading for VLIW processors
CASES '09: Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systemsSeveral multithreading techniques have been proposed to reduce resource underutilization in Very Long Instruction Word (VLIW) processors. Simultaneous MultiThreading (SMT) is a popular technique that improves processor performance by issuing multiple ...
Comments