ABSTRACT
This paper presents a unified processor core with two operation modes. In RISC mode, the core works as a compiler-friendly MIPS-like processor; in DSP mode, it is a 4-way VLIW with a distributed and ping-pong register organization optimized for stream processing. To minimize hardware, the DSP mode has no control constructs for program flow, while the data-manipulation RISC instructions are executed in the DSP datapath. Moreover, the operation mode can change from one instruction to the next within a single program stream by means of a hierarchical instruction encoding, which also reduces VLIW code size significantly. The processor has been implemented in UMC 0.18 µm CMOS technology, and its core measures 3.23 mm × 3.23 mm, including 32 KB of on-chip memory. It operates at 208 MHz with an average power consumption of 380.6 mW.
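For orientation, the implementation figures quoted in the abstract can be turned into a few derived quantities. The short Python sketch below is only back-of-the-envelope arithmetic on the stated numbers; the function and constant names are illustrative. The peak issue rate assumes that every one of the four VLIW slots issues one operation per cycle, which the abstract does not state.

```python
# Figures quoted in the abstract.
CORE_SIDE_MM = 3.23      # core is 3.23 mm x 3.23 mm
CLOCK_HZ = 208e6         # 208 MHz
AVG_POWER_W = 380.6e-3   # 380.6 mW average power
VLIW_WAYS = 4            # 4-way VLIW in DSP mode


def derived_figures():
    """Return quantities derived from the abstract's numbers."""
    core_area_mm2 = CORE_SIDE_MM * CORE_SIDE_MM
    # Average energy spent per clock cycle, in nanojoules.
    energy_per_cycle_nj = AVG_POWER_W / CLOCK_HZ * 1e9
    # ASSUMPTION: all four slots issue one operation every cycle.
    peak_ops_per_s = VLIW_WAYS * CLOCK_HZ
    return {
        "core_area_mm2": core_area_mm2,
        "energy_per_cycle_nj": energy_per_cycle_nj,
        "peak_ops_per_s": peak_ops_per_s,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions the core occupies roughly 10.4 mm² and spends about 1.83 nJ per clock cycle on average; the peak rate is an upper bound, not a measured throughput.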
Index Terms
- A unified processor architecture for RISC & VLIW DSP