ABSTRACT
Heterogeneous multiprocessor SoCs are becoming a reality, largely due to the abundance of transistors, intellectual property cores, and powerful design tools. In this project, we explore the use of multiple cores to speed up the JPEG compression algorithm. We show two methods of parallelizing this algorithm: a master-slave model and a pipeline model. The systems were implemented using Tensilica's Xtensa LX processors connected by queues. We show that even for this relatively simple application, parallelization can be carried out across up to nine processors, with utilization between 50% and 80%. We obtained speedups of up to 4.6X on a seven-core system, at the cost of a 3.1X increase in area.
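The pipeline model mentioned above can be illustrated with a minimal software sketch, assuming one worker per stage and FIFO queues between neighbouring stages. This is not the authors' Xtensa implementation: Python threads stand in for processor cores, and the stage functions (`level_shift`, `quantize`, `pack`), the queue depth, and the helper names are hypothetical placeholders rather than real JPEG arithmetic.

```python
import queue
import threading

SENTINEL = None  # marks the end of the block stream


def stage(func, inbox, outbox):
    """Run one pipeline stage: take a block, transform it, pass it on."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            break
        outbox.put(func(item))


def run_pipeline(blocks, funcs):
    """Chain one worker per function, linked by bounded FIFO queues."""
    # Queue depth of 4 is an arbitrary assumption for this sketch.
    queues = [queue.Queue(maxsize=4) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for w in workers:
        w.start()

    def feed():
        # Feed from a separate thread so a full queue cannot stall the caller.
        for b in blocks:
            queues[0].put(b)
        queues[0].put(SENTINEL)

    feeder = threading.Thread(target=feed)
    feeder.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)

    feeder.join()
    for w in workers:
        w.join()
    return results


# Hypothetical stand-in stages; real JPEG stages would do far more work.
def level_shift(block):
    return [v - 128 for v in block]


def quantize(block):
    return [v // 16 for v in block]


def pack(block):
    return tuple(block)
```

Calling `run_pipeline(blocks, [level_shift, quantize, pack])` pushes each block through the three stages in order. Because every stage has a single worker and the queues are FIFO, blocks leave the pipeline in the order they entered. In a master-slave arrangement, by contrast, one coordinator would hand whole blocks to identical workers instead of splitting the work by stage.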
Index Terms
- Heterogeneous multiprocessor implementations for JPEG: a case study