ABSTRACT
Efficient lookups in huge, possibly multi-dimensional datasets are crucial for the performance of numerous use cases that generate multiple search operations at the same time, like point queries in ray tracing or spatial joins in collision detection of interactive 3D applications. These applications greatly benefit from index structures that quickly filter relevant candidates for further processing. Since different lookup operations are independent from each other, they might be processed in parallel on modern hardware like multi-core CPUs or GPUs. But implementing efficient algorithms for all kinds of indexes on various hardware platforms is a challenging task. In this paper, we present a new approach that extends the existing GiST index framework with an abstraction layer for the hardware where index operations are executed. Furthermore, we provide first performance evaluations for the scan execution on CPUs and an Nvidia Tesla GPU.
- A. Ailamaki. Managing scientific data: lessons, challenges, and opportunities. In Proceedings of the 2011 international conference on Management of data, SIGMOD '11, pages 1045--1046, New York, NY, USA, 2011. ACM. Google ScholarDigital Library
- W. G. Aref and I. F. Ilyas. Sp-gist: An extensible database index for supporting space partitioning trees. Journal of Intelligent Information Systems, 17: 215--240, 2001. 10.1023/A:1012809914301. Google ScholarDigital Library
- D. Comer. Ubiquitous b-tree. ACM Comput. Surv., 11(2): 121--137, June 1979. Google ScholarDigital Library
- W. T. Correa, J. T. Klosowski, and C. T. Silva. Visibility-based prefetching for interactive out-of-core rendering. In Proceedings of the 2003 IEEE Symposium on Parallel and Large-Data Visualization and Graphics, PVG '03, pages 2--, Washington, DC, USA, 2003. IEEE Computer Society. Google ScholarDigital Library
- C. Gregg and K. Hazelwood. Where is the data? why you cannot debate cpu vs. gpu performance without the answer. In Performance Analysis of Systems and Software (ISPASS), 2011 IEEE International Symposium on, pages 134--144, april 2011. Google ScholarDigital Library
- A. Guttman. R-trees: a dynamic index structure for spatial searching. In Proceedings of the 1984 ACM SIGMOD international conference on Management of data, SIGMOD '84, pages 47--57, New York, NY, USA, 1984. ACM. Google ScholarDigital Library
- M. Harris, S. Sengupta, and J. D. Owens. Parallel Prefix Sum (Scan) with CUDA. In H. Nguyen, editor, GPU Gems 3. Addison Wesley, Aug. 2007.Google Scholar
- B. He, K. Yang, R. Fang, M. Lu, N. Govindaraju, Q. Luo, and P. Sander. Relational joins on graphics processors. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, SIGMOD '08, pages 511--524, New York, NY, USA, 2008. ACM. Google ScholarDigital Library
- J. M. Hellerstein, J. F. Naughton, and A. Pfeffer. Generalized search trees for database systems. In Proceedings of the 21th International Conference on Very Large Data Bases, VLDB '95, pages 562--573, San Francisco, CA, USA, 1995. Morgan Kaufmann Publishers Inc. Google ScholarDigital Library
- C. Kim, J. Chhugani, N. Satish, E. Sedlar, A. D. Nguyen, T. Kaldewey, V. W. Lee, S. A. Brandt, and P. Dubey. Fast: fast architecture sensitive tree search on modern cpus and gpus. In Proceedings of the 2010 international conference on Management of data, SIGMOD '10, pages 339--350, New York, NY, USA, 2010. ACM. Google ScholarDigital Library
- W.-S. Kim, W.-K. Loh, and W.-S. Han. Cc-gist: Cache conscious-generalized search tree for supporting various fast intelligent applications. In S. Mehrotra, D. Zeng, H. Chen, B. Thuraisingham, and F.-Y. Wang, editors, Intelligence and Security Informatics, volume 3975 of Lecture Notes in Computer Science, pages 657--658. Springer Berlin / Heidelberg, 2006. Google ScholarDigital Library
- V. W. Lee, C. Kim, J. Chhugani, M. Deisher, D. Kim, A. D. Nguyen, N. Satish, M. Smelyanskiy, S. Chennupaty, P. Hammarlund, R. Singhal, and P. Dubey. Debunking the 100x gpu vs. cpu myth: an evaluation of throughput computing on cpu and gpu. In Proceedings of the 37th annual international symposium on Computer architecture, ISCA '10, pages 451--460, New York, NY, USA, 2010. ACM. Google ScholarDigital Library
- M. Moore and J. Wilhelms. Collision detection and response for computer animation. In Proceedings of the 15th annual conference on Computer graphics and interactive techniques, SIGGRAPH '88, pages 289--298, New York, NY, USA, 1988. ACM. Google ScholarDigital Library
- A. Munshi. The OpenCL Specification, 2011.Google Scholar
- C. Nvidia. Nvidia cuda. Changes, 7(6(36)): 187, 2011.Google Scholar
- P. B. Volk, D. Habich, and W. Lehner. Gpu-based speculative query processing for database operations. In Proceedings of the 1st International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures, 2010.Google Scholar
- T. Whitted. An improved illumination model for shaded display. In ACM SIGGRAPH 2005 Courses, SIGGRAPH '05, New York, NY, USA, 2005. ACM. Google ScholarDigital Library
- D. Wu, F. Zhang, N. Ao, G. Wang, X. Liu, and J. Liu. Efficient lists intersection by cpu-gpu cooperative computing. In Parallel Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on, pages 1--8, april 2010.Google Scholar
- X. Xiao, T. Shi, P. Vaidya, and J. J. Lee. R-tree: A hardware implementation. In CDES'08, pages 3--9, 2008.Google Scholar
Recommendations
Practical SIMD Vectorization Techniques for Intel® Xeon Phi Coprocessors
IPDPSW '13: Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing Workshops and PhD ForumIntel® Xeon Phi™ coprocessor is based on the Intel® Many Integrated Core (Intel® MIC) architecture, which is an innovative new processor architecture that combines abundant thread parallelism with long SIMD vector units. Efficiently exploiting SIMD ...
Effective SIMD vectorization for intel Xeon Phi coprocessors
Special issue on Programming Models, Languages, and Compilers for Manycore and Heterogeneous ArchitecturesEfficiently exploiting SIMD vector units is one of the most important aspects in achieving high performance of the application code running on Intel Xeon Phi coprocessors. In this paper, we present several effective SIMD vectorization techniques such as ...
Gillespie's Stochastic Simulation Algorithm on MIC coprocessors
To investigate the behavior of biochemical systems, many runs of Gillespie's Stochastic Simulation Algorithm (SSA) are generally needed, causing excessive computational costs on Central Processing Units (CPUs). Since all SSA runs are independent, the ...
Comments