DOI:10.1145/2987386.2987438
research-article

Parallel Document Inversion using GPU

Published: 11 October 2016

Abstract

Recent advances in Graphics Processing Unit (GPU) technology have led to a surge of interest in using the GPU for general-purpose applications. Because the GPU consists of multiple cores, it can serve as a massively parallel co-processor for computation. The GPU is also an affordable, attractive, and user-programmable commodity. The inverted index is a useful data structure for full-text search and document retrieval, but building it over a large number of documents takes considerable time. A multicore GPU can improve the performance of document inversion. Our approach is to implement a linear-time, hash-based, single program multiple data (SPMD) document inversion algorithm on the NVIDIA GPU/CUDA programming platform, using the computational power of the GPU to develop high-performance solutions for document indexing.
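
The abstract does not give the algorithm itself, so the following is only a minimal sketch of what hash-based, SPMD-style document inversion can look like. It is written in sequential Python for readability and is not the authors' CUDA implementation. The function names (`tokenize`, `invert_partition`, `merge`, `invert`), the tokenization rule, and the `num_workers` parameter are all assumptions made for illustration. Each "worker" runs the same routine on its own slice of the documents and fills a hash table, and the partial indexes are then merged.

```python
from collections import defaultdict


def tokenize(text):
    """Yield lowercase alphanumeric tokens (a simplifying assumption)."""
    word = []
    for ch in text.lower():
        if ch.isalnum():
            word.append(ch)
        elif word:
            yield "".join(word)
            word = []
    if word:
        yield "".join(word)


def invert_partition(docs):
    """Build a partial inverted index for one slice of (doc_id, text) pairs.

    Terms are stored in a hash table (dict), so the expected cost is linear
    in the number of tokens in the slice.
    """
    index = defaultdict(list)
    for doc_id, text in docs:
        for term in tokenize(text):
            postings = index[term]
            # A document's tokens are processed consecutively, so checking
            # the last posting is enough to record each document once.
            if not postings or postings[-1] != doc_id:
                postings.append(doc_id)
    return index


def merge(partials):
    """Combine partial indexes into one term -> sorted doc_id list mapping."""
    merged = defaultdict(list)
    for partial in partials:
        for term, postings in partial.items():
            merged[term].extend(postings)
    # Sorting is only for deterministic output; it adds a log factor that
    # is not part of the hash-based inversion step itself.
    return {term: sorted(set(p)) for term, p in merged.items()}


def invert(documents, num_workers=2):
    """Apply the same inversion routine to each slice (SPMD style)."""
    items = list(documents.items())
    partitions = [items[i::num_workers] for i in range(num_workers)]
    return merge(invert_partition(p) for p in partitions)
```

For example, calling `invert({1: "GPU index GPU", 2: "index the web", 3: "GPU web"})` maps `"gpu"` to `[1, 3]` and `"web"` to `[2, 3]`. On a GPU, each slice would be handled by a separate thread or thread block, and the merge step would need synchronized access to shared hash tables, which this sketch does not model.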

Published In

RACS '16: Proceedings of the International Conference on Research in Adaptive and Convergent Systems
October 2016
266 pages
ISBN:9781450344555
DOI:10.1145/2987386
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

  • Best Paper

Author Tags

  1. GPU
  2. Graphics Processing Unit
  3. document inversion
  4. high-performance computing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • NIH National Institute for General Medical Science

Conference

RACS '16
Acceptance Rates

RACS '16 Paper Acceptance Rate 40 of 161 submissions, 25%;
Overall Acceptance Rate 393 of 1,581 submissions, 25%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 102
  • Downloads (last 12 months): 5
  • Downloads (last 6 weeks): 0
Reflects downloads up to 20 Jan 2025
