skip to main content
10.1145/1772690.1772696acmotherconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article

Matrix "Bit" loaded: a scalable lightweight join query processor for RDF data

Published: 26 April 2010 Publication History

Abstract

The Semantic Web community, until now, has used traditional database systems for the storage and querying of RDF data. The SPARQL query language also closely follows SQL syntax. As a natural consequence, most of the SPARQL query processing techniques are based on database query processing and optimization techniques. For SPARQL join query optimization, previous works like RDF-3X and Hexastore have proposed to use 6-way indexes on the RDF data. Although these indexes speed up merge-joins by orders of magnitude, for complex join queries generating large intermediate join results, the scalability of the query processor still remains a challenge.
In this paper, we introduce (i) BitMat - a compressed bit-matrix structure for storing huge RDF graphs, and (ii) a novel, light-weight SPARQL join query processing method that employs an initial pruning technique, followed by a variable-binding-matching algorithm on BitMats to produce the final results. Our query processing method does not build intermediate join tables and works directly on the compressed data. We have demonstrated our method against RDF graphs of upto 1.33 billion triples - the largest among results published until now (single-node, non-parallel systems), and have compared our method with the state-of-the-art RDF stores - RDF-3X and MonetDB. Our results show that the competing methods are most effective with highly selective queries. On the other hand, BitMat can deliver 2-3 orders of magnitude better performance on complex, low-selectivity queries over massive data.

References

[1]
Jena TDB. http://jena.sourceforge.net/TDB/.
[2]
D. J. Abadi, S. R. Madden, and M. C. Ferreira. Integrating Compression and Execution in Column Oriented Database Systems. In SIGMOD, 2006.
[3]
D. J. Abadi, A. Marcus, S. R. Madden, and K. Hollenbach. Scalable Semantic Web Data Management using Vertical Partitioning. In PVLDB, 2007.
[4]
D. J. Abadi, D. S. Myers, D. J. DeWitt, and S. R. Madden. Materialization Strategies in a Column Oriented DBMS. In ICDE, 2007.
[5]
M. Atre and J. A. Hendler. BitMat: A Main-memory Bit-Matrix of RDF Triples. In SSWS workshop at ISWC, 2009.
[6]
P. A. Bernstein and D.-M. W. Chiu. Using semi-joins to solve relational queries. Journal of the ACM, 28(1), 1981.
[7]
P. A. Bernstein and N. Goodman. Power of natural semijoins. SIAM Journal of Computing, 10(4), 1981.
[8]
M. Cai and M. Frank. RDFPeers: A Scalable Distributed RDF Repository based on a Structured Peer-to-Peer Network. In WWW, 2004.
[9]
R. Cyganiak. A Relational Algebra for SPARQL. Technical Report, HP Laboratories Bristol, (HPL-2005-170), 2005.
[10]
O. Erling. Virtuoso, 2006. http://virtuoso.openlinksw.com/wiki/main/Main/VOSBitmapIndexing.
[11]
M. Janik and K. Kochut. BRAHMS: A WorkBench RDF Store and High Performance Memory System for Semantic Association Discovery. In ISWC, 2005.
[12]
T. Johnson. Performance Measurements of Compressed Bitmap Indices. In PVLDB, 1999.
[13]
LUBM. http://swat.cse.lehigh.edu/projects/lubm/.
[14]
A. Matono, S. M. Pahlevi, and I. Kojima. RDFCube: A P2P-based Three-dimensional Index for Structural Joins on Distributed Triple Stores. In DBISP2P at VLDB, 2006.
[15]
T. Neumann and G. Weikum. RDF3X: a RISC style Engine for RDF. In PVLDB, 2008.
[16]
T. Neumann and G. Weikum. Scalable join processing on very large RDF graphs. In SIGMOD, 2009.
[17]
P. O'Neil and G. Graefe. Multi-Table Joins Through Bitmapped Join Indices. In SIGMOD Record, volume 24, September 1995.
[18]
OpenRDF LUBM Queries. http://repo.adunasoftware.org/viewvc/org.openrdf/?pathrev=6875.
[19]
UniProt RDF Queries. http://dev.isbsib.ch/projects/expasy4j-webng/query.html#examples.
[20]
S. Sarawagi. Indexing OLAP data. Data Engineering Bulletin, 20(1), 1997.
[21]
L. Sidirourgos, R. Goncalves, M. Kersten, et al. Column-store Support for RDF Data Management: not all swans are white. In PVLDB, 2008.
[22]
O. Udrea, A. Pugliese, and V. Subrahmanian. GRIN: A Graph Based RDF Index. In AAAI, 2007.
[23]
UniProt RDF. http://dev.isb-sib.ch/projects/uniprot-rdf/.
[24]
C. Weiss, P. Karras, and A. Bernstein. Hexastore: Sextuple Indexing for Semantic Web Data Management. In PVLDB, 2008.

Cited By

View all
  • (2024)The Ring: Worst-case Optimal Joins in Graph Databases using (Almost) No Extra SpaceACM Transactions on Database Systems10.1145/364482449:2(1-45)Online publication date: 23-Mar-2024
  • (2024)Locality-Preserving Graph Traversal With Split Live MigrationIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.343682835:10(1810-1825)Online publication date: Oct-2024
  • (2024)Compressed and queryable self-indexes for RDF archivesKnowledge and Information Systems10.1007/s10115-023-01967-766:1(381-417)Online publication date: 1-Jan-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
WWW '10: Proceedings of the 19th international conference on World wide web
April 2010
1407 pages
ISBN:9781605587998
DOI:10.1145/1772690

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 April 2010

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. data compression
  2. query algorithm
  3. rdf store

Qualifiers

  • Research-article

Conference

WWW '10
WWW '10: The 19th International World Wide Web Conference
April 26 - 30, 2010
North Carolina, Raleigh, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)10
  • Downloads (Last 6 weeks)0
Reflects downloads up to 20 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)The Ring: Worst-case Optimal Joins in Graph Databases using (Almost) No Extra SpaceACM Transactions on Database Systems10.1145/364482449:2(1-45)Online publication date: 23-Mar-2024
  • (2024)Locality-Preserving Graph Traversal With Split Live MigrationIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.343682835:10(1810-1825)Online publication date: Oct-2024
  • (2024)Compressed and queryable self-indexes for RDF archivesKnowledge and Information Systems10.1007/s10115-023-01967-766:1(381-417)Online publication date: 1-Jan-2024
  • (2024)PathBit: A Bit Index Based on Path for Large-Scale Knowledge GraphQuality, Reliability, Security and Robustness in Heterogeneous Systems10.1007/978-3-031-65126-7_18(181-195)Online publication date: 20-Aug-2024
  • (2024)Protocol Conformance of Collaborative SPARQL Using Multiparty Session TypesTheoretical Aspects of Software Engineering10.1007/978-3-031-64626-3_1(1-18)Online publication date: 14-Jul-2024
  • (2022)A SPARQL benchmark for distributed databases in IoT environmentsProceedings of the International Workshop on Big Data in Emergent Distributed Environments10.1145/3530050.3532929(1-6)Online publication date: 12-Jun-2022
  • (2022)Wukong+G: Fast and Concurrent RDF Query Processing Using RDMA-Assisted GPU Graph ExplorationIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2021.312156833:7(1619-1635)Online publication date: 1-Jul-2022
  • (2022)Space/time-efficient RDF stores based on circular suffix sortingThe Journal of Supercomputing10.1007/s11227-022-04890-w79:5(5643-5683)Online publication date: 25-Oct-2022
  • (2022)Optimization of Bit Matrix Index for Temporal RDFKnowledge Graph and Semantic Computing: Knowledge Graph Empowers the Digital Economy10.1007/978-981-19-7596-7_8(95-107)Online publication date: 19-Nov-2022
  • (2022)Knowledge Graph Compression for Big Semantic DataEncyclopedia of Big Data Technologies10.1007/978-3-319-63962-8_62-2(1-13)Online publication date: 17-Mar-2022
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

EPUB

View this article in ePub.

ePub

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media