SMJoin: A Multi-way Join Operator for SPARQL Queries

Authors: Mikhail Galkin, Kemele M. Endris, Maribel Acosta, Diego Collarana, Maria-Esther Vidal, Sören Auer

Semantics2017: Proceedings of the 13th International Conference on Semantic Systems

Pages 104 - 111

https://doi.org/10.1145/3132218.3132220

Published: 11 September 2017

Abstract

Join operators are particularly important in SPARQL query engines that collect RDF data using Web access interfaces. State-of-the-art SPARQL query engines rely on binary join operators tailored for merging results from SPARQL queries over Web access interfaces. However, in queries with a large number of triple patterns, binary joins constitute a significant burden on the query performance. Multi-way joins that handle more than two inputs are able to reduce the complexity of pre-processing stages and reduce the execution time. Whereas in the relational databases field multi-way joins have already received some attention, the applicability of multi-way joins in SPARQL query processing remains unexplored. We devise SMJoin, a multi-way non-blocking join operator tailored for independently merging results from more than two RDF data sources. SMJoin implements intra-operator adaptivity, i.e., it is able to adjust join execution schedulers to the conditions of Web access interfaces; thus, query answers are produced as soon as they are computed and can be continuously generated even if one of the sources becomes blocked. We empirically study the behavior of SMJoin in two benchmarks with queries of different selectivity; state-of-the-art SPARQL query engines are included in the study. Experimental results suggest that SMJoin outperforms existing approaches in very selective queries, and produces first answers as fast as compared adaptive query engines in non-selective queries.
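
To make the idea of a non-blocking multi-way join more concrete, the following is a minimal sketch of an n-ary symmetric hash join in Python. It is not the SMJoin algorithm from the paper; it only illustrates the general technique of keeping one hash table per input and emitting answers as soon as every input has a matching entry. The class name `MultiWayHashJoin`, the method `insert`, the join variable `drug`, and the sample bindings are all hypothetical, and the sketch assumes the inputs share exactly one join variable.

```python
from collections import defaultdict
from itertools import product


class MultiWayHashJoin:
    """Illustrative n-ary symmetric hash join on one shared variable.

    Hypothetical sketch: not the operator described in the paper.
    """

    def __init__(self, num_inputs, join_var):
        self.join_var = join_var
        # One hash table per input: join value -> bindings received so far.
        self.tables = [defaultdict(list) for _ in range(num_inputs)]

    def insert(self, source, binding):
        """Store one binding from input `source` and return the new answers."""
        key = binding[self.join_var]
        self.tables[source][key].append(binding)
        # Probe every other input. `.get` avoids creating empty entries.
        partners = [
            table.get(key, [])
            for i, table in enumerate(self.tables)
            if i != source
        ]
        results = []
        # If any partner list is empty, product() yields nothing, so no
        # answer is produced until all inputs have contributed a match.
        for combo in product(*partners):
            merged = dict(binding)
            for other in combo:
                merged.update(other)
            results.append(merged)
        return results


# Bindings may arrive from the sources in any interleaving.
join = MultiWayHashJoin(num_inputs=3, join_var="drug")
arrivals = [
    (0, {"drug": "d1", "name": "Aspirin"}),
    (2, {"drug": "d1", "target": "COX-1"}),
    (1, {"drug": "d1", "disease": "Fever"}),
]
for source, binding in arrivals:
    for answer in join.insert(source, binding):
        print(answer)
```

Because each arriving binding is probed against the tables of the other inputs immediately, a slow or stalled source does not prevent the operator from accepting data from the remaining sources; answers appear as soon as the last missing match arrives.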

Index Terms

Information systems
  Data management systems
    Database management system engines
      Database query processing
        Join algorithms

Published In

Semantics2017: Proceedings of the 13th International Conference on Semantic Systems

September 2017

202 pages

ISBN:9781450352963

DOI:10.1145/3132218

Editors:
Rinke Hoekstra (Elsevier B.V., Amsterdam, The Netherlands)
Catherine Faron-Zucker (University of Nice Sophia Antipolis, France)
Tassilo Pellegrini (University of Applied Sciences St. Poelten, Austria)
Victor de Boer (Vrije Universiteit Amsterdam, The Netherlands)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

St. Pölten University of Applied Sciences, Austria
Wolters Kluwer, Germany
Vrije Universiteit Amsterdam
Semantic Web Company
Universität Leipzig

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

Semantics2017

Semantics2017: Semantics 2017 - 13th International Conference on Semantic Systems

September 11 - 14, 2017

Amsterdam, Netherlands

Acceptance Rates

Overall Acceptance Rate 40 of 182 submissions, 22%

Cited By

Ceballos O, Ramírez Restrepo C, Pabón M, Castillo A, Corcho O (2021). SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink. Applied Sciences 11:15 (7033). Online publication date: 30-Jul-2021.
https://doi.org/10.3390/app11157033
Makpaisit P, Chantrapornchai C (2021). VEDAS: an efficient GPU alternative for store and query of large RDF data sets. Journal of Big Data 8:1. Online publication date: 16-Sep-2021.
https://doi.org/10.1186/s40537-021-00513-y
Ali W, Saleem M, Yao B, Hogan A, Ngomo A (2021). A survey of RDF stores & SPARQL engines for querying knowledge graphs. The VLDB Journal 31:3 (1-26). Online publication date: 13-Nov-2021.
https://doi.org/10.1007/s00778-021-00711-3
Najafi M, Sadoghi M, Jacobsen H (2020). Scalable Multiway Stream Joins in Hardware. IEEE Transactions on Knowledge and Data Engineering 32:12 (2438-2452). Online publication date: 1-Dec-2020.
https://doi.org/10.1109/TKDE.2019.2916860
Hogan A, Riveros C, Rojas C, Soto A (2019). A Worst-Case Optimal Join Algorithm for SPARQL. The Semantic Web – ISWC 2019 (258-275). Online publication date: 17-Oct-2019.
https://doi.org/10.1007/978-3-030-30793-6_15
