DOI: 10.1145/3132218.3132220
research-article

SMJoin: A Multi-way Join Operator for SPARQL Queries

Published: 11 September 2017

Abstract

Join operators are particularly important in SPARQL query engines that collect RDF data through Web access interfaces. State-of-the-art SPARQL query engines rely on binary join operators tailored for merging results from SPARQL queries over Web access interfaces. However, in queries with a large number of triple patterns, binary joins place a significant burden on query performance. Multi-way joins, which handle more than two inputs, can reduce both the complexity of pre-processing stages and the execution time. Whereas multi-way joins have already received some attention in the relational database field, their applicability to SPARQL query processing remains unexplored. We devise SMJoin, a multi-way non-blocking join operator tailored for independently merging results from more than two RDF data sources. SMJoin implements intra-operator adaptivity, i.e., it is able to adjust join execution schedulers to the conditions of Web access interfaces; thus, query answers are produced as soon as they are computed and can be continuously generated even if one of the sources becomes blocked. We empirically study the behavior of SMJoin on two benchmarks with queries of different selectivity; state-of-the-art SPARQL query engines are included in the study. Experimental results suggest that SMJoin outperforms existing approaches in very selective queries, and produces first answers as fast as the compared adaptive query engines in non-selective queries.
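
To make the idea of a non-blocking multi-way join concrete, the following is a minimal sketch and not the SMJoin implementation. It assumes that all sources share a single join variable and that solution mappings are plain Python dictionaries. The class name `MultiWayHashJoin`, the method `insert`, and the sample data are hypothetical and chosen only for illustration. The scheduling adaptivity described in the abstract is not modeled here.

```python
from collections import defaultdict
from itertools import product


class MultiWayHashJoin:
    """Toy non-blocking multi-way join on one shared variable (illustrative only)."""

    def __init__(self, num_sources, join_var):
        self.join_var = join_var
        # One hash table per source: join value -> mappings seen so far.
        self.tables = [defaultdict(list) for _ in range(num_sources)]

    def insert(self, source, mapping):
        """Store a mapping from `source` and return the answers it completes."""
        key = mapping[self.join_var]
        self.tables[source][key].append(mapping)

        # Probe all tables; the arriving source contributes only the new mapping,
        # so an answer is never produced twice.
        partners = []
        for i, table in enumerate(self.tables):
            if i == source:
                partners.append([mapping])
                continue
            matches = table.get(key)
            if not matches:
                return []  # some source has no partner yet: nothing to emit
            partners.append(matches)

        answers = []
        for combo in product(*partners):
            merged = {}
            for m in combo:
                # Assumption: mappings agree on every variable they share.
                merged.update(m)
            answers.append(merged)
        return answers


# Mappings arrive interleaved from three sources, in no particular order.
join = MultiWayHashJoin(num_sources=3, join_var="?s")
arrivals = [
    (0, {"?s": "a", "?name": "Alice"}),
    (2, {"?s": "a", "?age": 30}),
    (1, {"?s": "b", "?city": "Bonn"}),
    (1, {"?s": "a", "?city": "Kiel"}),
]
results = []
for source, mapping in arrivals:
    results.extend(join.insert(source, mapping))
```

In this sketch an answer is emitted as soon as the last missing partner arrives, regardless of which source delivers it, so a slow or blocked source does not prevent the other sources from filling their tables in the meantime.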

Cited By

  • (2021) SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink. Applied Sciences 11:15 (7033). DOI: 10.3390/app11157033. Online publication date: 30-Jul-2021
  • (2021) VEDAS: an efficient GPU alternative for store and query of large RDF data sets. Journal of Big Data 8:1. DOI: 10.1186/s40537-021-00513-y. Online publication date: 16-Sep-2021
  • (2021) A survey of RDF stores & SPARQL engines for querying knowledge graphs. The VLDB Journal 31:3 (1-26). DOI: 10.1007/s00778-021-00711-3. Online publication date: 13-Nov-2021

Published In

ACM Other conferences
Semantics2017: Proceedings of the 13th International Conference on Semantic Systems
September 2017
202 pages
ISBN:9781450352963
DOI:10.1145/3132218
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • St. Pölten University: St. Pölten University of Applied Sciences, Austria
  • Wolters Kluwer: Wolters Kluwer, Germany
  • Vrije Universiteit Amsterdam: Vrije Universiteit Amsterdam
  • Semantic Web Company: Semantic Web Company
  • Univ. Leipzig: Universität Leipzig

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Join algorithms
  2. Multi-way Join Operators
  3. SPARQL

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

Semantics2017

Acceptance Rates

Overall Acceptance Rate 40 of 182 submissions, 22%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)8
  • Downloads (Last 6 weeks)1
Reflects downloads up to 19 Feb 2025

Citations

  • (2020) Scalable Multiway Stream Joins in Hardware. IEEE Transactions on Knowledge and Data Engineering 32:12 (2438-2452). DOI: 10.1109/TKDE.2019.2916860. Online publication date: 1-Dec-2020
  • (2019) A Worst-Case Optimal Join Algorithm for SPARQL. The Semantic Web – ISWC 2019 (258-275). DOI: 10.1007/978-3-030-30793-6_15. Online publication date: 17-Oct-2019
