DOI: 10.1145/1772690.1772733

Data summaries for on-demand queries over linked data

Published: 26 April 2010


Typical approaches for querying structured Web data collect (crawl) and pre-process (index) large amounts of data in a central data repository before allowing for query answering. This time-consuming pre-processing phase, however, leverages the benefits of Linked Data -- where structured data is accessible live and up to date at distributed Web resources that may change constantly -- only to a limited degree, as query results can never be current. An ideal query answering system for Linked Data should return current answers in a reasonable amount of time, even on corpora as large as the Web. Query processors that evaluate queries directly on the live sources require knowledge of the contents of those data sources. In this paper, we develop and evaluate an approximate index structure summarising the graph-structured content of sources adhering to Linked Data principles, provide an algorithm for answering conjunctive queries over Linked Data on the Web that exploits the source summary, and evaluate the system using synthetically generated queries. The experimental results show that our lightweight index structure enables complete and up-to-date query results over Linked Data, while keeping the overhead for querying low and providing a satisfying source ranking at no additional cost.
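The idea of selecting and ranking live sources from an approximate summary can be illustrated with a small sketch. The abstract does not specify the index structure, so the code below assumes a plain hash-bucket summary as a stand-in; `SourceSummary`, `rank_sources`, `NUM_BUCKETS`, and the example URIs are all hypothetical names chosen for illustration, not the paper's implementation. Each triple is hashed into a coarse (subject, predicate, object) cell that records how many triples each source contributed, so a lookup may return false positives but never misses a source that holds a matching triple.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 64  # assumed resolution; fewer buckets means more false positives


def bucket(term):
    """Map an RDF term to a bucket using a stable hash (crc32)."""
    return zlib.crc32(term.encode("utf-8")) % NUM_BUCKETS


class SourceSummary:
    """Hypothetical approximate index: hashed (s, p, o) cell -> triple count per source."""

    def __init__(self):
        self.cells = defaultdict(lambda: defaultdict(int))

    def add(self, source, triple):
        """Record one triple published by `source`."""
        key = tuple(bucket(term) for term in triple)
        self.cells[key][source] += 1

    def candidates(self, pattern):
        """Return {source: estimated matches} for a triple pattern.

        `None` in the pattern stands for a variable and matches any bucket.
        """
        wanted = [None if term is None else bucket(term) for term in pattern]
        scores = defaultdict(int)
        for key, per_source in self.cells.items():
            if all(w is None or w == k for w, k in zip(wanted, key)):
                for source, count in per_source.items():
                    scores[source] += count
        return dict(scores)


def rank_sources(summary, patterns):
    """Rank sources for a conjunctive query by summed estimated matches."""
    total = defaultdict(int)
    for pattern in patterns:
        for source, count in summary.candidates(pattern).items():
            total[source] += count
    return sorted(total, key=lambda s: (-total[s], s))


# Usage: summarise two sources, then ask which ones to fetch for a query.
summary = SourceSummary()
summary.add("http://example.org/alice", ("ex:alice", "foaf:knows", "ex:bob"))
summary.add("http://example.org/bob", ("ex:bob", "foaf:name", '"Bob"'))

query = [("ex:alice", "foaf:knows", None), (None, "foaf:name", None)]
print(rank_sources(summary, query))
```

A query processor would then dereference the ranked sources and evaluate the query on the freshly fetched data, which is where the up-to-date answers come from. In this sketch the ranking falls out of the same counts used for source selection, so it needs no separate pass over the data.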






Information & Contributors


Published In

WWW '10: Proceedings of the 19th international conference on World wide web
April 2010
1407 pages


Association for Computing Machinery

New York, NY, United States




Author Tags

  1. index structures
  2. linked data
  3. rdf querying


  • Research-article


WWW '10
WWW '10: The 19th International World Wide Web Conference
April 26 - 30, 2010
Raleigh, North Carolina, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%



Bibliometrics & Citations


Article Metrics

  • Downloads (Last 12 months): 14
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 14 Feb 2025



Cited By

  • (2024) A systematic overview of data federation systems. Semantic Web 15:1 (107-165). DOI: 10.3233/SW-223201. Online publication date: 12-Jan-2024
  • (2024) Sharing Linked Building Data in a Peer-to-Peer Network: ifcOWL Meets InterPlanetary File System. Journal of Computing in Civil Engineering 38:1. DOI: 10.1061/JCCEE5.CPENG-5381. Online publication date: Jan-2024
  • (2024) Answering Property Path Queries over Federated RDF Systems. Web and Big Data (16-31). DOI: 10.1007/978-981-97-2387-4_2. Online publication date: 28-Apr-2024
  • (2023) Optimizing SPARQL queries over decentralized knowledge graphs. Semantic Web 14:6 (1121-1165). DOI: 10.3233/SW-233438. Online publication date: 13-Dec-2023
  • (2023) Optimizing Keyword Search Over Federated RDF Systems. IEEE Transactions on Big Data 9:3 (918-935). DOI: 10.1109/TBDATA.2022.3224749. Online publication date: 1-Jun-2023
  • (2023) A Cost-Driven Top-K Queries Optimization Approach on Federated RDF Systems. IEEE Transactions on Big Data 9:2 (665-676). DOI: 10.1109/TBDATA.2022.3156090. Online publication date: 1-Apr-2023
  • (2023) IndeGx. Web Semantics: Science, Services and Agents on the World Wide Web 76:C. DOI: 10.1016/j.websem.2023.100775. Online publication date: 1-Apr-2023
  • (2023) Link Traversal Query Processing Over Decentralized Environments with Structural Assumptions. The Semantic Web – ISWC 2023 (3-22). DOI: 10.1007/978-3-031-47240-4_1. Online publication date: 27-Oct-2023
  • (2022) Tab2KG. Semantic Web 13:3 (571-597). DOI: 10.3233/SW-222993. Online publication date: 1-Jan-2022
  • (2022) A Computational Framework for Organizing and Querying Cultural Heritage Archives. Journal on Computing and Cultural Heritage 15:3 (1-25). DOI: 10.1145/3485843. Online publication date: 16-Sep-2022
