DOI: 10.1145/2882903.2915221
research-article
Open access

Generating Preview Tables for Entity Graphs

Published: 26 June 2016

Editorial Notes

Computationally Replicable. The experimental results of this paper were replicated by a SIGMOD Review Committee and were found to support the central results reported in the paper.

Abstract

Users are tapping into massive, heterogeneous entity graphs for many applications. It is challenging to select entity graphs for a particular need, given the abundance of datasets from many sources and the often scarce information about them. We propose methods to produce preview tables for compact presentation of important entity types and relationships in entity graphs. The preview tables give users a quick, rough preview of the data. They can be shown in a limited display space for a user to browse and explore before she decides to spend time and resources to fetch and investigate the complete dataset. We formulate several optimization problems that look for previews with the highest scores according to intuitive goodness measures, under various constraints on preview size and distance between preview tables. The optimization problem under the distance constraint is NP-hard. We design a dynamic-programming algorithm and an Apriori-style algorithm for finding optimal previews. Results from experiments, comparison with related work, and user studies demonstrate the scoring measures' accuracy and the discovery algorithms' efficiency.
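
As a rough illustration of the size-constrained problem mentioned in the abstract, the Python sketch below picks k preview tables with at most n non-key attributes in total using dynamic programming. The abstract does not state the scoring formulas, so everything specific here is an assumption made for illustration: each candidate table is taken to have a key score and a list of non-key attribute scores, a table's score is taken to be its key score times the sum of its chosen attribute scores, and a preview's score is the sum of its tables' scores. The function name best_preview and the sample entity types and scores are hypothetical, and this should not be read as the paper's algorithm.

```python
from typing import Dict, List, Tuple

NEG_INF = float("-inf")


def best_preview(
    candidates: Dict[str, Tuple[float, List[float]]],
    k: int,
    n: int,
) -> Tuple[float, List[Tuple[str, int]]]:
    """Pick exactly k tables and at most n non-key attributes in total.

    candidates maps a key entity type to (key_score, non_key_scores).
    ASSUMED scoring (not taken from the paper's text):
      table score   = key_score * sum(chosen non-key scores)
      preview score = sum of table scores
    Returns (best score, [(key entity type, number of attributes kept)]).
    If no selection of k tables fits, the score is -inf and the list is empty.
    """
    # dp[j][m] = best (score, choice) using exactly j tables and m attributes
    dp = [[(NEG_INF, []) for _ in range(n + 1)] for _ in range(k + 1)]
    dp[0][0] = (0.0, [])

    for key, (key_score, attr_scores) in candidates.items():
        # For a fixed count t, the best attributes are the t highest-scoring ones.
        ranked = sorted(attr_scores, reverse=True)
        prefix = [0.0]
        for s in ranked:
            prefix.append(prefix[-1] + s)

        # Walk j downward so each candidate table is used at most once.
        for j in range(k, 0, -1):
            for m in range(1, n + 1):
                for t in range(1, min(m, len(ranked)) + 1):
                    prev_score, prev_choice = dp[j - 1][m - t]
                    if prev_score == NEG_INF:
                        continue
                    score = prev_score + key_score * prefix[t]
                    if score > dp[j][m][0]:
                        dp[j][m] = (score, prev_choice + [(key, t)])

    # Any attribute budget up to n is allowed; take the best one.
    return max((dp[k][m] for m in range(n + 1)), key=lambda cell: cell[0])


# Hypothetical candidates: entity type -> (key score, non-key attribute scores)
example = {
    "FILM": (0.5, [0.4, 0.3, 0.1]),
    "ACTOR": (0.3, [0.5, 0.2]),
    "AWARD": (0.1, [0.9]),
}
score, tables = best_preview(example, k=2, n=3)
print(score, tables)
```

With these made-up scores, the sketch keeps two attributes for FILM and one for ACTOR. The distance-constrained variant, which the abstract states is NP-hard, is not covered by this sketch.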

Supplementary Material

ReadMe (readme.pdf)
Rights information
Reproducibility (tabview_reproducibility.zip)
Data, Experiments

      Published In

      SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
      June 2016
      2300 pages
      ISBN:9781450335317
      DOI:10.1145/2882903
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication Notes

      Badge change: Article originally badged under Version 1.0 guidelines https://www.acm.org/publications/policies/artifact-review-badging

      Author Tags

      1. data exploration
      2. entity graph
      3. knowledge graph
      4. schema summarization

      Qualifiers

      • Research-article

      Conference

      SIGMOD/PODS'16: International Conference on Management of Data
      June 26 - July 1, 2016
      San Francisco, California, USA

      Acceptance Rates

      Overall Acceptance Rate 785 of 4,003 submissions, 20%

      Cited By

      • (2023) Dataset Discovery and Exploration: A Survey. ACM Computing Surveys 56:4 (1-37). DOI: 10.1145/3626521. Online publication date: 9-Nov-2023
      • (2021) Efficient Graph Summarization using Weighted LSH at Billion-Scale. Proceedings of the 2021 International Conference on Management of Data (2357-2365). DOI: 10.1145/3448016.3457331. Online publication date: 9-Jun-2021
      • (2019) MithraLabel. Proceedings of the 28th ACM International Conference on Information and Knowledge Management (2893-2896). DOI: 10.1145/3357384.3357853. Online publication date: 3-Nov-2019
      • (2019) Summarizing database schema based on graph partition. Multimedia Tools and Applications 78:8 (10077-10096). DOI: 10.1007/s11042-018-6543-y. Online publication date: 1-Apr-2019
      • (2018) Utility-driven graph summarization. Proceedings of the VLDB Endowment 12:4 (335-347). DOI: 10.14778/3297753.3297755. Online publication date: 1-Dec-2018
      • (2018) Automated Comparative Table Generation for Facilitating Human Intervention in Multi-Entity Resolution. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (585-594). DOI: 10.1145/3209978.3210021. Online publication date: 27-Jun-2018
      • (2018) TableView: A Visual Interface for Generating Preview Tables of Entity Graphs. 2018 IEEE 34th International Conference on Data Engineering (ICDE) (1617-1620). DOI: 10.1109/ICDE.2018.00190. Online publication date: Apr-2018
      • (2016) HIEDS. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (3705-3711). DOI: 10.5555/3061053.3061137. Online publication date: 9-Jul-2016
