skip to main content
10.1145/2744680.2744690acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
short-paper

Structurally Tractable Uncertain Data

Published:31 May 2015Publication History

ABSTRACT

Many data management applications must deal with data which is uncertain, incomplete, or noisy. However, on existing uncertain data representations, we cannot tractably perform the important query evaluation tasks of determining query possibility, certainty, or probability: these problems are hard on arbitrary uncertain input instances. We thus ask whether we could restrict the structure of uncertain data so as to guarantee the tractability of exact query evaluation. We present our tractability results for tree and tree-like uncertain data, and a vision for probabilistic rule reasoning. We also study uncertainty about order, proposing a suitable representation, and study uncertain data conditioned by additional observations.

References

  1. S. Abiteboul, D. Deutch, and V. Vianu. Deduction with contradictions in Datalog. In ICDT, 2014.Google ScholarGoogle Scholar
  2. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In VLDB, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. A. Amarilli, Y. Amsterdamer, and T. Milo. On the complexity of mining itemsets from the crowd using taxonomies. In Proc. ICDT, 2014.Google ScholarGoogle Scholar
  5. A. Amarilli, Y. Amsterdamer, and T. Milo. Uncertainty in crowd data sourcing under structural constraints. In Proc. UnCrowd, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  6. A. Amarilli, L. M. Ba, D. Deutch, and P. Senellart. Querying order-incomplete data. Preprint: http://a3nm.net/publications/amarilli2015querying.pdf.Google ScholarGoogle Scholar
  7. A. Amarilli, P. Bourhis, and P. Senellart. Probabilities and provenance via tree decompositions. Preprint: http://a3nm.net/publications/amarilli2015probabilities.pdf, 2014.Google ScholarGoogle Scholar
  8. A. Amarilli and P. Senellart. Unsaid string: Uncertainty and structure in the access to intensional data. Preprint: http://a3nm.net/publications/amarilli2014unsaid.pdf.Google ScholarGoogle Scholar
  9. Y. Amsterdamer, Y. Grossman, T. Milo, and P. Senellart. Crowd mining. In SIGMOD, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. M. L. Ba, T. Abdessalem, and P. Senellart. Uncertain version control in open collaborative editing of tree-structured documents. In DocEng, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. J.-F. Baget, M. Leclère, M.-L. Mugnier, and E. Salvat. On rules with existential variables: Walking the decidability line. Artificial Intelligence, 175(9), 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. V. Bárány, B. ten Cate, B. Kimelfeld, D. Olteanu, and Z. Vagena. Declarative statistical modeling with Datalog. CoRR, abs/1412.2221, 2015.Google ScholarGoogle Scholar
  13. M. Benedikt, E. Kharlamov, D. Olteanu, and P. Senellart. Probabilistic XML via Markov chains. PVLDB, 3(1--2), 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. G. Brightwell and P. Winkler. Counting linear extensions. Order, 8(3), 1991.Google ScholarGoogle Scholar
  15. A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E. R. H. Jr., and T. M. Mitchell. Toward an architecture for never-ending language learning. In AAAI, 2010.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. S. Cohen, B. Kimelfeld, and Y. Sagiv. Incorporating constraints in probabilistic XML. TODS, 34(3), 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. S. Cohen, B. Kimelfeld, and Y. Sagiv. Running tree automata on probabilistic XML. In PODS, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. B. Courcelle. Graph rewriting: An algebraic and logic approach. In Handbook of Theoretical Computer Science. Elsevier, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. N. Dalvi and D. Suciu. Efficient query evaluation on probabilistic databases. VLDBJ, 16(4), 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. N. Dalvi and D. Suciu. The dichotomy of probabilistic inference for unions of conjunctive queries. JACM, 59(6), 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. D. Deutch, T. Milo, S. Roy, and V. Tannen. Circuits for Datalog provenance. In ICDT, 2014.Google ScholarGoogle Scholar
  22. X. Dong, A. Y. Halevy, and C. Yu. Data integration with uncertainty. In VLDB, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. L. Galárraga, C. Teflioudi, K. Hose, and F. M. Suchanek. AMIE string: Association rule mining under incomplete evidence in ontological knowledge bases. In WWW, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. G. Gottlob, N. Leone, and F. Scarcello. Hypertree decompositions: A survey. In MFCS. 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. G. Gottlob, T. Lukasiewicz, and G. I. Simari. Conjunctive query answering in probabilistic Datalog+- ontologies. In Web Reasoning and Rule Systems. Springer, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. G. Gottlob, R. Pichler, and F. Wei. Monadic Datalog over finite structures of bounded treewidth. TOCL, 12(1), 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. E. Grädel. Efficient evaluation methods for guarded logics and Datalog LITE. In LPAR, 2000.Google ScholarGoogle ScholarCross RefCross Ref
  28. T. J. Green, G. Karvounarakis, and V. Tannen. Provenance semirings. In PODS, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. T. J. Green and V. Tannen. Models for incomplete and probabilistic information. In IIDB, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. J. Henriksen, J. Jensen, M. Jørgensen, N. Klarlund, B. Paige, T. Rauhe, and A. Sandholm. Mona: Monadic second-order logic in practice. In TACAS, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. J. Huang, L. Antova, C. Koch, and D. Olteanu. MayBMS: a probabilistic database management system. In SIGMOD, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. T. Imielinski and W. Lipski, Jr. Incomplete information in relational databases. JACM, 31(4), 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. A. Jha and D. Suciu. Probabilistic databases with MarkoViews. PVLDB, 5(11), 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. B. Kimelfeld, Y. Kosharovsky, and Y. Sagiv. Query efficiency in probabilistic XML models. In SIGMOD, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. B. Kimelfeld and P. Senellart. Probabilistic XML\string: Models and complexity. In Z. Ma and L. Yan, editors, Advances in Probabilistic Databases for Uncertain Information Management. Springer, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  36. L. V. S. Lakshmanan, N. Leone, R. B. Ross, and V. S. Subrahmanian. ProbView: A flexible probabilistic database system. TODS, 22(3), 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. J. Roy. Stat. Soc., Ser. B, 1988.Google ScholarGoogle Scholar
  38. S. Maniu, R. Cheng, and P. Senellart. ProbTree\string: A query-efficient representation of probabilistic graphs. In BUDA, June 2014. Workshop without formal proceedings.Google ScholarGoogle Scholar
  39. A. Parameswaran, H. Garcia-Molina, H. Park, N. Polyzotis, A. Ramesh, and J. Widom. Crowdscreen: Algorithms for filtering data with humans. In SIGMOD, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. L. D. Raedt, A. Kimmig, and H. Toivonen. Problog: A probabilistic Prolog and its application in link discovery. In IJCAI, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. M. Richardson and P. Domingos. Markov logic networks. Machine learning, 62(1--2), 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. N. Robertson and P. D. Seymour. Graph minors. II. Algorithmic aspects of tree-width. J. Algorithms, 7(3), 1986.Google ScholarGoogle Scholar
  43. K. Stefanidis, G. Koutrika, and E. Pitoura. A survey on representation, composition and application of preferences in database systems. ACM Transactions on Database Systems (TODS), 36(3), 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. R. Tang, R. Cheng, H. Wu, and S. Bressan. A framework for conditioning uncertain relational data. In Database and Expert Systems Applications. Springer, 2012.Google ScholarGoogle ScholarCross RefCross Ref
  45. J. W. Thatcher and J. B. Wright. Generalized finite automata theory with an application to a decision problem of second-order logic. Math. systems theory, 2(1), 1968.Google ScholarGoogle Scholar
  46. D. Vrandečić and M. Krötzsch. Wikidata: a free collaborative knowledgebase. CACM, 57(10), 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Structurally Tractable Uncertain Data

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGMOD '15 PhD Symposium: Proceedings of the 2015 ACM SIGMOD on PhD Symposium
      May 2015
      62 pages
      ISBN:9781450335294
      DOI:10.1145/2744680

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 31 May 2015

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • short-paper

      Acceptance Rates

      SIGMOD '15 PhD Symposium Paper Acceptance Rate9of11submissions,82%Overall Acceptance Rate40of60submissions,67%
    • Article Metrics

      • Downloads (Last 12 months)2
      • Downloads (Last 6 weeks)1

      Other Metrics

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader