short-paper

Structurally Tractable Uncertain Data

Author:
Antoine Amarilli

Institut Mines-Télécom; Télécom ParisTech; CNRS LTCI, Paris, France

Institut Mines-Télécom; Télécom ParisTech; CNRS LTCI, Paris, France
View Profile

SIGMOD '15 PhD Symposium: Proceedings of the 2015 ACM SIGMOD on PhD SymposiumMay 2015Pages 39–44https://doi.org/10.1145/2744680.2744690

Published:31 May 2015Publication History

SIGMOD '15 PhD Symposium: Proceedings of the 2015 ACM SIGMOD on PhD Symposium

Pages 39–44

ABSTRACT

Many data management applications must deal with data which is uncertain, incomplete, or noisy. However, on existing uncertain data representations, we cannot tractably perform the important query evaluation tasks of determining query possibility, certainty, or probability: these problems are hard on arbitrary uncertain input instances. We thus ask whether we could restrict the structure of uncertain data so as to guarantee the tractability of exact query evaluation. We present our tractability results for tree and tree-like uncertain data, and a vision for probabilistic rule reasoning. We also study uncertainty about order, proposing a suitable representation, and study uncertain data conditioned by additional observations.

References

S. Abiteboul, D. Deutch, and V. Vianu. Deduction with contradictions in Datalog. In ICDT, 2014.Google Scholar
S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995. Google ScholarDigital Library
R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In VLDB, 1994. Google ScholarDigital Library
A. Amarilli, Y. Amsterdamer, and T. Milo. On the complexity of mining itemsets from the crowd using taxonomies. In Proc. ICDT, 2014.Google Scholar
A. Amarilli, Y. Amsterdamer, and T. Milo. Uncertainty in crowd data sourcing under structural constraints. In Proc. UnCrowd, 2014.Google ScholarCross Ref
A. Amarilli, L. M. Ba, D. Deutch, and P. Senellart. Querying order-incomplete data. Preprint: http://a3nm.net/publications/amarilli2015querying.pdf.Google Scholar
A. Amarilli, P. Bourhis, and P. Senellart. Probabilities and provenance via tree decompositions. Preprint: http://a3nm.net/publications/amarilli2015probabilities.pdf, 2014.Google Scholar
A. Amarilli and P. Senellart. Unsaid string: Uncertainty and structure in the access to intensional data. Preprint: http://a3nm.net/publications/amarilli2014unsaid.pdf.Google Scholar
Y. Amsterdamer, Y. Grossman, T. Milo, and P. Senellart. Crowd mining. In SIGMOD, 2013. Google ScholarDigital Library
M. L. Ba, T. Abdessalem, and P. Senellart. Uncertain version control in open collaborative editing of tree-structured documents. In DocEng, 2013. Google ScholarDigital Library
J.-F. Baget, M. Leclère, M.-L. Mugnier, and E. Salvat. On rules with existential variables: Walking the decidability line. Artificial Intelligence, 175(9), 2011. Google ScholarDigital Library
V. Bárány, B. ten Cate, B. Kimelfeld, D. Olteanu, and Z. Vagena. Declarative statistical modeling with Datalog. CoRR, abs/1412.2221, 2015.Google Scholar
M. Benedikt, E. Kharlamov, D. Olteanu, and P. Senellart. Probabilistic XML via Markov chains. PVLDB, 3(1--2), 2010. Google ScholarDigital Library
G. Brightwell and P. Winkler. Counting linear extensions. Order, 8(3), 1991.Google Scholar
A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E. R. H. Jr., and T. M. Mitchell. Toward an architecture for never-ending language learning. In AAAI, 2010.Google ScholarDigital Library
S. Cohen, B. Kimelfeld, and Y. Sagiv. Incorporating constraints in probabilistic XML. TODS, 34(3), 2009. Google ScholarDigital Library
S. Cohen, B. Kimelfeld, and Y. Sagiv. Running tree automata on probabilistic XML. In PODS, 2009. Google ScholarDigital Library
B. Courcelle. Graph rewriting: An algebraic and logic approach. In Handbook of Theoretical Computer Science. Elsevier, 1990. Google ScholarDigital Library
N. Dalvi and D. Suciu. Efficient query evaluation on probabilistic databases. VLDBJ, 16(4), 2007. Google ScholarDigital Library
N. Dalvi and D. Suciu. The dichotomy of probabilistic inference for unions of conjunctive queries. JACM, 59(6), 2012. Google ScholarDigital Library
D. Deutch, T. Milo, S. Roy, and V. Tannen. Circuits for Datalog provenance. In ICDT, 2014.Google Scholar
X. Dong, A. Y. Halevy, and C. Yu. Data integration with uncertainty. In VLDB, 2007. Google ScholarDigital Library
L. Galárraga, C. Teflioudi, K. Hose, and F. M. Suchanek. AMIE string: Association rule mining under incomplete evidence in ontological knowledge bases. In WWW, 2013. Google ScholarDigital Library
G. Gottlob, N. Leone, and F. Scarcello. Hypertree decompositions: A survey. In MFCS. 2001. Google ScholarDigital Library
G. Gottlob, T. Lukasiewicz, and G. I. Simari. Conjunctive query answering in probabilistic Datalog+- ontologies. In Web Reasoning and Rule Systems. Springer, 2011. Google ScholarDigital Library
G. Gottlob, R. Pichler, and F. Wei. Monadic Datalog over finite structures of bounded treewidth. TOCL, 12(1), 2010. Google ScholarDigital Library
E. Grädel. Efficient evaluation methods for guarded logics and Datalog LITE. In LPAR, 2000.Google ScholarCross Ref
T. J. Green, G. Karvounarakis, and V. Tannen. Provenance semirings. In PODS, 2007. Google ScholarDigital Library
T. J. Green and V. Tannen. Models for incomplete and probabilistic information. In IIDB, 2006. Google ScholarDigital Library
J. Henriksen, J. Jensen, M. Jørgensen, N. Klarlund, B. Paige, T. Rauhe, and A. Sandholm. Mona: Monadic second-order logic in practice. In TACAS, 1995. Google ScholarDigital Library
J. Huang, L. Antova, C. Koch, and D. Olteanu. MayBMS: a probabilistic database management system. In SIGMOD, 2009. Google ScholarDigital Library
T. Imielinski and W. Lipski, Jr. Incomplete information in relational databases. JACM, 31(4), 1984. Google ScholarDigital Library
A. Jha and D. Suciu. Probabilistic databases with MarkoViews. PVLDB, 5(11), 2012. Google ScholarDigital Library
B. Kimelfeld, Y. Kosharovsky, and Y. Sagiv. Query efficiency in probabilistic XML models. In SIGMOD, 2008. Google ScholarDigital Library
B. Kimelfeld and P. Senellart. Probabilistic XML\string: Models and complexity. In Z. Ma and L. Yan, editors, Advances in Probabilistic Databases for Uncertain Information Management. Springer, 2013.Google ScholarCross Ref
L. V. S. Lakshmanan, N. Leone, R. B. Ross, and V. S. Subrahmanian. ProbView: A flexible probabilistic database system. TODS, 22(3), 1997. Google ScholarDigital Library
S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. J. Roy. Stat. Soc., Ser. B, 1988.Google Scholar
S. Maniu, R. Cheng, and P. Senellart. ProbTree\string: A query-efficient representation of probabilistic graphs. In BUDA, June 2014. Workshop without formal proceedings.Google Scholar
A. Parameswaran, H. Garcia-Molina, H. Park, N. Polyzotis, A. Ramesh, and J. Widom. Crowdscreen: Algorithms for filtering data with humans. In SIGMOD, 2012. Google ScholarDigital Library
L. D. Raedt, A. Kimmig, and H. Toivonen. Problog: A probabilistic Prolog and its application in link discovery. In IJCAI, 2007. Google ScholarDigital Library
M. Richardson and P. Domingos. Markov logic networks. Machine learning, 62(1--2), 2006. Google ScholarDigital Library
N. Robertson and P. D. Seymour. Graph minors. II. Algorithmic aspects of tree-width. J. Algorithms, 7(3), 1986.Google Scholar
K. Stefanidis, G. Koutrika, and E. Pitoura. A survey on representation, composition and application of preferences in database systems. ACM Transactions on Database Systems (TODS), 36(3), 2011. Google ScholarDigital Library
R. Tang, R. Cheng, H. Wu, and S. Bressan. A framework for conditioning uncertain relational data. In Database and Expert Systems Applications. Springer, 2012.Google ScholarCross Ref
J. W. Thatcher and J. B. Wright. Generalized finite automata theory with an application to a decision problem of second-order logic. Math. systems theory, 2(1), 1968.Google Scholar
D. Vrandečić and M. Krötzsch. Wikidata: a free collaborative knowledgebase. CACM, 57(10), 2014. Google ScholarDigital Library

Index Terms

Structurally Tractable Uncertain Data
1. Information systems
  1. Data management systems
    1. Database design and models

Recommendations

Tractable Lineages on Treelike Instances: Limits and Extensions
PODS '16: Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems

Query evaluation on probabilistic databases is generally intractable (#P-hard). Existing dichotomy results have identified which queries are tractable (or safe), and connected them to tractable lineages. In our previous work, using different tools, we ...
Read More
Finding Probabilistic k-Skyline Sets on Uncertain Data
CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management

Skyline is a set of points that are not dominated by any other point. Given uncertain objects, probabilistic skyline has been studied which computes objects with high probability of being skyline. While useful for selecting individual objects, it is not ...
Read More
ELCA evaluation for keyword search on probabilistic XML data

As probabilistic data management is becoming one of the main research focuses and keyword search is turning into a more popular query means, it is natural to think how to support keyword queries on probabilistic XML data. With regards to keyword query ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
SIGMOD '15 PhD Symposium: Proceedings of the 2015 ACM SIGMOD on PhD Symposium
May 2015
62 pages
ISBN:9781450335294
DOI:10.1145/2744680
Program Chairs:
Pablo Barcelo
University of Chile, Chile
,
Heng Tao Shen
The University of Queensland, Australia
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 31 May 2015
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
data
order
probabilistic
structure
tractability
treewidth
uncertain
Qualifiers
- short-paper
Conference

Acceptance Rates
SIGMOD '15 PhD Symposium Paper Acceptance Rate9of11submissions,82%Overall Acceptance Rate40of60submissions,67%
More
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 0
  Total Citations
  View Citations
- 101
  Total Downloads
- Downloads (Last 12 months)2
- Downloads (Last 6 weeks)1
Other Metrics
View Author Metrics
Cited By
This publication has not been cited yet

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Structurally Tractable Uncertain Data

SIGMOD '15 PhD Symposium: Proceedings of the 2015 ACM SIGMOD on PhD Symposium

ABSTRACT

References

Cited By

Index Terms

Recommendations

Tractable Lineages on Treelike Instances: Limits and Extensions

Finding Probabilistic k-Skyline Sets on Uncertain Data

ELCA evaluation for keyword search on probabilistic XML data

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Structurally Tractable Uncertain Data

SIGMOD '15 PhD Symposium: Proceedings of the 2015 ACM SIGMOD on PhD Symposium

ABSTRACT

References

Cited By

Index Terms

Recommendations

Tractable Lineages on Treelike Instances: Limits and Extensions

Finding Probabilistic k-Skyline Sets on Uncertain Data

ELCA evaluation for keyword search on probabilistic XML data

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media