DOI: 10.1145/2494266.2494277
research-article

Uncertain version control in open collaborative editing of tree-structured documents

Published: 10 September 2013

ABSTRACT

To ease content enrichment, exchange, and sharing, web-scale collaborative platforms such as Wikipedia or Google Docs enable unbounded interactions between a large number of contributors, without prior knowledge of their level of expertise and reliability. Version control is then essential for keeping track of the evolution of the shared content and its provenance. In such environments, uncertainty is ubiquitous because sources are unreliable, contributions are incomplete and imprecise, and malicious editing and vandalism are possible. To handle this uncertainty, we use a probabilistic XML model as a basic component of our version control framework. Each version of a shared document is represented by an XML tree, and the whole document, together with its different versions, is modeled as a probabilistic XML document. Uncertainty is evaluated using the probabilistic model and the reliability measure associated with each source, each contributor, or each editing event, resulting in an uncertainty measure on each version and each part of the document. We show that standard version control operations can be implemented directly as operations on the probabilistic XML model; efficiency with respect to deterministic version control systems is demonstrated on real-world datasets.
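
The idea of a single probabilistic tree that encodes several versions can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each node is guarded by a conjunction of literals over Boolean editing events, that events are independent, and that each event's probability equals the reliability of its contributor. The names `Node`, `holds`, `project`, and `node_probability`, the event names `e1` and `e2`, and the numeric reliabilities are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Node:
    """An XML-like tree node guarded by a conjunction of event literals."""
    label: str
    # Each literal is (event_name, required_truth_value); empty means always present.
    condition: tuple = ()
    children: list = field(default_factory=list)


def holds(condition, world):
    """True if every literal of the condition is satisfied in the given world."""
    return all(world[event] == value for event, value in condition)


def project(node, world):
    """Return the deterministic tree of one world as nested (label, children) tuples."""
    if not holds(node.condition, world):
        return None
    kids = [project(child, world) for child in node.children]
    return (node.label, tuple(k for k in kids if k is not None))


def node_probability(path_conditions, reliability):
    """Probability that all conditions on a root-to-node path hold.

    Assumes independent events and enumerates every world, which is
    exponential in the number of events and only suitable for tiny examples.
    """
    events = sorted(reliability)
    total = 0.0
    for values in product([True, False], repeat=len(events)):
        world = dict(zip(events, values))
        weight = 1.0
        for event, value in world.items():
            weight *= reliability[event] if value else 1.0 - reliability[event]
        if all(holds(cond, world) for cond in path_conditions):
            total += weight
    return total


# Hypothetical reliabilities of two editing events.
reliability = {"e1": 0.9, "e2": 0.6}

# "title" was inserted by e1; "section" was inserted by e1 and deleted by e2.
doc = Node("article", children=[
    Node("title", condition=(("e1", True),)),
    Node("section", condition=(("e1", True), ("e2", False))),
])

# The version in which both edits are accepted.
version = project(doc, {"e1": True, "e2": True})

# Probability that "section" is present in a randomly drawn version.
p_section = node_probability([doc.condition, doc.children[1].condition], reliability)
```

In this sketch, selecting a version amounts to choosing a truth assignment for the events and projecting the tree, while the uncertainty of a document part is the total weight of the assignments in which that part survives.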


      Published in

        DocEng '13: Proceedings of the 2013 ACM symposium on Document engineering
        September 2013
        582 pages
        ISBN: 9781450317894
        DOI: 10.1145/2494266

        Copyright © 2013 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        DocEng '13 paper acceptance rate: 16 of 50 submissions, 32%. Overall acceptance rate: 178 of 537 submissions, 33%.