|
ABSTRACT
XML is the de facto standard for data representation and exchange over the Web. Given the diversity of the information available in XML, it is very useful to annotate XML data with a wide variety of meta-data, such as quality and sensitivity. When querying such XML data, say using XPath, it is important to efficiently identify the data that meet specified constraints on the meta-data. For example, different users may be satisfied with different levels of quality guarantees, or may only have access to different parts of the XML data based on specified security policies. In this paper, we address the problem of efficiently identifying the XML elements along a location step in an XPath query, that satisfy meta-data range constraints, when the meta-data levels are specifically drawn from an ordered domain (e.g., accuracy in [0,1], recency using timestamps, multi-level security, etc.). More specifically, we develop a family of index structures, which we refer to as meta-data indexes, to address this problem. A meta-data index is easily instantiated using a multi-dimensional index structure, such as an R-tree, incorporating novel query and update algorithms. We show that the full meta-data index (FMI), based on associating each XML element with its meta-data level, has a very high update cost for modifying an element's meta-data level. We resolve this problem by designing the inheritance meta-data index (IMI), in which (i) actual meta-data levels are associated only with elements for which this value is explicitly specified, and (ii) inherited meta-data levels and inheritance source nodes are associated with non-leaf nodes of the index structure. We design efficient query (for all XPath axes) and update (of meta-data levels) algorithms for the IMI, and experimentally demonstrate the superiority of the IMI over the FMI using benchmark data sets.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
A. Berglund, S. Boag, D. Chamberlin, M. F. Fernandez, M. Kay, J. Robie, and J. Simeon. XML path language (XPath) 2.0. W3C Working Draft. Available from http://www.w3.org/TR/xpath20/.
|
| |
3
|
|
| |
4
|
D. Bhagwat, L. Chiticariu, W. C. Tan, and G. Vijayvargiya. An annotation management system for relational databases. In Proc. of VLDB, 2004.
|
 |
5
|
|
| |
6
|
|
| |
7
|
S. Cho, S. Amer-Yahia, L.V.S. Lakshmanan, and D. Srivastava. Optimizing the secure evaluation of twig queries. In Proc. of VLDB, 2002.
|
| |
8
|
|
| |
9
|
|
| |
10
|
|
| |
11
|
Lois M. L. Delcambre , David Maier , Shawn Bowers , Mathew Weaver , Longxing Deng , Paul Gorman , Joan Ash , Mary Lavelle , Jason Lyman, Bundles in Captivity: An Application of Superimposed Information, Proceedings of the 17th International Conference on Data Engineering, p.111-120, April 02-06, 2001
|
 |
12
|
|
 |
13
|
|
 |
14
|
|
| |
15
|
H. V. Jagadish , S. Al-Khalifa , A. Chapman , L. V. S. Lakshmanan , A. Nierman , S. Paparizos , J. M. Patel , D. Srivastava , N. Wiwatwattana , Y. Wu , C. Yu, TIMBER: A native XML database, The VLDB Journal — The International Journal on Very Large Data Bases, v.11 n.4, p.274-291, December 2002
[doi> 10.1007/s00778-002-0081-x]
|
 |
16
|
|
| |
17
|
|
| |
18
|
G. Mihaila, L. Raschid, and M.-E. Vidal. Querying "quality of data" metadata. In Proc. of IEEE META-DATA Conference, 1999.
|
 |
19
|
|
| |
20
|
K.V. Ravikanth, D. Agrawal, A. El-Abbadi, A.K. Singh, and T. Smith. Indexing hierarchical data. Technical Report, UCSB, CS-Tr-9514, 1995.
|
| |
21
|
|
| |
22
|
H. Wang, S. Park, W. Fan and P. S. Yu. ViST: A dynamic index method for querying XML data by tree structures. In Proc. of VLDB, 2003.
|
 |
23
|
|
| |
24
|
J. Widom. Trio: A system for integrated management of data, accuracy, and lineage. In Proc. of CIDR, 2005.
|
|