Exploit sequencing to accelerate hot XML query pattern mining
Source: Symposium on Applied Computing
Proceedings of the 2006 ACM Symposium on Applied Computing
Dijon, France
Session: Data mining (DM)
Pages: 517-524
Year of Publication: 2006
ISBN:1-59593-108-2
Authors
Jianhua Feng  Tsinghua University, Beijing, China
Qian Qian  Tsinghua University, Beijing, China
Jianyong Wang  Tsinghua University, Beijing, China
Lizhu Zhou  Tsinghua University, Beijing, China
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1141277.1141400

ABSTRACT

Speeding up query evaluation in large XML repositories has become a challenging and important problem as XML-related applications proliferate. Once hot XML query patterns have been discovered, indexing and caching can be applied effectively to improve query performance. Previous algorithms for finding hot query patterns essentially follow a straightforward generate-and-test strategy. In this paper, we present SOLARIA, an efficient algorithm for mining hot XML query patterns without candidate maintenance or costly tree-containment checking. It uses an efficient sequence mining algorithm to discover frequent tree-structured patterns, replacing expensive containment testing with cheap parent-child checking in sequences. SOLARIA prunes the unrelated search space during frequent pattern enumeration by means of a parent-child relationship constraint. Motivated by the use of indexing and caching in XML query optimization, we also propose a derived variant of SOLARIA for mining hot "closed" XML query patterns, which provide compact and complete structural information. Through a thorough experimental study on various real-life data, we demonstrate the efficiency and scalability of SOLARIA over the previously known alternative. SOLARIA also scales linearly with the size of the XML queries.
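The idea of trading tree-containment tests for parent-child checks on sequences can be illustrated with a small sketch. This is not the SOLARIA algorithm itself, only a toy under stated assumptions: each query is a hypothetical `(label, [children])` tuple, it is flattened into a pre-order sequence of `(label, parent_index)` pairs, and only the simplest patterns (single parent-child edges) are counted. The function names `encode`, `edge_support`, and `hot_edges` are invented for this illustration.

```python
from collections import Counter


def encode(tree, parent=-1, out=None):
    """Flatten a (label, [children]) tree into a pre-order list of
    (label, parent_index) pairs; the root gets parent_index -1."""
    if out is None:
        out = []
    label, children = tree
    idx = len(out)
    out.append((label, parent))
    for child in children:
        encode(child, idx, out)
    return out


def edge_support(queries):
    """For each (parent_label, child_label) pair, count how many
    queries contain it. Each pair is found by one parent-index lookup
    in the sequence, with no tree-containment test."""
    support = Counter()
    for query in queries:
        seq = encode(query)
        # A set, so an edge repeated inside one query counts once.
        edges = {(seq[p][0], label) for label, p in seq if p >= 0}
        support.update(edges)
    return support


def hot_edges(queries, min_support):
    """Return the edges that occur in at least min_support queries."""
    return {e for e, n in edge_support(queries).items() if n >= min_support}


# Hypothetical query trees, assumed for illustration only.
queries = [
    ("book", [("title", []), ("author", [("name", [])])]),
    ("book", [("title", []), ("price", [])]),
    ("book", [("author", [("name", [])])]),
]
hot = hot_edges(queries, min_support=2)
```

With these three assumed queries and a minimum support of 2, the edges book/title, book/author, and author/name qualify as hot, while book/price does not. A full miner would grow larger patterns from such edges; that step is beyond what the abstract specifies and is not attempted here.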



