|
ABSTRACT
Modern search engines answer keyword-based queries extremely efficiently. The impressive speed is due to clever inverted index structures, caching, a domain-independent knowledge of strings, and thousands of machines. Several research efforts have attempted to generalize keyword search to keytree and keygraph searching, because trees and graphs have many applications in next-generation database systems. This paper surveys both algorithms and applications, giving some emphasis to our own work.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
| |
3
|
S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. L. Wiener. The Lorel query language for semistructured data. International Journal on Digital Libraries, 1(1):68-88,1997.
|
| |
4
|
|
| |
5
|
|
| |
6
|
|
| |
7
|
|
 |
8
|
Sihem Amer-Yahia , SungRan Cho , Laks V. S. Lakshmanan , Divesh Srivastava, Minimization of tree pattern queries, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.497-508, May 21-24, 2001, Santa Barbara, California, United States
|
| |
9
|
|
 |
10
|
|
 |
11
|
|
| |
12
|
|
| |
13
|
D. Barbosa, A. Barta, A. O. Mendelzon, G. A. Mihaila, F. Rizzolo, and P. Rodriguez-Guianolli. ToX - The Toronto XML engine. In Proceedings of the Workshop on Information Integration on the Web, pages 66-73, 2001.
|
| |
14
|
H. Barrow and R. M. Burstall. Subgraph isomorphism, matching relational structures and maximal cliques. Information Processing Letters, 4:83-84, 1976.
|
| |
15
|
S. Boag, D. Chamberlin, M. F. Fernandez, D. Florescu, J. Robie, J. Simeon, and M. Stefanescu. XQuery 1.0: An XML Query Language; available at http://www.w3.org/TR/xquery/.
|
 |
16
|
|
 |
17
|
Peter Buneman , Susan Davidson , Gerd Hillebrand , Dan Suciu, A query language and optimization techniques for unstructured data, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.505-516, June 04-06, 1996, Montreal, Quebec, Canada
|
 |
18
|
Peter Buneman , Wenfei Fan , Scott Weinstein, Path constraints on semistructured and structured data, Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.129-138, June 01-04, 1998, Seattle, Washington, United States
[doi> 10.1145/275487.275502]
|
| |
19
|
|
| |
20
|
|
| |
21
|
H. Bunke and G. Allermann. Inexact graph matching for structural pattern recognition. Pattern Recognition Letters, 1(4):245-253, 1983.
|
| |
22
|
|
 |
23
|
|
 |
24
|
Sudarshan S. Chawathe , Anand Rajaraman , Hector Garcia-Molina , Jennifer Widom, Change detection in hierarchically structured information, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.493-504, June 04-06, 1996, Montreal, Quebec, Canada
|
| |
25
|
Zhiyuan Chen , H. V. Jagadish , Flip Korn , Nick Koudas , S. Muthukrishnan , Raymond T. Ng , Divesh Srivastava, Counting Twig Matches in a Tree, Proceedings of the 17th International Conference on Data Engineering, p.595-604, April 02-06, 2001
|
 |
26
|
Zhiyuan Chen , Nick Koudas , Flip Korn , S. Muthukrishnan, Selectively estimation for Boolean queries, Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.216-225, May 15-18, 2000, Dallas, Texas, United States
[doi> 10.1145/335168.335225]
|
| |
27
|
|
| |
28
|
Richard Cole , Ramesh Hariharan , Piotr Indyk, Tree pattern matching and subset matching in deterministic O(n log3 n)-time, Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, p.245-254, January 17-19, 1999, Baltimore, Maryland, United States
|
 |
29
|
|
| |
30
|
D. J. Cook and L. B. Holder. Substructure discovery using minimum description length and background knowledge. Journal of Artificial Intelligence Research, 1:231-255, 1994.
|
| |
31
|
|
 |
32
|
|
| |
33
|
L. Dehaspe, H. Toivonen, and R. D. King. Finding frequent substructures in chemical compounds. In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining, pages 30-36, 1998.
|
| |
34
|
Alin Deutsch , Mary Fernandez , Daniela Florescu , Alon Levy , Dan Suciu, A query language for XML, Computer Networks: The International Journal of Computer and Telecommunications Networking, v.31 n.11-16, p.1155-1169, May 17, 1999
|
| |
35
|
|
| |
36
|
M. A. Eshera and K.-S. Fu. A graph distance measure for image analysis. IEEE Transactions on Systems, Man, and Cybernetics, 14(3):353-363, 1984.
|
| |
37
|
J. Feng, M. Laumy, and M. Dhome. Inexact matching using neural networks. In E. Gelsema and L. N. Kanal, editors, Pattern Recognition in Practice IV: Multiple Paradigms, Comparative Studies and Hybrid Systems. North-Holland, 1994.
|
 |
38
|
Mary Fernández , Daniela Florescu , Jaewoo Kang , Alon Levy , Dan Suciu, Catching the boat with Strudel: experiences with a Web-site management system, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.414-425, June 01-04, 1998, Seattle, Washington, United States
|
| |
39
|
|
| |
40
|
|
| |
41
|
|
| |
42
|
|
| |
43
|
|
| |
44
|
|
| |
45
|
|
| |
46
|
|
 |
47
|
Marc Gyssens , Jan Paredaens , Dirk van Gucht, A graph-oriented object database model, Proceedings of the ninth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.417-424, April 02-04, 1990, Nashville, Tennessee, United States
[doi> 10.1145/298514.298593]
|
| |
48
|
|
 |
49
|
|
| |
50
|
|
| |
51
|
J. Jaakkola and P. Kilpelainen, Using sgrep for querying structured text files. University of Helsinki, Department of Computer Science, Report C-1996-83, November 1996; available at http://www.cs.helsinki. fi/u/jjaakkol/sgrep.html.
|
| |
52
|
|
 |
53
|
H. V. Jagadish , Laks V. S. Lakshmanan , Tova Milo , Divesh Srivastava , Dimitra Vista, Querying network directories, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.133-144, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
 |
54
|
H. V. Jagadish , Raymond T. Ng , Divesh Srivastava, Substring selectivity estimation, Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.249-260, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
[doi> 10.1145/303976.304001]
|
| |
55
|
C. A. James, D. Weininger, and J. Delany. Daylight Theory Manual-Daylight 4.71. Daylight Chemical Information Systems, www.daylight.com, 2000.
|
 |
56
|
|
| |
57
|
P. Kilpelainen. Tree matching problems with applications to structured text databases. Report A-1992-6, University of Helsinki, Finland, 1992.
|
 |
58
|
|
| |
59
|
|
| |
60
|
|
 |
61
|
P. Krishnan , Jeffrey Scott Vitter , Bala Iyer, Estimating alphanumeric selectivity in the presence of wildcards, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.282-293, June 04-06, 1996, Montreal, Quebec, Canada
|
| |
62
|
Theodore W. Leung , Gail Mitchell , Bharathi Subramanian , Bennet Vance , Scott L. Vandenberg , Stanley B. Zdonik, The AQUA Data Model and Algebra, Proceedings of the Fourth International Workshop on Database Programming Languages - Object Models and Languages, p.157-175, August 30-September 01, 1993
|
| |
63
|
G. Levi. A note on the derivation of maximal common subgraphs of two directed or undirected graphs. Calcols, 9:341-354, 1972.
|
| |
64
|
|
| |
65
|
|
| |
66
|
|
| |
67
|
|
| |
68
|
|
| |
69
|
|
| |
70
|
A. O. Mendelzon, G. A. Mihaila, and T. Milo. Querying the World Wide Web. International Journal on Digital Libraries, 1(1):54-67, 1997.
|
| |
71
|
|
 |
72
|
|
| |
73
|
|
| |
74
|
|
| |
75
|
|
| |
76
|
National Cancer Institute. http://www.nci.nih.gov/.
|
| |
77
|
|
 |
78
|
|
| |
79
|
|
| |
80
|
W. H. Piel, M. J. Donoghue, and M. J. Sanderson. TreeBASE: A database of phylogenetic information. In Proceedings of the 2nd International Workshop of Species 2000, Tsukuba, Japan, 2000.
|
 |
81
|
|
| |
82
|
A. Sanfeliu and K. Fu. A distance measure between attributed relational graphs for pattern recognition. IEEE Transactions on Systems, Man, and Cybernetics, 13(3):353-362, 1983.
|
| |
83
|
T. Schlieder and F. Naumann, Approximate tree embedding for querying XML data. In Proceedings of the ACM SIGIR Workshop on XML and Information Retrieval, Athens, Greece, July 2000.
|
| |
84
|
D. Shasha, J. T. L. Wang, H. Shan, and K. Zhang. ATreeGrep: Approximate searching in unordered trees. Submitted, 2002.
|
 |
85
|
|
| |
86
|
|
| |
87
|
|
 |
88
|
|
| |
89
|
W. H. Tsai and K. S. Fu. Error-correcting isomorphism of attributed relational graphs for pattern analysis. IEEE Transactions on Systems, Man, and Cybernetics, 9:757-768, 1979.
|
 |
90
|
|
| |
91
|
|
 |
92
|
|
| |
93
|
|
| |
94
|
|
| |
95
|
J. T. L. Wang, B. A. Shapiro, D. Shasha, K. Zhang, and C.-Y. Chang. Automated discovery of active motifs in multiple RNA secondary structures. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, pages 70-75, 1996.
|
 |
96
|
Jason Tsong-Li Wang , Dennis Shasha , George J. S. Chang , Liam Relihan , Kaizhong Zhang , Girish Patel, Structural matching and discovery in document databases, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.560-563, May 11-15, 1997, Tucson, Arizona, United States
|
| |
97
|
J. T. L. Wang, K. Zhang, G. Chang, and D. Shasha. Finding approximate patterns in undirected acyclic graphs. Pattern Recognition, 35(2):473-483, 2002.
|
| |
98
|
J. T. L. Wang, K. Zhang, and G.-W. Chirn. The approximate graph matching problem. In Proceedings of the International Conference on Pattern Recognition, volume 2, pages 284-288, 1994.
|
| |
99
|
|
| |
100
|
X. Wang, J. T. L. Wang, D. Shasha, B. A. Shapiro, S. Dikshitulu, I. Rigoutsos, and K. Zhang. Automated discovery of active motifs in three dimensional molecules. In Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining, pages 89-95, 1997.
|
| |
101
|
|
| |
102
|
A. Wong and M. You. Entropy and distance of random graphs with application to structural pattern recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 7(5):599-609, 1985.
|
 |
103
|
|
| |
104
|
XML TreeDiff. http://www.alphaworks.ibm.com/aw.nsf/FAQs/xmltreediff.
|
 |
105
|
|
| |
106
|
|
| |
107
|
|
CITED BY 18
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Luciana S. Buriol , Gereon Frahling , Stefano Leonardi , Alberto Marchetti-Spaccamela , Christian Sohler, Counting triangles in data streams, Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 26-28, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Bingjun Sun , Qingzhao Tan , Prasenjit Mitra , C. Lee Giles, Extraction and search of chemical formulae in text documents on the web, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
|
|
|
|
|
|
|