|
ABSTRACT
We present an algorithm for mining tree-shaped patterns in a large graph. Novel about our class of patterns is that they can contain constants, and can contain existential nodes which are not counted when determining the number of occurrences of the pattern in the graph. Our algorithm has a number of provable optimality properties, which are based on the theory of conjunctive database queries. We propose a database-oriented implementation in SQL, and report upon some initial experimental results obtained with our implementation on graph data about food webs, about protein interactions, and about citation analysis.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
| |
3
|
|
 |
4
|
|
| |
5
|
|
| |
6
|
|
| |
7
|
D.J. Cook and L.B. Holder. Substructure discovery using minimum description length and background knowledge. Journal of Artificial Intelligence Research, 1:231--255, 1994.
|
| |
8
|
|
| |
9
|
|
| |
10
|
M. Kuramochi and G. Karypis. Finding frequent patterns in a large sparse graph. In M.W. Berry, U.Dayal, C.Kamath, and D.B. Skillicorn, editors, Proceedings of the Fourth SIAM International Conference on Data Mining. SIAM, 2004.
|
 |
11
|
|
 |
12
|
|
| |
13
|
|
| |
14
|
|
| |
15
|
H.I. Scions. Placing trees in lexicographic order. In D. Michie, editor, Machine Intelligence 3, pages 43--62. Edinburgh University Press, 1968.
|
| |
16
|
|
| |
17
|
|
 |
18
|
Dick Tsur , Jeffrey D. Ullman , Serge Abiteboul , Chris Clifton , Rajeev Motwani , Svetlozar Nestorov , Arnon Rosenthal, Query flocks: a generalization of association-rule mining, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.1-12, June 01-04, 1998, Seattle, Washington, United States
|
| |
19
|
|
| |
20
|
S. Chakravarthy, R. Beera, and R. Balachandran. DB-Subdue: Database approach to graph mining. In H. Dai, R. Srikant, and C. Zhang, editors, Advances in Knowledge Discovery and Data Mining, Proceedings 8th PAKDD Conference, volume 3056 of Lecture Notes in Computer Science, pages 341--350. Springer, 2004.
|
| |
21
|
Rakesh Agrawal , Hiekki Mannila , Ramakrishnan Srikant , Hannu Toivonen , A. Inkeri Verkamo, Fast discovery of association rules, Advances in knowledge discovery and data mining, American Association for Artificial Intelligence, Menlo Park, CA, 1996
|
| |
22
|
|
| |
23
|
J. Memmott, N.D. Martinez, and J.E. Cohen. Predators, parasites and pathogens: species richness, trophic generality, and body sizes in a natural food web. Journal of Animal Ecology, 69:1--15, 2000.
|
| |
24
|
H. Jeong, S.P. Mason, et al. Lethality and centrality in protein networks. Nature, 411(3 May 2001).
|
| |
25
|
M. Newman. The structure and function of complex networks. SIAM Review, 45(2):167--256, 2003.
|
| |
26
|
R. Kumar , P. Raghavan , S. Rajagopalan , D. Sivakumar , A. Tomkins , E. Upfal, Stochastic models for the Web graph, Proceedings of the 41st Annual Symposium on Foundations of Computer Science, p.57, November 12-14, 2000
|
| |
27
|
|
| |
28
|
Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), 9-12 December 2002, Maebashi City, Japan. IEEE Computer Society Press, 2002.
|
| |
29
|
Nick Cercone , Tsau Y. Lin , Xindong Wu, Proceedings: 2001 IEEE International Conference on Data Mining, 29 November - 2 December 2001, San Jose, California, IEEE Computer Society Press, Los Alamitos, CA, 2001
|
|