|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ABSTRACT
Recently, due to its wide applications, subgraph search has attracted a lot of attention from database and data mining community. Sub-graph search is defined as follows: given a query graph Q, we report all data graphs containing Q in the database. However, there is little work about sub-graph search in a single large graph, which has been used in many applications, such as biological network and social network. In this paper, we address top-k sub-graph matching query problem, which is defined as follows: given a query graph Q, we locate top-k matchings of Q in a large data graph G according to a score function. The score function is defined as the sum of the pairwise similarity between a vertex in Q and its matching vertex in G. Specifically, we first design a balanced tree (that is G-Tree) to index the large data graph. Then, based on G-Tree, we propose an efficient query algorithm (that is Ranked Matching algorithm). Our extensive experiment results show that, due to efficiency of pruning strategy, given a query with up to 20 vertices, we can locate the top-100 matchings in less than 10 seconds in a large data graph with 100K vertices. Furthermore, our approach outperforms the alternative method by orders of magnitude. REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
INDEX TERMS
Primary Classification:
General Terms:
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||