ABSTRACT
SimRank is an effective and widely adopted measure to quantify the structural similarity between pairs of nodes in a graph. In this paper we study the problem of top-k SimRank-based similarity join, which finds k pairs of nodes with the largest SimRank values. To the best of our knowledge, this is the first attempt to address this problem. We propose a random-walk-based method to efficiently identify top-k pairs. Experiment results on real datasets show that our method significantly outperforms baseline approaches.
- W. Zheng, L. Zou, Y. Feng, L. Chen, and D. Zhao, "Efficient simrank-based similarity join over large graphs," PVLDB, vol. 6, no. 7, pp. 493--504, 2013. Google ScholarDigital Library
- D. Lizorkin, P. Velikhov, M. N. Grinev, and D. Turdakov, "Accuracy estimate and optimization techniques for simrank computation," VLDB J., vol. 19, no. 1, pp. 45--66, 2010. Google ScholarDigital Library
- W. Yu, X. Lin, and W. Zhang, "Towards efficient simrank computation on large networks," in ICDE, pp. 601--612, 2013. Google ScholarDigital Library
- G. Jeh and J. Widom, "Simrank: a measure of structural-context similarity," in KDD, pp. 538--543, 2002. Google ScholarDigital Library
- Y. Low and A. X. Zheng, "Fast top-k similarity queries via matrix compression," in CIKM, pp. 2070--2074, 2012. Google ScholarDigital Library
Index Terms
- Efficient top-K SimRank-based similarity join
Recommendations
Scalable similarity search for SimRank
SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of DataSimRank, proposed by Jeh and Widom, provides a good similarity score and has been successfully used in many of the above mentioned applications. While there are many algorithms proposed so far to compute SimRank, but unfortunately, none of them are ...
Efficient top-k simrank-based similarity join
SimRank is a popular and widely-adopted similarity measure to evaluate the similarity between nodes in a graph. It is time and space consuming to compute the SimRank similarities for all pairs of nodes, especially for large graphs. In real-world ...
Efficient SimRank-Based Similarity Join
Invited Paper from SIGMOD 2015, Invited Paper from PODS 2015, Regular Papers and Technical CorrespondenceGraphs have been widely used to model complex data in many real-world applications. Answering vertex join queries over large graphs is meaningful and interesting, which can benefit friend recommendation in social networks and link prediction, and so on. ...
Comments