skip to main content
10.1145/2872427.2883007acmotherconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article
Public Access

What Links Alice and Bob?: Matching and Ranking Semantic Patterns in Heterogeneous Networks

Published: 11 April 2016 Publication History

Abstract

An increasing number of applications are modeled and analyzed in network form, where nodes represent entities of interest and edges represent interactions or relationships between entities. Commonly, such relationship analysis tools assume homogeneity in both node type and edge type. Recent research has sought to redress the assumption of homogeneity and focused on mining heterogeneous information networks (HINs) where both nodes and edges can be of different types. Building on such efforts, in this work we articulate a novel approach for mining relationships across entities in such networks while accounting for user preference (prioritization) over relationship type and interestingness metric. We formalize the problem as a top-$k$ lightest paths problem, contextualized in a real-world communication network, and seek to find the $k$ most interesting path instances matching the preferred relationship type. Our solution, PROphetic HEuristic Algorithm for Path Searching (PRO-HEAPS), leverages a combination of novel graph preprocessing techniques, well designed heuristics and the venerable A* search algorithm. We run our algorithm on real-world large-scale graphs and show that our algorithm significantly outperforms a wide variety of baseline approaches with speedups as large as 100X. We also conduct a case study and demonstrate valuable applications of our algorithm.

References

[1]
T. Akiba, Y. Iwata, and Y. Yoshida. Fast exact shortest-path distance queries on large networks by pruned landmark labeling. In SIGMOD, pages 349--360. ACM, 2013.
[2]
B. Aleman-Meza, C. Halaschek-Wiener, S. S. Sahoo, A. Sheth, and I. B. Arpinar. Template based semantic similarity for security applications. In Intelligence and Security Informatics, pages 621--622. Springer, 2005.
[3]
N. Alon, R. Yuster, and U. Zwick. Color-coding. Journal of the ACM (JACM), 42(4):844--856, 1995.
[4]
Y. Bai, C. Wang, X. Ying, M. Wang, and Y. Gong. Path pattern query processing on large graphs. In IEEE BdCloud. IEEE, 2014.
[5]
G. Bhalotia, A. Hulgeri, C. Nakhe, S. Chakrabarti, and S. Sudarshan. Keyword searching and browsing in databases using banks. In ICDE. IEEE, 2002.
[6]
D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent dirichlet allocation. Journal of Machine Learning Research, 3:993--1022, 2003.
[7]
T. Coffman, S. Greenblatt, and S. Marcus. Graph-based technologies for intelligence analysis. Communications of the ACM, 47(3):45--47, 2004.
[8]
D. J. Cook and L. B. Holder. Substructure discovery using minimum description length and background knowledge. Journal of Artificial Intelligence Research, pages 231--255, 1994.
[9]
C. Faloutsos, K. S. McCurley, and A. Tomkins. Fast discovery of connection subgraphs. In SIGKDD, pages 118--127. ACM, 2004.
[10]
L. Fang, A. D. Sarma, C. Yu, and P. Bohannon. Rex: explaining relationships between entity pairs. Proceedings of the VLDB Endowment, 5(3), 2011.
[11]
B. Gallagher. The state of the art in graph-based pattern matching. United States. Department of Energy, 2006.
[12]
M. R. Garey and D. S. Johnson. Computers and intractability, volume 29. wh freeman, 2002.
[13]
R. Giugno and D. Shasha. Graphgrep: A fast and universal method for querying graphs. In Pattern Recognition, 2002. Proceedings. 16th International Conference on, volume 2, pages 112--115. IEEE, 2002.
[14]
E. Hadjiconstantinou and N. Christofides. An efficient implementation of an algorithm for finding k shortest simple paths. Networks, (34.2):88--101, 1999.
[15]
J. Hershberger, M. Maxel, and S. Suri. Finding the k shortest simple paths: A new algorithm and its implementation. ACM Transactions on Algorithms (TALG), 3(4):45, 2007.
[16]
V. Kacholia, S. Pandit, S. Chakrabarti, S. Sudarshan, R. Desai, and H. Karambelkar. Bidirectional expansion for keyword search on graph databases. In VLDB, pages 505--516, 2005.
[17]
N. Katoh, I. Toshihide, and M. Hisashi. An efficient algorithm for k shortest simple paths. Networks, (12.4):411--427, 1982.
[18]
A. Khan, N. Li, X. Yan, Z. Guan, S. Chakraborty, and S. Tao. Neighborhood based fast graph search in large networks. In SIGMOD, pages 901--912. ACM, 2011.
[19]
A. Khan, Y. Wu, C. C. Aggarwal, and X. Yan. Nema: Fast graph search with label similarity. In VLDB, volume 6, pages 181--192. VLDB Endowment, 2013.
[20]
N. Lao and W. W. Cohen. Relational retrieval using a combination of path-constrained random walks. Machine learning, 81(1):53--67, 2010.
[21]
K. Liu, K. Das, T. Grandison, and H. Kargupta. Privacy-preserving data analysis on graphs and social networks, 2008.
[22]
C. Meng, R. Cheng, S. Maniu, P. Senellart, and W. Zhang. Discovering meta-paths in large heterogeneous information networks. In WWW, pages 754--764. International World Wide Web Conferences Steering Committee, 2015.
[23]
J. Pearl. Heuristics: intelligent search strategies for computer problem solving. 1984.
[24]
J. Scott, T. Ideker, R. M. Karp, and R. Sharan. Efficient algorithms for detecting signaling pathways in protein interaction networks. Journal of Computational Biology, 13(2):133--144, 2006.
[25]
D. Shasha, J. T. Wang, and R. Giugno. Algorithmics and applications of tree and graph searching. In ACM SIGMOD symposium on Principles of database systems, pages 39--52. ACM, 2002.
[26]
C. Shi, X. Kong, Y. Huang, P. S. Yu, and B. Wu. Hetesim: A general framework for relevance measure in heterogeneous networks. TKDE, 26(10):2479--2492, 2014.
[27]
Y.-K. Shih and S. Parthasarathy. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network. Bioinformatics, 28(12):i49--i58, 2012.
[28]
Y. Sun and J. Han. Mining heterogeneous information networks: a structural analysis approach. ACM SIGKDD Explorations Newsletter, 14(2):20--28, 2013.
[29]
Y. Sun, J. Han, C. C. Aggarwal, and N. V. Chawla. When will it happen?: relationship prediction in heterogeneous information networks. In WSDM, pages 663--672. ACM, 2012.
[30]
Y. Sun, J. Han, X. Yan, P. S. Yu, and T. Wu. Pathsim: Meta path-based top-k similarity search in heterogeneous information networks. VLDB, 2011.
[31]
H. Tong and C. Faloutsos. Center-piece subgraphs: problem definition and fast solutions. In SIGKDD. ACM, 2006.
[32]
J. R. Ullmann. An algorithm for subgraph isomorphism. Journal of the ACM (JACM), 23(1):31--42, 1976.
[33]
A. Ullrich and C. V. Forst. k-pathA: k-shortest path algorithm. In IEEE International workshop on High Performance Computational Systems Biology, 2009.
[34]
M. Wolverton, P. Berry, I. W. Harrison, J. D. Lowrance, D. N. Morley, A. C. Rodriguez, E. H. Ruspini, and J. Thomere. Law: A workbench for approximate pattern matching in relational data. In IAAI, volume 3, pages 143--150, 2003.
[35]
J. Y. Yen. Finding the k shortest loopless paths in a network. Management Science, (17.11):712--716, 1971.
[36]
B. Zhou, J. Pei, and W. Luk. A brief survey on anonymization techniques for privacy preserving publishing of social network data. ACM SIGKDD Explorations Newsletter, 10(2):12--22, 2008.

Cited By

View all
  • (2021)Con&Net: A Cross-Network Anchor Link Discovery Method Based on Embedding RepresentationACM Transactions on Knowledge Discovery from Data10.1145/346908316:2(1-18)Online publication date: 3-Sep-2021
  • (2020)A survey of typical attributed graph queriesWorld Wide Web10.1007/s11280-020-00849-024:1(297-346)Online publication date: 20-Nov-2020
  • (2019)Declarative User Selection with Soft ConstraintsProceedings of the 28th ACM International Conference on Information and Knowledge Management10.1145/3357384.3358025(931-940)Online publication date: 3-Nov-2019
  • Show More Cited By

Index Terms

  1. What Links Alice and Bob?: Matching and Ranking Semantic Patterns in Heterogeneous Networks

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    WWW '16: Proceedings of the 25th International Conference on World Wide Web
    April 2016
    1482 pages
    ISBN:9781450341431

    Sponsors

    • IW3C2: International World Wide Web Conference Committee

    In-Cooperation

    Publisher

    International World Wide Web Conferences Steering Committee

    Republic and Canton of Geneva, Switzerland

    Publication History

    Published: 11 April 2016

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. graph algorithms
    2. heterogeneous information networks
    3. semantic relationship queries

    Qualifiers

    • Research-article

    Funding Sources

    • NSF

    Conference

    WWW '16
    Sponsor:
    • IW3C2
    WWW '16: 25th International World Wide Web Conference
    April 11 - 15, 2016
    Québec, Montréal, Canada

    Acceptance Rates

    WWW '16 Paper Acceptance Rate 115 of 727 submissions, 16%;
    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)63
    • Downloads (Last 6 weeks)11
    Reflects downloads up to 03 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2021)Con&Net: A Cross-Network Anchor Link Discovery Method Based on Embedding RepresentationACM Transactions on Knowledge Discovery from Data10.1145/346908316:2(1-18)Online publication date: 3-Sep-2021
    • (2020)A survey of typical attributed graph queriesWorld Wide Web10.1007/s11280-020-00849-024:1(297-346)Online publication date: 20-Nov-2020
    • (2019)Declarative User Selection with Soft ConstraintsProceedings of the 28th ACM International Conference on Information and Knowledge Management10.1145/3357384.3358025(931-940)Online publication date: 3-Nov-2019
    • (2019)FAIRYProceedings of the Twelfth ACM International Conference on Web Search and Data Mining10.1145/3289600.3290990(240-248)Online publication date: 30-Jan-2019
    • (2019)Improvement of PRO-HEAPS Algorithm to Analyze Interaction Changes in Heterogeneous Graph2019 International Conference on Data and Software Engineering (ICoDSE)10.1109/ICoDSE48700.2019.9092735(1-6)Online publication date: Nov-2019
    • (2018)Any-kProceedings of the 2018 World Wide Web Conference10.1145/3178876.3186115(489-498)Online publication date: 10-Apr-2018
    • (2018)Prioritized Relationship Analysis in Heterogeneous Information NetworksACM Transactions on Knowledge Discovery from Data10.1145/315440112:3(1-27)Online publication date: 23-Jan-2018
    • (2017)Future Research DirectionsHeterogeneous Information Network Analysis and Applications10.1007/978-3-319-56212-4_9(219-227)Online publication date: 26-May-2017

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media