skip to main content
10.1145/2566486.2567986acmotherconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article

Random walks based modularity: application to semi-supervised learning

Published: 07 April 2014 Publication History

Abstract

Although criticized for some of its limitations, modularity remains a standard measure for analyzing social networks. Quantifying the statistical surprise in the arrangement of the edges of the network has led to simple and powerful algorithms. However, relying solely on the distribution of edges instead of more complex structures such as paths limits the extent of modularity. Indeed, recent studies have shown restrictions of optimizing modularity, for instance its resolution limit. We introduce here a novel, formal and well-defined modularity measure based on random walks. We show how this modularity can be computed from paths induced by the graph instead of the traditionally used edges. We argue that by computing modularity on paths instead of edges, more informative features can be extracted from the network. We verify this hypothesis on a semi-supervised classification procedure of the nodes in the network, where we show that, under the same settings, the features of the random walk modularity help to classify better than the features of the usual modularity. Additionally, the proposed approach outperforms the classical label propagation procedure on two data sets of labeled social networks.

References

[1]
A. Arenas, A. Fernandez, S. Fortunato, and S. Gomez. Motif-based communities in complex networks. Journal of Physics A: Mathematical and Theoretical, 41(22):224001, 2008.
[2]
A. Arenas, A. Fernandez, and S. Gomez. Analysis of the structure of complex networks at different resolution levels. New Journal of Physics, 10(5):053039, 2008.
[3]
R. Bekkerman, M. Bilenko, and J. Langford. Scaling up machine learning: Parallel and distributed approaches. Cambridge University Press, 2011.
[4]
V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.
[5]
A. Clauset, M. E. Newman, and C. Moore. Finding community structure in very large networks. Physical review E, 70(6):066111, 2004.
[6]
J. Ekanayake, H. Li, B. Zhang, T. Gunarathne, S.-H. Bae, J. Qiu, and G. Fox. Twister: a runtime for iterative mapreduce. In Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pages 810--818. ACM, 2010.
[7]
S. Fortunato. Community detection in graphs. Physics Reports, 486(3):75--174, 2010.
[8]
S. Fortunato and M. Barthéelemy. Resolution limit in community detection. Proceedings of the National Academy of Sciences, 104(1):36, 2007.
[9]
F. Fouss, K. Francoisse, L. Yen, A. Pirotte, and M. Saerens. An experimental investigation of kernels on graphs for collaborative recommendation and semisupervised classification. Neural Networks, 31:53--72, 2012.
[10]
K. Francoisse, I. Kivimaki, A. Mantrach, F. Rossi, and M. Saerens. A bag-of-paths framework for network data analysis. Submitted for publication; available on ArXiv as ArXiv:1302.6766, pages 1--36, 2013.
[11]
R. Ghosh and K. Lerman. Community detection using a measure of global influence. In Advances in Social Network Mining and Analysis, pages 20--35. Springer, 2010.
[12]
E. T. Jaynes. Information theory and statistical mechanics. Physical review, 106(4):620, 1957.
[13]
J. Kandola, J. Shawe-Taylor, and N. Cristianini. Learning semantic similarity. Advances in neural information processing systems, 15:657--664, 2002.
[14]
A. Lancichinetti and S. Fortunato. Limits of modularity maximization in community detection. Physical Review E, 84(6):066122, 2011.
[15]
A. Lancichinetti, S. Fortunato, and F. Radicchi. Benchmark graphs for testing community detection algorithms. Physical Review E, 78(4):046110, 2008.
[16]
R. B. Lehoucq, D. C. Sorensen, and C. Yang. ARPACK users' guide: solution of large-scale eigenvalue problems with implicitly restarted Arnoldi methods, volume 6. Siam, 1998.
[17]
J. Leskovec, K. J. Lang, and M. Mahoney. Empirical comparison of algorithms for network community detection. In Proceedings of the 19th international Conference on World Wide Web, pages 631--640. ACM, 2010.
[18]
J. Lin and M. Schatz. Design patterns for efficient graph algorithms in mapreduce. In Proceedings of the Eighth Workshop on Mining and Learning with Graphs, pages 78--85. ACM, 2010.
[19]
H. B. Mann and D. R. Whitney. On a test of whether one of two random variables is stochastically larger than the other. The annals of mathematical statistics, 18(1):50--60, 1947.
[20]
C. D. Manning, P. Raghavan, and H. Schutze. Introduction to information retrieval. Cambridge University Press Cambridge, 2008.
[21]
A. Mantrach, L. Yen, J. Callut, K. Francoisse, M. Shimbo, and M. Saerens. The sum-over-paths covariance kernel: A novel covariance measure between nodes of a directed graph. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 32(6):1112--1126, 2010.
[22]
M. Newman. Networks: an introduction. OUP Oxford, 2009.
[23]
M. E. Newman. Finding community structure in networks using the eigenvectors of matrices. Physical review E, 74(3):036104, 2006.
[24]
M. E. Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23):8577--8582, 2006.
[25]
M. E. Newman and M. Girvan. Finding and evaluating community structure in networks. Physical review E, 69(2):026113, 2004.
[26]
A. Ozgur, L. Ozgur, and T. Gungor. Text categorization with class-based and corpus-based keyword selection. In Computer and Information Sciences-ISCIS 2005, pages 606--615. Springer, 2005.
[27]
L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: bringing order to the web. 1999.
[28]
J.-Y. Pan, H.-J. Yang, C. Faloutsos, and P. Duygulu. Automatic multimedia cross-modal correlation discovery. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 653--658. ACM, 2004.
[29]
J. Reichardt and S. Bornholdt. Statistical mechanics of community detection. Physical Review E, 74(1):016110, 2006.
[30]
Y. Saad. Numerical methods for large eigenvalue problems, volume 158. SIAM, 1992.
[31]
M. Senelle, S. García-Díez, A. Mantrach, M. Shimbo, M. Saerens, and F. Fouss. The sum-over-forests density index: identifying dense regions in a graph. Pattern Analysis and Machine Intelligence, IEEE Transactions on, To appear, 2014.
[32]
L. Tang and H. Liu. Relational learning via latent social dimensions. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 817--826. ACM, 2009.
[33]
L. Tang and H. Liu. Scalable learning of collective behavior based on sparse social dimensions. In Proceedings of the 18th ACM conference on Information and knowledge management, pages 1107--1116. ACM, 2009.
[34]
L. Tang, X. Wang, H. Liu, and L. Wang. A multi-resolution approach to learning with overlapping communities. In Proceedings of the First Workshop on Social Media Analytics, pages 14--22. ACM, 2010.
[35]
H. Tong, C. Faloutsos, and J.-Y. Pan. Random walk with restart: fast solutions and applications. Knowledge and Information Systems, 14(3):327--346, 2008.
[36]
J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics, page 3. ACM, 2012.
[37]
D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Scholkopf. Learning with local and global consistency. Advances in neural information processing systems, 16(753760):284, 2004.
[38]
X. Zhu and Z. Ghahramani. Learning from labeled and unlabeled data with label propagation. Technical report, Technical Report CMU-CALD-02-107, Carnegie Mellon University, 2002.

Cited By

View all
  • (2025)On Graph Representation for Attributed Hypergraph ClusteringProceedings of the ACM on Management of Data10.1145/37097413:1(1-26)Online publication date: 11-Feb-2025
  • (2022)Free Energy Node Embedding via Generalized Skip-gram with Negative SamplingIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2022.3206175(1-13)Online publication date: 2022
  • (2022)A Simple Extension of the Bag-of-Paths Model Weighting Path Lengths by a Poisson DistributionComplex Networks & Their Applications X10.1007/978-3-030-93409-5_19(220-233)Online publication date: 1-Jan-2022
  • Show More Cited By

Index Terms

  1. Random walks based modularity: application to semi-supervised learning

      Recommendations

      Comments

      Information & Contributors

      Information

      Published In

      cover image ACM Other conferences
      WWW '14: Proceedings of the 23rd international conference on World wide web
      April 2014
      926 pages
      ISBN:9781450327442
      DOI:10.1145/2566486

      Sponsors

      • IW3C2: International World Wide Web Conference Committee

      In-Cooperation

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 07 April 2014

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. graph mining
      2. modularity
      3. random walk
      4. semi-supervised learning
      5. social networks
      6. statistical physics

      Qualifiers

      • Research-article

      Funding Sources

      • Spanish Centre for the Development of Industrial Technology under the CENIT program
      • Belgian Fonds pour la Recherche dans l'Industrie et l'Agriculture (FRIA)
      • "Social Media", and by projects with the "Region Wallonne" and the "Region Bruxelloise INOVIRIS"
      • Seventh Framework Programme

      Conference

      WWW '14
      Sponsor:
      • IW3C2

      Acceptance Rates

      WWW '14 Paper Acceptance Rate 84 of 645 submissions, 13%;
      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)11
      • Downloads (Last 6 weeks)2
      Reflects downloads up to 08 Mar 2025

      Other Metrics

      Citations

      Cited By

      View all
      • (2025)On Graph Representation for Attributed Hypergraph ClusteringProceedings of the ACM on Management of Data10.1145/37097413:1(1-26)Online publication date: 11-Feb-2025
      • (2022)Free Energy Node Embedding via Generalized Skip-gram with Negative SamplingIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2022.3206175(1-13)Online publication date: 2022
      • (2022)A Simple Extension of the Bag-of-Paths Model Weighting Path Lengths by a Poisson DistributionComplex Networks & Their Applications X10.1007/978-3-030-93409-5_19(220-233)Online publication date: 1-Jan-2022
      • (2021)Covariance and correlation measures on a graph in a generalized bag-of-paths formalismJournal of Complex Networks10.1093/comnet/cnaa0268:6Online publication date: 7-Mar-2021
      • (2020)An experimental study of graph-based semi-supervised classification with additional node informationKnowledge and Information Systems10.1007/s10115-020-01500-0Online publication date: 9-Oct-2020
      • (2019)A Survey on Personalized PageRank Computation AlgorithmsIEEE Access10.1109/ACCESS.2019.29526537(163049-163062)Online publication date: 2019
      • (2018)Representation Learning for Classification in Heterogeneous Graphs with Application to Social NetworksACM Transactions on Knowledge Discovery from Data10.1145/320160312:5(1-33)Online publication date: 20-Jul-2018
      • (2018)Soft Image Segmentation: On the Clustering of Irregular, Weighted, Multivariate Marked NetworksGeographical Information Systems Theory, Applications and Management10.1007/978-3-030-06010-7_6(85-109)Online publication date: 30-Dec-2018
      • (2017)Multilabel user classification using the community structure of online networksPLOS ONE10.1371/journal.pone.017334712:3(e0173347)Online publication date: 9-Mar-2017
      • (2017)Making relationships to links — Networked community, a connected society2017 4th International Conference on Systems and Informatics (ICSAI)10.1109/ICSAI.2017.8248439(1040-1048)Online publication date: Nov-2017
      • Show More Cited By

      View Options

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media