skip to main content
10.1145/3097983.3098142acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Learning from Labeled and Unlabeled Vertices in Networks

Published: 13 August 2017 Publication History

Abstract

Networks such as social networks, citation networks, protein-protein interaction networks, etc., are prevalent in real world. However, only very few vertices have labels compared to large amounts of unlabeled vertices. For example, in social networks, not every user provides his/her profile information such as the personal interests which are relevant for targeted advertising. Can we leverage the limited user information and friendship network wisely to infer the labels of unlabeled users?
In this paper, we propose a semi-supervised learning framework called weighted-vote Geometric Neighbor classifier (wvGN) to infer the likely labels of unlabeled vertices in sparsely labeled networks. wvGN exploits random walks to explore not only local but also global neighborhood information of a vertex. Then the label of the vertex is determined by the accumulated local and global neighborhood information. Specifically, wvGN optimizes a proposed objective function by a search strategy which is based on the gradient and coordinate descent methods. The search strategy iteratively conducts a coarse search and a fine search to escape from local optima. Extensive experiments on various synthetic and real-world data verify the effectiveness of wvGN compared to state-of-the-art approaches.

References

[1]
Mikhail Belkin, Partha Niyogi, and Vikas Sindhwani. 2006. Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples. Journal of Machine Learning Research Vol. 7 (2006), 2399--2434.
[2]
Christopher M Bishop. 2006. Pattern recognition and Machine Learning. Springer (2006).
[3]
Christopher J. C. Burges. 1998. A Tutorial on Support Vector Machines for Pattern Recognition. Data Min. Knowl. Discov. Vol. 2, 2 (1998), 121--167.
[4]
Kai-Wei Chang, Cho-Jui Hsieh, and Chih-Jen Lin. 2008. Coordinate descent method for large-scale l2-loss linear support vector machines. Journal of Machine Learning Research Vol. 9, Jul (2008), 1369--1398.
[5]
Fan Chung. 2007. The heat kernel as the pagerank of a graph. Proceedings of the National Academy of Sciences, Vol. 104, 50 (2007), 19735--19740.
[6]
Corinna Cortes and Vladimir Vapnik 1995. Support-Vector Networks. Machine Learning, Vol. 20, 3 (1995), 273--297.
[7]
Rong-En Fan and Chih-Jen Lin 2007. A study on threshold selection for multi-label classification. Department of Computer Science, National Taiwan University (2007), 1--23.
[8]
Brian Gallagher, Hanghang Tong, Tina Eliassi-Rad, and Christos Faloutsos 2008. Using ghost edges for classification in sparsely labeled networks SIGKDD. 256--264.
[9]
Lise Getoor. 2007. Introduction to statistical relational learning. MIT press.
[10]
Aditya Grover and Jure Leskovec 2016. node2vec: Scalable Feature Learning for Networks. SIGKDD. 855--864.
[11]
Jialong Han, Ji-Rong Wen, and Jian Pei 2014. Within-network classification using radius-constrained neighborhood patterns CIKM. 1539--1548.
[12]
Thorsten Joachims. 1999. Transductive Inference for Text Classification using Support Vector Machines ICML. 200--209.
[13]
Kyle Kloster and David F Gleich 2014. Heat kernel based community detection. In SIGKDD. ACM, 1386--1395.
[14]
Andrea Lancichinetti, Santo Fortunato, and Filippo Radicchi. 2008. Benchmark graphs for testing community detection algorithms. Physical review E, Vol. 78, 4 (2008).
[15]
Frank Lin and William W Cohen 2010natexlaba. Semi-supervised classification of network data using very few labels ASONAM. 192--199.
[16]
Frank Lin and William W Cohen 2010natexlabb. A Very Fast Method for Clustering Big Text Datasets. ECAI. 303--308.
[17]
Sofus A Macskassy and Foster Provost 2003. A Simple Relational Classifier. In MRDM workshop at KDD.
[18]
Sofus A Macskassy and Foster Provost 2007. Classification in networked data: A toolkit and a univariate case study. Journal of Machine Learning Research Vol. 8, May (2007), 935--983.
[19]
Olvi L Mangasarian. 2002. A finite Newton method for classification. Optimization Methods and Software Vol. 17, 5 (2002), 913--929.
[20]
Miller McPherson, Lynn Smith-Lovin, and James M Cook. 2001. Birds of a feather: Homophily in social networks. Annual review of sociology (2001), 415--444.
[21]
Sharad Nandanwar and M. Narasimha Murty 2016. Structural Neighborhood Based Classification of Nodes in a Network SIGKDD. 1085--1094.
[22]
Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. Deepwalk: Online learning of social representations SIGKDD. ACM, 701--710.
[23]
John Platt and others. 1999. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in large margin classifiers Vol. 10, 3 (1999), 61--74.
[24]
Shai Shalev-Shwartz, Yoram Singer, and Nathan Srebro. 2007. Pegasos: Primal estimated sub-gradient solver for svm ICML. ACM, 807--814.
[25]
Lei Tang and Huan Liu. 2009. Relational learning via latent social dimensions. SIGKDD. ACM, 817--826.
[26]
Lei Tang and Huan Liu. 2009. Scalable learning of collective behavior based on sparse social dimensions CIKM. ACM, 1107--1116.
[27]
Vladimir Naumovich Vapnik and Vlamimir Vapnik 1998. Statistical learning theory. Vol. Vol. 1. Wiley New York.
[28]
Fei Wang and Changshui Zhang 2006. Label propagation through linear neighborhoods. In ICML. 985--992.
[29]
Xi Wang and Gita Sukthankar 2013. Multi-label relational neighbor classification using social context features SIGKDD. ACM, 464--472.
[30]
Wayne W Zachary. 1977. An information flow model for conflict and fission in small groups. Journal of anthropological research Vol. 33, 4 (1977), 452--473.
[31]
Dengyong Zhou, Olivier Bousquet, Thomas Navin Lal, Jason Weston, and Bernhard Schölkopf. 2003. Learning with Local and Global Consistency. In NIPS. 321--328.

Cited By

View all
  • (2021)Fast and Accurate Anchor Graph-based Label PredictionProceedings of the 30th ACM International Conference on Information & Knowledge Management10.1145/3459637.3482258(504-513)Online publication date: 26-Oct-2021
  • (2021)Community-based influence maximization in location-based social networkWorld Wide Web10.1007/s11280-021-00935-x24:6(1903-1928)Online publication date: 1-Nov-2021
  • (2020)Improving graph-based label propagation algorithm with group partition for fraud detectionApplied Intelligence10.1007/s10489-020-01724-150:10(3291-3300)Online publication date: 26-May-2020
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2017
2240 pages
ISBN:9781450348874
DOI:10.1145/3097983
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 August 2017

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. geometric neighborhood
  2. random walk
  3. regularization
  4. semi-supervised learning
  5. support vector machines

Qualifiers

  • Research-article

Conference

KDD '17
Sponsor:

Acceptance Rates

KDD '17 Paper Acceptance Rate 64 of 748 submissions, 9%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)5
  • Downloads (Last 6 weeks)0
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2021)Fast and Accurate Anchor Graph-based Label PredictionProceedings of the 30th ACM International Conference on Information & Knowledge Management10.1145/3459637.3482258(504-513)Online publication date: 26-Oct-2021
  • (2021)Community-based influence maximization in location-based social networkWorld Wide Web10.1007/s11280-021-00935-x24:6(1903-1928)Online publication date: 1-Nov-2021
  • (2020)Improving graph-based label propagation algorithm with group partition for fraud detectionApplied Intelligence10.1007/s10489-020-01724-150:10(3291-3300)Online publication date: 26-May-2020
  • (2019)Efficiently Approximating Top-$k$ Sequential Patterns in Transactional GraphsIEEE Access10.1109/ACCESS.2019.29168117(62817-62832)Online publication date: 2019
  • (2019)Mining top-k sequential patterns in transaction database graphsWorld Wide Web10.1007/s11280-019-00686-w23:1(103-130)Online publication date: 3-May-2019
  • (2018)A Multi-Label Threshold Learning Framework for Propagation Algorithms on a Non-Feature Network2018 IEEE Global Communications Conference (GLOBECOM)10.1109/GLOCOM.2018.8647303(1-6)Online publication date: 9-Dec-2018

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media