ABSTRACT
Web-scale classification problems, such as web page tagging and e-commerce product recommendation, are typically treated as multi-label classification with an extremely large number of labels. In this paper, we propose GPT, a novel tree-based approach to extreme multi-label learning. At each internal node, GPT recursively splits the feature space with a hyperplane, taking into account an approximate k-nearest neighbor graph on the label space. We learn the linear binary classifiers using a simple optimization procedure. We evaluated the method on several large-scale real-world data sets and compared it with recent state-of-the-art methods. Experimental results demonstrate the effectiveness of the proposed method.
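The recursive splitting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the optimization procedure, so a spectral relaxation is swapped in here, and an exact cosine-similarity k-NN graph stands in for the approximate one. The names `knn_graph`, `fit_split`, and `build_tree`, and the parameters `k`, `max_leaf`, and `eps`, are hypothetical and chosen for illustration only.

```python
import numpy as np

def knn_graph(Y, k):
    """Symmetric 0/1 k-NN adjacency over the rows of the label matrix Y.
    Assumption: exact cosine-similarity search, not an approximate one."""
    n = Y.shape[0]
    k = min(k, n - 1)
    Z = Y / np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)
    S = Z @ Z.T
    np.fill_diagonal(S, -np.inf)            # never pick a point as its own neighbour
    nbrs = np.argsort(-S, axis=1)[:, :k]    # k most similar rows per point
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)               # make the graph undirected

def fit_split(X, A, eps=1e-6):
    """Hyperplane (w, b) whose projection keeps graph neighbours close.
    Assumption: a spectral relaxation stands in for the paper's optimizer.
    Minimises sum over edges of (w.x_i - w.x_j)^2 under unit projected variance."""
    mu = X.mean(axis=0)
    Xc = X - mu
    L = np.diag(A.sum(axis=1)) - A          # graph Laplacian
    M = Xc.T @ L @ Xc                       # edge-disagreement matrix in feature space
    C = Xc.T @ Xc / len(X) + eps * np.eye(X.shape[1])
    s, U = np.linalg.eigh(C)
    W = U / np.sqrt(s)                      # whitening transform, W.T @ C @ W = I
    _, V = np.linalg.eigh(W.T @ M @ W)
    w = W @ V[:, 0]                         # direction with the smallest edge disagreement
    t = np.median(Xc @ w)                   # median threshold gives a balanced split
    return w, -(mu @ w) - t

def build_tree(X, Y, idx=None, max_leaf=10, k=5):
    """Recursively split the instances in idx until a node holds at most max_leaf."""
    idx = np.arange(len(X)) if idx is None else idx
    if len(idx) <= max_leaf:
        return {"leaf": idx}
    w, b = fit_split(X[idx], knn_graph(Y[idx], k))
    right = X[idx] @ w + b > 0
    if right.all() or not right.any():      # degenerate split: stop here
        return {"leaf": idx}
    return {"w": w, "b": b,
            "left": build_tree(X, Y, idx[~right], max_leaf, k),
            "right": build_tree(X, Y, idx[right], max_leaf, k)}
```

Calling `build_tree(X, Y)` with a dense feature matrix `X` and a binary label matrix `Y` returns a nested dictionary in which each internal node stores its hyperplane and each leaf stores the indices of the training instances routed to it. The dense adjacency matrix costs memory quadratic in the node size, so the sketch only suits small data sets.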
Index Terms
- Learning Extreme Multi-label Tree-classifier via Nearest Neighbor Graph Partitioning