Transfer Learning via Feature Isomorphism Discovery

Authors:

Lei Chen

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Pages 1301 - 1309

https://doi.org/10.1145/3219819.3220029

Published: 19 July 2018

Abstract

Transfer learning has gained increasing attention because machine learning algorithms perform poorly when training data is insufficient. Most previous homogeneous and heterogeneous transfer learning work aims to learn a mapping function between feature spaces based on the inherent correspondence across the source and target domains or on labeled instances. However, in many real-world applications, existing methods may not be robust when the correspondence across domains is noisy or the labeled instances are not representative. In this paper, we develop a novel transfer learning framework called Transfer Learning via Feature Isomorphism Discovery (abbreviated to TLFid), which is highly tolerant of noisy correspondence between domains and of scarce or non-existent labeled instances. More specifically, we propose a feature isomorphism approach that discovers common substructures across feature spaces and learns a feature mapping function from the target domain to the source domain. We evaluate TLFid on cross-lingual sentiment classification tasks. The results show that our method achieves a significant improvement in accuracy over state-of-the-art methods.

Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

July 2018

2925 pages

ISBN:9781450355520

DOI:10.1145/3219819

General Chairs: Yike Guo (Imperial College London), Faisal Farooq (IBM)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Hong Kong ITC
Hong Kong RGC

Conference

KDD '18

KDD '18: The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 19 - 23, 2018

London, United Kingdom

Acceptance Rates

KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 5
Total Downloads: 902
Downloads (Last 12 months): 27
Downloads (Last 6 weeks): 2

Reflects downloads up to 03 Jan 2025

Citations

Cited By

Farhadi A, Mirzarezaee M, Sharifi A, Teshnehlab M (2024). Domain adaptation in reinforcement learning: a comprehensive and systematic study. Frontiers of Information Technology & Electronic Engineering 25:11 (1446-1465). Online publication date: 27-Dec-2024.
https://doi.org/10.1631/FITEE.2300668
Liu H, Di S, Li H, Li S, Chen L, Zhou X (2024). Effective Data Selection and Replay for Unsupervised Continual Learning. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (1449-1463). Online publication date: 13-May-2024.
https://doi.org/10.1109/ICDE60146.2024.00119
Liu H, Di S, Chen L (2023). Incremental Tabular Learning on Heterogeneous Feature Space. Proceedings of the ACM on Management of Data 1:1 (1-18). Online publication date: 30-May-2023.
https://dl.acm.org/doi/10.1145/3588698
Xin X, Shi C, Song T, Liu H, Xue Z, Song J (2023). Intuitionistic fuzzy three-way transfer learning based on rough almost stochastic dominance. Engineering Applications of Artificial Intelligence 118 (105659). Online publication date: Feb-2023.
https://doi.org/10.1016/j.engappai.2022.105659
Yang S, Wang H, Zhang Y, Li P, Zhu Y, Hu X (2019). Semi-supervised representation learning via dual autoencoders for domain adaptation. Knowledge-Based Systems (105161). Online publication date: Oct-2019.
https://doi.org/10.1016/j.knosys.2019.105161
