ABSTRACT
In many prediction tasks, selecting relevant features is essential for good generalization. Most feature selection algorithms treat all features as a priori equally likely to be relevant. In this paper, we use transfer learning (learning on an ensemble of related tasks) to construct an informative prior on feature relevance. We assume that the features themselves have meta-features that are predictive of their relevance to the prediction task, and we model that relevance as a function of the meta-features using hyperparameters called meta-priors. We present a convex optimization algorithm that simultaneously learns the meta-priors and the feature weights from an ensemble of related prediction tasks sharing a similar relevance structure. Because our approach transfers the meta-priors, rather than the feature weights, across tasks, it can handle settings where tasks have non-overlapping features or where the relevance of a feature varies across tasks. We show that learning feature relevance improves performance on two real data sets that illustrate such settings: (1) predicting ratings in a collaborative filtering task, and (2) distinguishing the arguments of a verb in a sentence.
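To make the idea concrete, the sketch below shows one way such a scheme could look. It is a hypothetical simplification, not the paper's actual formulation: we assume each feature j gets a Gaussian prior whose precision is exp(-meta[j] @ nu), so the shared meta-prior nu (learned from all tasks jointly) decides how strongly each feature is shrunk, and we alternate closed-form per-task ridge fits with gradient steps on nu. The function name, the exponential link, and the alternating scheme are all illustrative choices.

```python
import numpy as np

def fit_meta_prior(tasks, meta, n_iter=100, lr=0.01):
    """Jointly learn per-task weights and a shared meta-prior.

    tasks : list of (X, y) regression tasks sharing a relevance structure
    meta  : (n_features, n_meta) array of meta-features, one row per feature

    Hypothetical model (illustrative, not the paper's formulation):
    feature j has Gaussian prior precision lam_j = exp(-meta[j] @ nu),
    so nu controls how strongly each feature is penalized.
    """
    T = len(tasks)
    nu = np.zeros(meta.shape[1])
    for _ in range(n_iter):
        lam = np.exp(-meta @ nu)                      # per-feature penalties
        Ws = [np.linalg.solve(X.T @ X + np.diag(lam), X.T @ y)
              for X, y in tasks]                      # per-task ridge fits
        S = sum(w ** 2 for w in Ws)                   # squared weight mass per feature
        # gradient of sum_j [lam_j * S_j / 2 + (T/2) * meta[j] @ nu],
        # the negative log Gaussian prior summed over tasks (convex in nu)
        grad = meta.T @ (T - lam * S) / 2.0
        nu -= lr * grad
    return nu, Ws
```

On synthetic tasks whose relevant features share a meta-feature, nu learns to penalize the irrelevant features more heavily than the relevant ones, which is the behavior the abstract describes.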
- Learning a meta-level prior for feature relevance from multiple related tasks