DOI: 10.1145/1015330.1015370
Article

Robust feature induction for support vector machines

Published: 04 July 2004

Abstract

The goal of feature induction is to automatically create nonlinear combinations of existing features as additional input features to improve classification accuracy. Typically, nonlinear features are introduced into a support vector machine (SVM) through a nonlinear kernel function. One disadvantage of this approach is that the feature space induced by a kernel function is usually of high dimension, which substantially increases the chance of over-fitting the training data. Another disadvantage is that nonlinear features are induced implicitly, making it difficult to understand which induced features are critical to classification performance. In this paper, we propose a boosting-style algorithm that explicitly induces important nonlinear features for SVMs. We present empirical studies, with discussion, showing that this approach is effective in improving classification accuracy for SVMs. Comparison with an SVM using nonlinear kernels also indicates that this approach is effective and robust, particularly when the amount of training data is small.
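The paper's algorithm is not reproduced on this page, but the general idea — greedily adding explicit nonlinear combinations of raw features so that a *linear* model can fit data a linear model otherwise cannot — can be sketched as follows. This is a hedged illustration only, assuming a pairwise-product candidate pool, residual-correlation scoring as the boosting-style selection rule, and ridge regression as a least-squares stand-in for a linear SVM; none of these specific choices are claimed to match the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy XOR-style problem: the label is the sign of x0*x1, so no linear
# classifier on the raw features can do much better than chance.
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sign(X[:, 0] * X[:, 1])

def fit_linear(F, y, lam=1e-2):
    # Ridge regression as a least-squares stand-in for a linear SVM.
    F1 = np.hstack([F, np.ones((len(F), 1))])          # append bias column
    return np.linalg.solve(F1.T @ F1 + lam * np.eye(F1.shape[1]), F1.T @ y)

def accuracy(F, w, y):
    F1 = np.hstack([F, np.ones((len(F), 1))])
    return float(np.mean(np.sign(F1 @ w) == y))

# Candidate nonlinear features: pairwise products x_i * x_j of raw inputs.
cands = {(i, j): X[:, i] * X[:, j] for i in range(2) for j in range(i, 2)}

F = X.copy()
w = fit_linear(F, y)
acc_before = accuracy(F, w, y)                         # chance-level on XOR

for _ in range(2):                                     # boosting-style rounds
    F1 = np.hstack([F, np.ones((len(F), 1))])
    resid = y - F1 @ w                                 # what the model misses
    # Greedily add the candidate feature most correlated with the residual.
    best = max(cands, key=lambda k: abs(np.corrcoef(cands[k], resid)[0, 1]))
    F = np.hstack([F, cands.pop(best)[:, None]])
    w = fit_linear(F, y)

acc_after = accuracy(F, w, y)
print(f"linear on raw features: {acc_before:.2f}, with induced features: {acc_after:.2f}")
```

Because each induced feature is an explicit column (here the product x0*x1 is selected first), one can inspect exactly which nonlinear combinations the model relies on — the interpretability advantage the abstract contrasts with implicit kernel-induced features.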


Cited By

  • (2023) Decision Tree using Feature Grouping. 2023 26th International Conference on Computer and Information Technology (ICCIT), pp. 1-5. DOI: 10.1109/ICCIT60459.2023.10441110. Online publication date: 13-Dec-2023.
  • (2013) Current Methodologies for Biomedical Named Entity Recognition. Biological Knowledge Discovery Handbook, pp. 839-868. DOI: 10.1002/9781118617151.ch37. Online publication date: 27-Dec-2013.
  • (2012) References. Spectral Feature Selection for Data Mining, pp. 171-189. DOI: 10.1201/b11426-8. Online publication date: 6-Jan-2012.
  • (2011) Random Forest Based Feature Induction. Proceedings of the 2011 IEEE 11th International Conference on Data Mining, pp. 744-753. DOI: 10.1109/ICDM.2011.121. Online publication date: 11-Dec-2011.
  • (2007) An Improved Document Classification Approach with Maximum Entropy and Entropy Feature Selection. 2007 International Conference on Machine Learning and Cybernetics, pp. 3911-3915. DOI: 10.1109/ICMLC.2007.4370829. Online publication date: Aug-2007.
  • (2006) An Improved Economic Early Warning Based on Rough Set and Support Vector Machine. 2006 International Conference on Machine Learning and Cybernetics, pp. 2444-2449. DOI: 10.1109/ICMLC.2006.258777. Online publication date: Aug-2006.


Published In

ICML '04: Proceedings of the twenty-first international conference on Machine learning
July 2004
934 pages
ISBN: 1581138385
DOI: 10.1145/1015330
Conference Chair: Carla Brodley

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 140 of 548 submissions (26%)

