ABSTRACT
Handling massive datasets is difficult not only because of prohibitively large numbers of entries but, in some cases, also because of the very high dimensionality of the data. Often, aggressive feature selection is performed to limit the number of attributes to a manageable size, which unfortunately can discard useful information. Feature space reduction may well be necessary for many stand-alone classifiers, but recent advances in ensemble classifier techniques indicate that accurate classifier aggregates can be learned even when each individual classifier operates on incomplete "feature view" training data, i.e., data from which certain input attributes have been excluded. In fact, using only small random subsets of features to build the individual component classifiers can yield surprisingly accurate and robust models. In this work we demonstrate how such architectures effectively reduce the feature space seen by sub-models and groups of sub-models, a reduction that lends itself to efficient sequential and/or parallel implementations. We support our arguments with experiments on a randomized version of AdaBoost, using text classification as an example task.
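To make the idea concrete, the sketch below is a minimal Python/NumPy illustration, not the authors' implementation: an AdaBoost.M1-style loop in which each boosting round restricts its weak learner (here a decision stump, our own choice) to a small random "feature view". The subset size n_view_features and all function names are illustrative assumptions.

```python
# Minimal sketch (assumed details, not the paper's exact algorithm):
# AdaBoost.M1 with each round's decision stump trained on a random
# "feature view", i.e., a small random subset of the input attributes.
import numpy as np

def fit_stump(X, y, w):
    """Return the best stump (feature, threshold, polarity, error) under weights w."""
    best = (0, 0.0, 1, np.inf)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (j, t, pol, err)
    return best

def randomized_adaboost(X, y, n_rounds=50, n_view_features=5, seed=0):
    """Boost stumps, each restricted to a random feature subset; y must be in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        view = rng.choice(d, size=min(n_view_features, d), replace=False)
        j, t, pol, err = fit_stump(X[:, view], y, w)
        err = max(err, 1e-10)
        if err >= 0.5:
            continue  # skip weak learners no better than chance on this view
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(pol * (X[:, view[j]] - t) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)  # standard AdaBoost reweighting
        w /= w.sum()
        ensemble.append((view[j], t, pol, alpha))
    return ensemble

def predict(ensemble, X):
    """Weighted vote of all stumps."""
    score = np.zeros(X.shape[0])
    for j, t, pol, alpha in ensemble:
        score += alpha * np.where(pol * (X[:, j] - t) >= 0, 1, -1)
    return np.sign(score)
```

Note the efficiency property the abstract alludes to: because each round touches only its view, per-round training cost scales with the view size rather than the full dimensionality, and independently drawn views make groups of sub-models natural units for parallel training.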