Voting with a parameterized veto strategy: solving the KDD Cup 2006 problem by means of a classifier committee
Source: ACM SIGKDD Explorations Newsletter
Volume 8, Issue 2 (December 2006)
Pages: 53-62
Year of Publication: 2006
ISSN: 1931-0145
Authors
Domonkos Tikk, Budapest University of Technology and Economics, Hungary
Zsolt T. Kardkovács, Budapest University of Technology and Economics, Hungary
Ferenc P. Szidarovszky, Budapest University of Technology and Economics, Hungary
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1233321.1233328

ABSTRACT

This paper presents our winning solution to the KDD Cup 2006 problem. Three different supervised learning techniques are combined in a classifier committee, and a voting procedure produces a single solution. The voting procedure weights each committee member by its average performance in a ten-fold cross-validation test and also takes into account the confidence values returned by the three algorithms. The committee's final decision is determined by a parameterized veto strategy, which considers the maximal allowed error rate in addition to the confidence values of the committee members. The solution presented here won Task 2 and was runner-up in Task 1 of the competition.
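
The abstract gives no formulas, so the following is only a minimal sketch of how a weighted vote with a veto rule might look for a two-class problem. It is not the authors' procedure. The names Member, veto_threshold and committee_decision are hypothetical, and the signed-confidence encoding (sign = predicted class, magnitude = confidence) and the per-member threshold standing in for the veto parameter are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Member:
    # Assumed: mean score of this classifier over ten-fold cross-validation.
    weight: float
    # Hypothetical veto parameter: a negative vote at least this confident
    # overrides the rest of the committee.
    veto_threshold: float


def committee_decision(members: Sequence[Member],
                       confidences: Sequence[float]) -> int:
    """Return 1 (positive) or 0 (negative) for one example.

    confidences[i] lies in [-1, 1]: the sign is member i's predicted class
    and the magnitude is its confidence (an assumed encoding).
    """
    if len(members) != len(confidences):
        raise ValueError("one confidence value per committee member is required")

    # Veto step: a sufficiently confident negative vote decides alone.
    for member, conf in zip(members, confidences):
        if conf < 0 and -conf >= member.veto_threshold:
            return 0

    # Otherwise take the sign of the weighted sum of signed confidences.
    score = sum(m.weight * c for m, c in zip(members, confidences))
    return 1 if score > 0 else 0


committee = [Member(0.9, 0.95), Member(0.8, 0.95), Member(0.7, 0.95)]
print(committee_decision(committee, [0.6, 0.5, -0.4]))
print(committee_decision(committee, [0.6, 0.5, -0.97]))
```

In this sketch, lowering veto_threshold makes vetoes more frequent, which trades missed positives for fewer false positives; how the paper ties its parameter to the maximal allowed error rate is not stated in the abstract.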

