research-article

A study on interestingness measures for associative classifiers

Authors:
Mojdeh Jalali-Heravi, University of Alberta, Canada
Osmar R. Zaïane, University of Alberta, Canada

SAC '10: Proceedings of the 2010 ACM Symposium on Applied Computing, March 2010, Pages 1039–1046. https://doi.org/10.1145/1774088.1774306

Published: 22 March 2010

ABSTRACT

Associative classification is a rule-based approach to classifying data that relies on association rule mining: it discovers associations between a set of features and a class label. Support and confidence are the de facto "interestingness measures" used for discovering relevant association rules, and the support-confidence framework has also been used in most, if not all, associative classifiers. Although support and confidence are appropriate for building a strong model in many cases, they are not ideal, and other measures could be better suited.
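As a minimal sketch of the support-confidence framework described above, the following Python snippet computes both measures for a single class association rule of the form "feature set → class label". The function name `support_confidence`, the item strings, and the four toy records are illustrative assumptions and do not come from the paper.

```python
from typing import FrozenSet, List, Tuple

# A record is a set of feature items plus a class label (hypothetical layout).
Record = Tuple[FrozenSet[str], str]


def support_confidence(
    data: List[Record], antecedent: FrozenSet[str], label: str
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> label.

    support    = fraction of all records that contain the antecedent
                 and carry the class label
    confidence = fraction of records containing the antecedent
                 that carry the class label
    """
    n = len(data)
    # Class labels of every record whose items include the antecedent.
    covered = [cls for items, cls in data if antecedent <= items]
    hits = sum(1 for cls in covered if cls == label)
    support = hits / n if n else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


# Toy data, invented for illustration only.
toy = [
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
    (frozenset({"outlook=sunny", "windy=yes"}), "stay"),
    (frozenset({"outlook=rain", "windy=no"}), "play"),
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
]

sup, conf = support_confidence(toy, frozenset({"outlook=sunny"}), "play")
print(f"support={sup:.2f} confidence={conf:.2f}")
```

An associative classifier built on this framework would typically keep only the rules whose support and confidence exceed user-chosen minimum thresholds.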

There are many other rule interestingness measures already used in machine learning, data mining and statistics. This work focuses on using 53 different objective measures for associative classification rules. A wide range of UCI datasets is used to study the impact of different "interestingness measures" on different phases of associative classifiers, based on the number of rules generated and the accuracy obtained. The results show that there are interestingness measures that can significantly reduce the number of rules for almost all datasets while the accuracy of the model is hardly jeopardized or even improved. However, no single measure can be introduced as an obvious winner.
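To illustrate how a measure other than support and confidence can shrink a rule set, here is a hedged Python sketch that computes three widely known objective measures (lift, leverage and conviction) from rule counts and then filters candidate rules by lift. The abstract does not list the 53 measures studied, so these three are only familiar examples; the function name `rule_measures`, the counts, the rule names and the `lift > 1` cut-off are all assumptions made for illustration.

```python
import math
from typing import Dict


def rule_measures(n: int, n_x: int, n_y: int, n_xy: int) -> Dict[str, float]:
    """Objective measures for a rule X -> y computed from four counts.

    n    : total number of records
    n_x  : records containing the antecedent X
    n_y  : records with class label y
    n_xy : records containing X and labelled y
    Assumes n, n_x and n_y are all positive.
    """
    p_x, p_y, p_xy = n_x / n, n_y / n, n_xy / n
    p_x_not_y = p_x - p_xy  # probability of X occurring without y
    return {
        "support": p_xy,
        "confidence": p_xy / p_x,
        # lift > 1 means X and y co-occur more often than under independence
        "lift": p_xy / (p_x * p_y),
        # leverage (Piatetsky-Shapiro): difference from the independent case
        "leverage": p_xy - p_x * p_y,
        # conviction is infinite when the rule is never violated
        "conviction": math.inf if p_x_not_y == 0 else p_x * (1 - p_y) / p_x_not_y,
    }


# Hypothetical candidate rules over 4 records, 3 of which have class "play".
candidates = {
    "outlook=sunny -> play": rule_measures(n=4, n_x=3, n_y=3, n_xy=2),
    "windy=no -> play": rule_measures(n=4, n_x=3, n_y=3, n_xy=3),
}

# Keep only rules whose antecedent is positively associated with the class.
kept = [name for name, m in candidates.items() if m["lift"] > 1.0]
print(kept)
```

In this toy case the first rule has a confidence of about 0.67 but a lift below 1, so a lift-based filter discards it even though a moderate confidence threshold would have kept it. That is the kind of rule reduction the abstract refers to, shown here only on invented numbers.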

Index Terms

1. Information systems
  1. Information systems applications

Published in
SAC '10: Proceedings of the 2010 ACM Symposium on Applied Computing
March 2010
2712 pages
ISBN: 9781605586397
DOI: 10.1145/1774088
Conference Chairs: Sung Y. Shin (South Dakota State University), Sascha Ossowski (University Rey Juan Carlos, Spain), Michael Schumacher (University of Applied Sciences Western Switzerland, Switzerland)
Program Chairs: Mathew J. Palakal (Indiana University Purdue University), Chih-Cheng Hung (Southern Polytechnic State University)
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
associative classifiers
interestingness measures
Acceptance Rates
SAC '10 paper acceptance rate: 364 of 1,353 submissions (27%). Overall acceptance rate: 1,650 of 6,669 submissions (25%).