Fast vertical mining using diffsets
Source: Conference on Knowledge Discovery in Data
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Washington, D.C.
SESSION: Research track
Pages: 326 - 335
Year of Publication: 2003
ISBN: 1-58113-737-0
Authors
Mohammed J. Zaki  Rensselaer Polytechnic Institute, Troy, NY
Karam Gouda  Faculty of Science, Benha, Egypt
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 6,   Downloads (12 Months): 78,   Citation Count: 26
DOI: http://doi.acm.org/10.1145/956750.956788

ABSTRACT

A number of vertical mining algorithms have been proposed recently for association mining; they have been shown to be very effective and usually outperform horizontal approaches. The main advantage of the vertical format is support for fast frequency counting via intersection operations on transaction ids (tids) and automatic pruning of irrelevant data. The main problem with these approaches arises when intermediate results of vertical tid lists become too large for memory, which limits the scalability of the algorithms.

In this paper we present a novel vertical data representation called Diffset, which keeps track only of differences in the tids of a candidate pattern from its generating frequent patterns. We show that diffsets drastically cut down the size of memory required to store intermediate results. We show how diffsets, when incorporated into previous vertical mining methods, increase the performance significantly.
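
To make the idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the usual diffset definitions: for a prefix P and items X, Y, d(PX) = t(P) - t(PX), d(PXY) = d(PY) - d(PX), and support(PXY) = support(PX) - |d(PXY)|, where t(.) is a tidset. The toy database and every function name here are made up for illustration.

```python
from typing import Dict, Set

# Toy vertical database (hypothetical data): item -> tids of the
# transactions that contain it.
TIDSETS: Dict[str, Set[int]] = {
    "A": {1, 3, 4, 5},
    "C": {1, 2, 3, 4, 5, 6},
    "D": {2, 4, 5, 6},
}


def support_by_intersection(tidsets: Dict[str, Set[int]], *items: str) -> int:
    """Classic vertical counting: intersect the tidsets of all items."""
    common = set(tidsets[items[0]])
    for item in items[1:]:
        common &= tidsets[item]
    return len(common)


def diffset_2(tidsets: Dict[str, Set[int]], x: str, y: str) -> Set[int]:
    """d(xy) = t(x) - t(y): tids that contain x but are lost when y is added."""
    return tidsets[x] - tidsets[y]


def support_2(tidsets: Dict[str, Set[int]], x: str, y: str) -> int:
    """support(xy) = support(x) - |d(xy)|."""
    return len(tidsets[x]) - len(diffset_2(tidsets, x, y))


def diffset_3(d_px: Set[int], d_py: Set[int]) -> Set[int]:
    """d(Pxy) = d(Py) - d(Px), computed from diffsets only (no tidsets)."""
    return d_py - d_px


if __name__ == "__main__":
    d_ac = diffset_2(TIDSETS, "A", "C")   # empty: every A-transaction has C
    d_ad = diffset_2(TIDSETS, "A", "D")   # tids with A but without D
    d_acd = diffset_3(d_ac, d_ad)
    sup_acd = support_2(TIDSETS, "A", "C") - len(d_acd)
    print("d(AC) =", d_ac, "d(AD) =", d_ad, "d(ACD) =", d_acd)
    print("support(ACD) =", sup_acd)
```

In this toy data, items A and C always co-occur, so d(AC) is empty while the tidset t(AC) would hold four tids; that gap is the memory saving the abstract refers to. How large the saving is on real data depends on how dense the database is.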


