DOI: 10.1145/1066677.1066791
Article

A "Go With the Winners" approach to finding frequent patterns

Published: 13 March 2005

Abstract

In their seminal work on Go With the Winners (GWW) algorithms, D. Aldous and U. Vazirani [3] proved a sufficient condition on the number of particles needed to reach the bottom of a tree with high probability via a GWW random walk. Using this result in practice, however, would require knowledge of the entire search tree, which is infeasible for most problems. In this paper we improve slightly on this situation by deriving a recurrence relation that provides an upper bound for a tree's imbalance in terms of the imbalance between tree levels that are close to one another, provided that these latter imbalances can be measured with sufficient accuracy.

We then turn our attention to the problem of finding both frequent and infrequent patterns in a database. One of the most widely used algorithms for finding frequent patterns in memory-resident databases is a randomized algorithm first proposed by Gunopulos et al. [12]. We show that such an algorithm is precisely the kind that the GWW paradigm was designed to improve on. Experimental results on the Splice-junction Gene Sequences Database [4] are also provided and lend empirical support to the benefits of using GWW.
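
To make the GWW idea concrete, the following Python sketch runs a small particle walk over itemsets. It is an illustration under stated assumptions, not the algorithm evaluated in the paper: the function names (`support`, `frequent_extensions`, `gww_walk`), the representation of transactions as frozensets, and the choice to restart a stuck particle at the position of a uniformly chosen surviving particle are all assumptions made here. Note also that itemsets form a lattice rather than a tree, so this only loosely mirrors the tree setting analysed in [3].

```python
import random


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_extensions(itemset, items, transactions, min_support):
    """Items whose addition keeps the itemset frequent (its 'children')."""
    return [i for i in items
            if i not in itemset
            and support(itemset | {i}, transactions) >= min_support]


def gww_walk(transactions, min_support, n_particles, rng):
    """Hypothetical GWW-style walk; returns the maximal itemsets it reached."""
    items = sorted(set().union(*transactions))
    particles = [frozenset() for _ in range(n_particles)]
    found = set()
    while True:
        children = [frequent_extensions(p, items, transactions, min_support)
                    for p in particles]
        alive = [k for k, c in enumerate(children) if c]
        # A particle with no frequent extension sits on a maximal itemset.
        for k, c in enumerate(children):
            if not c:
                found.add(particles[k])
        if not alive:
            return found
        next_particles = []
        for k in range(n_particles):
            # "Go with the winners": a stuck particle is restarted at the
            # position of a randomly chosen particle that can still descend.
            src = k if children[k] else rng.choice(alive)
            next_particles.append(particles[src] | {rng.choice(children[src])})
        particles = next_particles


if __name__ == "__main__":
    # Toy data, invented for this example.
    data = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("d")]
    print(gww_walk(data, min_support=2, n_particles=5, rng=random.Random(0)))
```

Without the restart step, each particle would be an independent random walk that stops at the first maximal itemset it hits. The restart step is what concentrates the particle population on branches that are still able to go deeper.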

References

[1] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. Inkeri Verkamo. Fast discovery of association rules. In U. Fayyad et al., editors, Advances in Knowledge Discovery and Data Mining, pages 307--328, Menlo Park, CA, 1996.
[2] Rakesh Agrawal, Tomasz Imielinski, and Arun N. Swami. Mining association rules between sets of items in large databases. In Peter Buneman and Sushil Jajodia, editors, Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pages 207--216, Washington, D.C., 26--28 1993.
[3] David Aldous and Umesh V. Vazirani. "Go With the Winners" algorithms. In IEEE Symposium on Foundations of Computer Science, pages 492--501, 1994.
[4] C. L. Blake and C. J. Merz. UCI repository of machine learning databases, 1998.
[5] K. D. Boese, A. B. Kahng, and S. Muddu. A new adaptive multi-start technique for combinatorial global optimizations. Operations Research Letters, 16(2):101--113, 1994.
[6] Endre Boros, Vladimir Gurvich, Leonid Khachiyan, and Kazuhisa Makino. An inequality limiting the number of maximal frequent sets. DIMACS Technical Report 2000-37, Rutgers University, 2000.
[7] Endre Boros, Vladimir Gurvich, Leonid Khachiyan, and Kazuhisa Makino. On the complexity of generating maximal frequent and minimal infrequent sets. In Symposium on Theoretical Aspects of Computer Science, pages 133--141, 2002.
[8] J. Boyan and A. Moore. Using prediction to improve combinatorial optimization search. In Sixth International Workshop on Artificial Intelligence and Statistics, 1997.
[9] C. Blum and A. Roli. Metaheuristics in combinatorial optimization: Overview and conceptual comparison. Technical report TR/IRIDIA/2001-13, IRIDIA, Université Libre de Bruxelles, Belgium, 2001.
[10] T. A. Feo and M. G. C. Resende. Greedy randomized adaptive search procedures. Journal of Global Optimization, 6:109--133, 1995.
[11] Dimitrios Gunopulos, Roni Khardon, Heikki Mannila, Sanjeev Saluja, Hannu Toivonen, and Ram Sewak Sharma. Discovering all most specific sentences. ACM Transactions on Database Systems, 28(2):140--174, 2003.
[12] Dimitrios Gunopulos, Heikki Mannila, and Sanjeev Saluja. Discovering all most specific sentences by randomized algorithms. In ICDT, pages 215--229, 1997.
[13] Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo. Efficient algorithms for discovering association rules. In Usama M. Fayyad and Ramasamy Uthurusamy, editors, AAAI Workshop on Knowledge Discovery in Databases (KDD-94), pages 181--192, Seattle, Washington, 1994. AAAI Press.
[14] M. Matsumoto and T. Nishimura. Mersenne twister: A 623-dimensionally equidistributed uniform pseudorandom number generator. ACM Trans. on Modeling and Computer Simulation, 8(1):3--30, 1998.
[15] R. J. Bayardo Jr. Efficiently mining long patterns from databases. In Proceedings of the ACM SIGMOD, pages 85--93, Seattle, Washington, 1998.
[16] V. J. Rayward-Smith, I. H. Osman, C. R. Reeves, and G. D. Smith. Modern Heuristic Search Methods. John Wiley and Sons, New York, 1996.
[17] M. Yagiura and T. Ibaraki. On metaheuristic algorithms for combinatorial optimization problems. Transactions of the Institute of Electronics, Information and Communication Engineers, J83-D-1(1):3--25.

Cited By

  • (2024) A Parallel “Go with the Winners” Algorithm for Some Scheduling Problems. Journal of Applied and Industrial Mathematics 17:4 (687-697). DOI: 10.1134/S1990478923040014. Online publication date: 16-Feb-2024

Published In

SAC '05: Proceedings of the 2005 ACM symposium on Applied computing
March 2005
1814 pages
ISBN:1581139640
DOI:10.1145/1066677
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

SAC05: The 2005 ACM Symposium on Applied Computing
March 13 - 17, 2005
Santa Fe, New Mexico

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
