skip to main content
10.5555/1109557.1109685acmconferencesArticle/Chapter ViewAbstractPublication PagessodaConference Proceedingsconference-collections
Article

The space complexity of pass-efficient algorithms for clustering

Published:22 January 2006Publication History

ABSTRACT

We present multiple pass streaming algorithms for a basic clustering problem for massive data sets. If our algorithm is allotted 2l passes, it will produce an approximation with error at most ε using Õ(k32/l) bits of memory, the most critical resource for streaming computation. We demonstrate that this tradeoff between passes and memory allotted is intrinsic to the problem and model of computation by proving lower bounds on the memory requirements of any l pass randomized algorithm that are nearly matched by our upper bounds. To the best of our knowledge, this is the first time nearly matching bounds have been proved for such an exponential tradeoff for randomized computation.In this problem, we are given a set of n points drawn randomly according to a mixture of k uniform distributions and wish to approximate the density function of the mixture. The points are placed in a datastream (possibly in adversarial order), which may only be read sequentially by the algorithm. We argue that this models, among others, the datastream produced by a national census of the incomes of all citizens.

References

References are not available

Index Terms

  1. The space complexity of pass-efficient algorithms for clustering

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            SODA '06: Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
            January 2006
            1261 pages
            ISBN:0898716055

            Publisher

            Society for Industrial and Applied Mathematics

            United States

            Publication History

            • Published: 22 January 2006

            Check for updates

            Qualifiers

            • Article

            Acceptance Rates

            Overall Acceptance Rate411of1,322submissions,31%

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader