Sampling-based dimension reduction for subspace approximation

Published: 11 June 2007
DOI: 10.1145/1250790.1250884

Abstract

We give a randomized bi-criteria algorithm for the problem of finding a k-dimensional subspace that minimizes the L_p-error for given points, i.e., the p-th root of the sum of p-th powers of distances to the given points, for any p ≥ 1. Our algorithm runs in time Õ(mn · pk^3 (k/ε)^{2p}) and produces a subset of size Õ(pk^2 (k/ε)^{2p}) from the given points such that, with high probability, the span of these points gives a (1+ε)-approximation to the optimal k-dimensional subspace. We also show a dimension-reduction type of result for this problem, where we can efficiently find a subset of size Õ(pk^{2(p+1)} + (k/ε)^{p+2}) such that, with high probability, their span contains a k-dimensional subspace that gives a (1+ε)-approximation to the optimum. We prove similar results for the corresponding projective clustering problem, where we need to find multiple k-dimensional subspaces.
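
The objective in the abstract can be made concrete with a small sketch. The code below only evaluates the L_p-error of the span of a chosen subset of points; it is not the paper's sampling algorithm, and the function names (`orthonormal_basis`, `distance_to_span`, `lp_error`) and the toy data are illustrative assumptions, not taken from the source.

```python
import math

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis for the linear span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors already (numerically) in the span
            basis.append([wi / norm for wi in w])
    return basis

def distance_to_span(point, basis):
    """Euclidean distance from `point` to the span of an orthonormal `basis`."""
    residual = list(point)
    for b in basis:
        coeff = sum(ri * bi for ri, bi in zip(residual, b))
        residual = [ri - coeff * bi for ri, bi in zip(residual, b)]
    return math.sqrt(sum(ri * ri for ri in residual))

def lp_error(points, spanning_vectors, p):
    """p-th root of the sum of p-th powers of distances to the span."""
    basis = orthonormal_basis(spanning_vectors)
    return sum(distance_to_span(a, basis) ** p for a in points) ** (1.0 / p)

# Toy data (hypothetical): the first two points span the xy-plane, so the
# remaining points lie at distances 3 and 4 from that subspace.
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 4.0)]
subset = points[:2]

error_p1 = lp_error(points, subset, 1)  # 3 + 4 = 7
error_p2 = lp_error(points, subset, 2)  # sqrt(9 + 16) = 5
```

Larger values of p weight the farthest points more heavily, so the same subset can look better or worse depending on p.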

Published In

STOC '07: Proceedings of the thirty-ninth annual ACM symposium on Theory of computing
June 2007
734 pages
ISBN: 9781595936318
DOI: 10.1145/1250790
Publisher

Association for Computing Machinery, New York, NY, United States


Author Tag

subspace approximation

Qualifiers

Article

Conference

STOC07: Symposium on Theory of Computing
June 11 - 13, 2007
San Diego, California, USA

Acceptance Rates

Overall acceptance rate: 1,469 of 4,586 submissions (32%)
