Maximum normalized spacing for efficient visual clustering

Authors:
Zhi-Gang Fan, Yadong Wu, Bo Wu

SHARP Electronics (Shanghai) Co. Ltd, Shanghai, China

CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management, October 2010, Pages 409–418. https://doi.org/10.1145/1871437.1871492

Published: 26 October 2010

ABSTRACT

We propose Maximum Normalized Spacing (MNS), a simple distance metric learning method for efficiently clustering visual image data that have arbitrary mixture distributions. MNS is a generalized principle based on Maximum Spacing [12] and the Minimum Spanning Tree (MST). The proposed Normalized Spacing (NS) can be viewed as an adaptive distance metric, that is, a contextual dissimilarity measure that takes into account the local distribution of the data vectors. Image clustering is difficult because multiple nonlinear manifolds are embedded in the data space, and many existing clustering methods fail to learn the whole structure of these manifolds, which limits their effectiveness. By combining the internal and external statistics of clusters to capture the density structure of the manifolds, MNS can solve the clustering problem efficiently and effectively for complex multi-manifold datasets in arbitrary metric spaces. We apply MNS to the practical problem of multi-view image clustering and obtain good results that are helpful for image browsing systems. Experiments on the COIL-20 [19] and COIL-100 [18] multi-view image databases demonstrate the effectiveness of the proposed MNS clustering method and show that it is more efficient than traditional clustering methods.
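
For context, the classical Maximum Spacing principle [12] that MNS generalizes can be sketched as below. This is a minimal illustration and not the authors' MNS method: it runs Kruskal's MST construction and stops once k components remain, which gives the k-clustering with maximum spacing in the plain, unnormalized sense. The function name `max_spacing_clusters`, the use of Euclidean distance, and the toy points are assumptions made for the example; the normalization that defines NS is not specified in the abstract and is not reproduced here.

```python
import math
from itertools import combinations


def max_spacing_clusters(points, k):
    """Classical maximum-spacing k-clustering (hypothetical helper).

    Runs Kruskal's algorithm on the complete Euclidean graph and stops
    when k components remain. Returns (labels, spacing), where spacing is
    the smallest distance between points in different clusters
    (None when k == 1).
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(points)")

    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Find the component root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by increasing distance.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    components = n
    spacing = None
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # edge lies inside an existing cluster
        if components == k:
            spacing = d  # first edge that would merge two final clusters
            break
        parent[ri] = rj
        components -= 1

    labels = [find(i) for i in range(n)]
    return labels, spacing


# Toy data (assumed for illustration): two well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels, toy_spacing = max_spacing_clusters(toy_points, k=2)
```

Because the stopping rule looks only at raw edge lengths, this baseline ignores local density. The abstract describes NS as adapting the distance to the local distribution of the data, which is the part this sketch leaves out.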

References

[1] D. Cai, X. He, Z. Li, W. Y. Ma, and J. R. Wen. Hierarchical clustering of WWW image search results using visual, textual and link information. ACM Multimedia 2004, pages 952–959, 2004.
[2] B. Chazelle. A minimum spanning tree algorithm with inverse-Ackermann type complexity. Journal of the ACM, 47:1028–1047, 2000.
[3] Y. Chen, J. Z. Wang, and R. Krovetz. CLUE: cluster-based retrieval of images by unsupervised learning. IEEE Transactions on Image Processing, 14:1187–1201, 2005.
[4] J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. OSDI 2004, pages 137–150, 2004.
[5] C. Ding, X. He, H. Zha, M. Gu, and H. D. Simon. A min-max cut algorithm for graph partitioning and data clustering. ICDM 2001, pages 107–114, 2001.
[6] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern classification. John Wiley & Sons Inc., 2nd edition, 2001.
[7] Z. G. Fan, J. Li, B. Wu, and Y. Wu. Local patterns constrained image histograms for image retrieval. ICIP 2008, pages 941–944, 2008.
[8] P. F. Felzenszwalb and D. P. Huttenlocher. Efficient graph-based image segmentation. International Journal of Computer Vision, 59:167–181, 2004.
[9] M. Filippone, F. Camastra, F. Masulli, and S. Rovetta. A survey of kernel and spectral methods for clustering. Pattern Recognition, 41:176–190, 2008.
[10] B. J. Frey and D. Dueck. Clustering by passing messages between data points. Science, 315:972–976, 2007.
[11] A. K. Jain, M. N. Murty, and P. J. Flynn. Data clustering: a review. ACM Comput. Surv., 31:264–323, 1999.
[12] J. Kleinberg and E. Tardos. Algorithm design. Addison Wesley, 1st edition, 2005.
[13] A. Levin, D. Lischinski, and Y. Weiss. A closed form solution to natural image matting. CVPR 2006, 1:61–68, 2006.
[14] J. Li and J. Z. Wang. Automatic linguistic indexing of pictures by a statistical modeling approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25:1075–1088, 2003.
[15] J. Lim, J. Ho, M. Yang, K. Lee, and D. Kriegman. Image clustering with metric, local linear structure and affine symmetry. ECCV 2004, 1:456–468, 2004.
[16] Y. Ma, H. Derksen, W. Hong, and J. Wright. Segmentation of multivariate mixed data via lossy data coding and compression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29:1546–1562, 2007.
[17] A. McCallum, K. Nigam, and L. Ungar. Efficient clustering of high-dimensional data sets with application to reference matching. KDD 2000, pages 169–178, 2000.
[18] S. A. Nene, S. K. Nayar, and H. Murase. Columbia object image library (COIL-100). Technical Report CUCS-006-96, 1996.
[19] S. A. Nene, S. K. Nayar, and H. Murase. Columbia object image library (COIL-20). Technical Report CUCS-005-96, 1996.
[20] A. Ng, M. Jordan, and Y. Weiss. On spectral clustering: analysis and an algorithm. NIPS 2001, pages 849–856, 2001.
[21] P. Pudil, J. Novovicova, and J. Kittler. Floating search methods in feature selection. Pattern Recognition Letters, 15:1119–1125, 1994.
[22] S. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290:2323–2326, 2000.
[23] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:888–905, 2000.
[24] R. Souvenir and R. Pless. Manifold clustering. ICCV 2005, 1:648–653, 2005.
[25] J. Tenenbaum, V. de Silva, and J. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290:2319–2323, 2000.
[26] K. Q. Weinberger and L. K. Saul. Unsupervised learning of image manifolds by semidefinite programming. International Journal of Computer Vision, 70:77–90, 2006.
[27] Y. Weiss. Segmentation using eigenvectors: a unifying view. ICCV 1999, 1:975–982, 1999.
[28] R. Xu and D. Wunsch. Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16:645–678, 2005.
[29] Y. Xu, V. Olman, and D. Xu. Clustering gene expression data using a graph-theoretic approach: an application of minimum spanning trees. Bioinformatics, 18:536–545, 2002.
[30] D. Yankov and E. Keogh. Manifold clustering of shapes. ICDM 2006, 1:1167–1171, 2006.
[31] C. T. Zahn. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Transactions on Computers, 20:68–86, 1971.
[32] L. Zelnik-Manor and P. Perona. Self-tuning spectral clustering. NIPS 2004, pages 1601–1608, 2004.
[33] S. Zhang, C. Shi, Z. Zhang, and Z. Shi. A global geometric approach for image clustering. ICPR 2006, 4:960–963, 2006.
[34] Y. Zhao and G. Karypis. Evaluation of hierarchical clustering algorithms for document datasets. CIKM 2002, pages 515–524, 2002.

Index Terms

Computing methodologies → Machine learning → Learning paradigms → Unsupervised learning → Cluster analysis
Mathematics of computing → Discrete mathematics → Graph theory

Published in
CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
October 2010, 2036 pages
ISBN: 9781450300995
DOI: 10.1145/1871437
General Chair: Jimmy Huang (York University, Canada)
Program Chairs: Nick Koudas (University of Toronto, Canada), Gareth Jones (Dublin City University, Ireland), Xindong Wu (University of Vermont, USA), Kevyn Collins-Thompson (Microsoft Research, USA), Aijun An (York University, Canada)
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
data clustering
data mining
distance metric learning
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%