ACM Home Page
Please provide us with feedback. Feedback
Online and batch learning of pseudo-metrics
Full text PdfPdf (193 KB)
Source ACM International Conference Proceeding Series; Vol. 69 archive
Proceedings of the twenty-first international conference on Machine learning table of contents
Banff, Alberta, Canada
Page: 94  
Year of Publication: 2004
ISBN:1-58113-828-5
Authors
Shai Shalev-Shwartz  The Hebrew University
Yoram Singer  The Hebrew University
Andrew Y. Ng  Stanford University
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 51,   Citation Count: 4
Additional Information:

abstract   references   cited by   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1015330.1015376
What is a DOI?

ABSTRACT

We describe and analyze an online algorithm for supervised learning of pseudo-metrics. The algorithm receives pairs of instances and predicts their similarity according to a pseudo-metric. The pseudo-metrics we use are quadratic forms parameterized by positive semi-definite matrices. The core of the algorithm is an update rule that is based on successive projections onto the positive semi-definite cone and onto half-space constraints imposed by the examples. We describe an efficient procedure for performing these projections, derive a worst case mistake bound on the similarity predictions, and discuss a dual version of the algorithm in which it is simple to incorporate kernel operators. The online algorithm also serves as a building block for deriving a large-margin batch algorithm. We demonstrate the merits of the proposed approach by conducting experiments on MNIST dataset and on document filtering.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions in Information Theory, IT-13, 21--27.
 
3
Cox, T., & Cox, M. (1994). Multidimensional scaling. Chapman and Hall, London.
 
4
Crammer, K., Dekel, O., Shalev-Shwartz, S., & Singer, Y. (2003). Online passive aggressive algorithms. Advances in Neural Information Processing Systems 16.
 
5
 
6
Golub, G., & Loan, C. V. (1989). Matrix computations. John Hopkins University Press.
 
7
 
8
 
9
MacQueen, J. (1965). On convergence of k-means and partitions with minimum average variance. Ann. Math. Statist.
 
10
 
11
12
 
13
 
14
 
15
Xing, E., Ng, A. Y., Jordan, M., & Russell, S. (2003). Distance metric learning, with application to clustering with side-information. Advances in Neural Information Processing Systems 15.

Collaborative Colleagues:
Shai Shalev-Shwartz: colleagues
Yoram Singer: colleagues
Andrew Y. Ng: colleagues