Visual object detection with deformable part models

Authors:
Pedro Felzenszwalb, Brown University
Ross Girshick, EECS, UC Berkeley
David McAllester, Toyota Technological Institute at Chicago
Deva Ramanan, UC Irvine

Communications of the ACM, Volume 56, Issue 9, September 2013, pp. 97–105. https://doi.org/10.1145/2494532

Published: 1 September 2013

Abstract

We describe a state-of-the-art system for finding objects in cluttered images. Our system is based on deformable models that represent objects using local part templates and geometric constraints on the locations of parts. We reduce object detection to classification with latent variables. The latent variables introduce invariances that make it possible to detect objects with highly variable appearance. We use a generalization of support vector machines to incorporate latent information during training. This has led to a general framework for discriminative training of classifiers with latent variables. Discriminative training benefits from large training datasets. In practice we use an iterative algorithm that alternates between estimating latent values for positive examples and solving a large convex optimization problem. Practical optimization of this large convex problem can be done using active set techniques for adaptive subsampling of the training data.
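The alternation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy feature vectors, and the plain subgradient solver are assumptions made for illustration. Each example is a list of candidate feature vectors, one per latent value z (for instance, one per part placement), and the score is the maximum of w . phi(x, z) over z. Fixing z for the positives while keeping the max for the negatives leaves a convex problem in w. The active set techniques for subsampling the training data are omitted here.

```python
# Illustrative sketch only: all names and the solver choice are assumptions.

def dot(w, v):
    return sum(wi * vi for wi, vi in zip(w, v))

def best_latent(w, candidates):
    """Index z of the candidate feature vector phi(x, z) with the highest score."""
    return max(range(len(candidates)), key=lambda z: dot(w, candidates[z]))

def score(w, candidates):
    """Detection score: max over latent values z of w . phi(x, z)."""
    return max(dot(w, c) for c in candidates)

def solve_convex_step(w, fixed_pos, negatives, C, iters, lr):
    """Subgradient descent on 0.5*|w|^2 + C * (sum of hinge losses).

    Positives use their fixed latent choice (linear in w); negatives keep
    the max over z (convex in w), so the whole objective is convex in w.
    """
    w = list(w)
    for _ in range(iters):
        grad = list(w)  # gradient of the regularizer 0.5*|w|^2
        for phi in fixed_pos:
            if dot(w, phi) < 1.0:  # positive inside the margin
                grad = [g - C * p for g, p in zip(grad, phi)]
        for cands in negatives:
            phi = cands[best_latent(w, cands)]
            if dot(w, phi) > -1.0:  # negative inside the margin
                grad = [g + C * p for g, p in zip(grad, phi)]
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

def train_latent_svm(positives, negatives, dim, C=10.0,
                     outer_iters=3, inner_iters=2000, lr=0.002):
    """Alternate between latent estimation for positives and a convex solve."""
    w = [0.0] * dim  # with w = 0 the first latent choice is simply index 0
    for _ in range(outer_iters):
        # Step 1: estimate latent values for positive examples under current w.
        fixed_pos = [cands[best_latent(w, cands)] for cands in positives]
        # Step 2: solve the resulting convex problem, warm-started from w.
        w = solve_convex_step(w, fixed_pos, negatives, C, inner_iters, lr)
    return w

if __name__ == "__main__":
    # Toy data: features are [x, bias]; each example has two candidate placements.
    positives = [[[1.0, 1.0], [2.0, 1.0]], [[1.5, 1.0], [3.0, 1.0]]]
    negatives = [[[0.0, 1.0], [-1.0, 1.0]], [[0.2, 1.0], [-0.5, 1.0]]]
    w = train_latent_svm(positives, negatives, dim=2)
    print("weights:", w)
    print("positive scores:", [score(w, p) for p in positives])
    print("negative scores:", [score(w, n) for n in negatives])
```

Because the first latent assignment is made with w = 0, the result depends on how that initial choice falls; the sketch makes no attempt at a careful initialization.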


Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding
  2. Computer graphics
    1. Image manipulation
2. Information systems
  1. Data management systems
    1. Database management system engines
  2. Information retrieval
    1. Document representation

Published in

Communications of the ACM Volume 56, Issue 9
September 2013
97 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/2500468

Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States