DOI: 10.1145/2072298.2071927
short-paper

Detecting and identifying people in mobile videos

Published: 28 November 2011

Abstract

In this paper, we propose a system capable of detecting and identifying people in videos captured by smartphones. We discuss the challenges of extending existing location-aware multimedia applications from annotating distant static landmarks to annotating dynamic people at close range with significant pose variations. We propose a hybrid video and RF tracking system that enables accurate observer and target localization, and we extract part models composed of Maximally Stable Color Regions for each target. These models efficiently detect possible positions of targets in the video, which then serve as dynamic landmarks for calibrating the camera orientation. Finally, the positions of all targets in the video are jointly estimated using both visual features and spatial constraints. Experiments show that our approach locates identified targets in video with significantly higher accuracy than back-projection using camera orientation estimates from accelerometers and magnetometers.
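
The geometric idea behind back-projection and landmark-based orientation calibration can be illustrated with a minimal sketch. It assumes a simplified 2D ground-plane pinhole model and is not the paper's implementation: the function names, the parameters `focal_px` and `cx`, and the circular averaging of per-target yaw estimates are all hypothetical choices made for illustration. Part-model detection and the joint estimation step are not shown.

```python
import math

def bearing(observer, target):
    """World-frame angle (radians, counterclockwise from +x) from observer to target."""
    return math.atan2(target[1] - observer[1], target[0] - observer[0])

def wrap(angle):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def project_column(observer, yaw, target, focal_px, cx):
    """Back-project a target's known position to an image column.

    Assumes a pinhole camera whose optical axis points along `yaw`;
    a target to the left of the axis lands at a smaller column.
    Returns None when the target is not in front of the camera.
    """
    phi = wrap(bearing(observer, target) - yaw)
    if abs(phi) >= math.pi / 2:
        return None
    return cx - focal_px * math.tan(phi)

def yaw_from_detection(observer, target, column, focal_px, cx):
    """Camera yaw implied by seeing a target of known position at `column`."""
    phi = math.atan((cx - column) / focal_px)
    return wrap(bearing(observer, target) - phi)

def calibrate_yaw(observer, targets, columns, focal_px, cx):
    """Combine per-target yaw estimates with a circular mean (an assumed choice)."""
    estimates = [
        yaw_from_detection(observer, t, u, focal_px, cx)
        for t, u in zip(targets, columns)
    ]
    s = sum(math.sin(a) for a in estimates)
    c = sum(math.cos(a) for a in estimates)
    return math.atan2(s, c)
```

In this sketch, a yaw reading taken directly from a compass would be passed to `project_column` as is, while `calibrate_yaw` replaces that reading with one inferred from targets that were detected in the frame, so that each detected person acts as a landmark of known position.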

Published In

MM '11: Proceedings of the 19th ACM international conference on Multimedia
November 2011
944 pages
ISBN: 9781450306164
DOI: 10.1145/2072298
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. camera calibration
  2. feature fusion
  3. people detection

Qualifiers

  • Short-paper

Conference

MM '11: ACM Multimedia Conference
November 28 - December 1, 2011
Scottsdale, Arizona, USA

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
