DOI: 10.1145/1816041.1816096
poster

An efficient method for face retrieval from large video datasets

Published: 05 July 2010

ABSTRACT

The human face is one of the most important objects in video: it provides rich information for spotting people of interest, such as government leaders in news video or the hero of a movie, and is the basis for interpreting facts. Detecting and recognizing faces in video are therefore essential tasks in many video indexing and retrieval applications. Robust face matching remains challenging because of large variations in pose, illumination, occlusion, hairstyle, and facial expression. In addition, when the dataset holds a huge number of faces, e.g. tens of millions, a scalable matching method is needed. To this end, we propose an efficient method for face retrieval in large video datasets. To make retrieval robust, the faces of the same person appearing within each shot are grouped into a single face track using a reliable tracking method. Retrieval is done by computing the similarity between the input face track and the face tracks in the database. Each face track is summarized by one representative face, the mean face of a subset selected from the original track, and the similarity between two face tracks is the similarity between their representative faces. In this way, we achieve high retrieval accuracy while keeping computational cost low. For experiments, we extracted approximately 20 million faces from 370 hours of TRECVID video, a scale that previous work has not addressed. Results on a manually annotated subset of 457,320 faces show that the proposed method is effective and scalable.
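The representative-face idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the subset is selected, which features describe a face, or which similarity measure is used. The sketch assumes each face is already a fixed-length feature vector, picks an evenly spaced subset, and uses negative Euclidean distance as the similarity. The names `representative_face`, `build_index`, and `retrieve` are hypothetical.

```python
import math


def representative_face(track, subset_size=3):
    """Return the mean feature vector of a subset of a face track.

    `track` is a list of equal-length feature vectors, one per face.
    Evenly spaced sampling is an assumption; the abstract only says
    that a subset of the track is selected.
    """
    if not track:
        raise ValueError("empty face track")
    k = min(subset_size, len(track))
    indices = [(i * len(track)) // k for i in range(k)]
    subset = [track[i] for i in indices]
    dim = len(subset[0])
    return [sum(face[d] for face in subset) / k for d in range(dim)]


def similarity(rep_a, rep_b):
    """Assumed measure: negative Euclidean distance (larger is more similar)."""
    return -math.dist(rep_a, rep_b)


def build_index(database):
    """Precompute one representative face per database track."""
    return {name: representative_face(track) for name, track in database.items()}


def retrieve(query_track, index, top_k=1):
    """Rank database tracks by similarity to the query's representative face."""
    query_rep = representative_face(query_track)
    scored = sorted(
        ((similarity(query_rep, rep), name) for name, rep in index.items()),
        reverse=True,
    )
    return [name for _, name in scored[:top_k]]


# Toy data: two database tracks with 2-D features (illustrative values only).
database = {
    "track_a": [[0.0, 0.0], [0.2, 0.0], [0.4, 0.0]],
    "track_b": [[5.0, 5.0], [5.0, 5.0]],
}
index = build_index(database)
query = [[0.1, 0.1], [0.3, 0.1]]
best = retrieve(query, index, top_k=1)
```

Because each track is reduced to a single vector, matching a query against N database tracks costs N vector comparisons, regardless of how many faces each track contains.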


      • Published in

        CIVR '10: Proceedings of the ACM International Conference on Image and Video Retrieval
        July 2010
        492 pages
        ISBN:9781450301176
        DOI:10.1145/1816041

        Copyright © 2010 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States



