research-article

Automatic triage for a photo series

Authors:
Huiwen Chang (Princeton University),
Fisher Yu (Princeton University),
Jue Wang (Adobe Research),
Douglas Ashley (Princeton University),
Adam Finkelstein (Princeton University)

ACM Transactions on Graphics, Volume 35, Issue 4, Article No. 148, pp. 1–10. https://doi.org/10.1145/2897824.2925908

Published: 11 July 2016

Abstract

People often take a series of nearly redundant pictures to capture a moment or scene. However, selecting photos to keep or share from a large collection is a painful chore. To address this problem, we seek a relative quality measure within a series of photos taken of the same scene, which can be used for automatic photo triage. Toward this end, we gather a large dataset composed of photo series distilled from personal photo albums. The dataset contains 15,545 unedited photos organized in 5,953 series. By augmenting this dataset with ground truth human preferences among photos within each series, we establish a benchmark for measuring the effectiveness of algorithmic models of how people select photos. We introduce several new approaches for modeling human preference based on machine learning. We also describe applications for the dataset and predictor, including a smart album viewer, automatic photo enhancement, and providing overviews of video clips.

Supplemental Material

a148.mp4 (MP4, 218.4 MB)

Index Terms

- Computing methodologies
- Information systems
  - World Wide Web
    - Web searching and information discovery
      - Content ranking

Published in

ACM Transactions on Graphics, Volume 35, Issue 4
July 2016
1396 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2897824

Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
benchmark
photo quality
photo triage
Qualifiers
- research-article