SEVA: sensor-enhanced video annotation

Authors:

Prashant Shenoy

MULTIMEDIA '05: Proceedings of the 13th annual ACM international conference on Multimedia

Pages 618 - 627

https://doi.org/10.1145/1101149.1101290

Published: 06 November 2005

Abstract

In this paper, we study how a sensor-rich world can be exploited by digital recording devices such as cameras and camcorders to improve a user's ability to search through a large repository of image and video files. We design and implement a digital recording system that records identities and locations of objects (as advertised by their sensors) along with visual images (as recorded by a camera). The process, which we refer to as sensor-enhanced video annotation (SEVA), combines a series of correlation, interpolation, and extrapolation techniques. It produces a tagged stream that can later be used to efficiently search for videos or frames containing particular objects or people. We present detailed experiments with a prototype of our system using both stationary and mobile objects as well as GPS and ultrasound. Our experiments show that: (i) SEVA has zero error rates for static objects, except very close to the boundary of the viewable area; (ii) for moving objects or a moving camera, SEVA misses objects leaving or entering the viewable area by only 1-2 frames; (iii) SEVA can scale to 10 fast-moving objects using current sensor technology; and (iv) SEVA runs online using relatively inexpensive hardware.
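The abstract does not give the algorithms themselves, so the following is only a minimal sketch of the general idea of interpolating sensor-reported locations to frame timestamps and tagging the frames in which an object falls inside the viewable area. Everything in it is an assumption made for illustration: the flat 2-D coordinates, the linear interpolation, the wedge-shaped viewable area defined by a heading, field-of-view angle, and maximum range, and all names (`Reading`, `interpolate`, `in_view`, `tag_frames`). It is not the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Reading:
    """One location report from an object's sensor (hypothetical format)."""
    t: float  # timestamp in seconds
    x: float  # position on an assumed flat 2-D plane
    y: float


def interpolate(a: Reading, b: Reading, t: float) -> Tuple[float, float]:
    """Linearly interpolate the object's position at time t between a and b."""
    if b.t == a.t:
        return a.x, a.y
    w = (t - a.t) / (b.t - a.t)
    return a.x + w * (b.x - a.x), a.y + w * (b.y - a.y)


def in_view(cam_xy: Tuple[float, float], heading_deg: float, fov_deg: float,
            max_range: float, obj_xy: Tuple[float, float]) -> bool:
    """Return True if obj_xy lies inside the assumed wedge-shaped viewable area."""
    dx = obj_xy[0] - cam_xy[0]
    dy = obj_xy[1] - cam_xy[1]
    dist = math.hypot(dx, dy)
    if dist > max_range:
        return False
    if dist == 0.0:
        return True
    bearing = math.degrees(math.atan2(dy, dx))
    # Signed angular offset from the camera heading, wrapped to [-180, 180).
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= fov_deg / 2.0


def tag_frames(frame_times: Sequence[float], a: Reading, b: Reading,
               cam_xy: Tuple[float, float], heading_deg: float,
               fov_deg: float, max_range: float) -> List[bool]:
    """For each frame timestamp, report whether the object is in view."""
    return [
        in_view(cam_xy, heading_deg, fov_deg, max_range, interpolate(a, b, t))
        for t in frame_times
    ]
```

In a sketch like this, the per-frame flags could be stored alongside the object's identity to form a searchable tag stream; a real system would also need to handle sensor error and clock offsets, which are omitted here.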

Index Terms

SEVA: sensor-enhanced video annotation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
2. Information systems
  1. Information retrieval
    1. Document representation

Published In

MULTIMEDIA '05: Proceedings of the 13th annual ACM international conference on Multimedia

November 2005

1110 pages

ISBN:1595930442

DOI:10.1145/1101149

General Chairs:
Hongjiang Zhang (Microsoft Research Asia, China)
Tat-Seng Chua (National University of Singapore, Singapore)

Program Chairs:
Ralf Steinmetz (Technische Universität Darmstadt, Germany)
Mohan Kankanhalli (National University of Singapore, Singapore)
Lynn Wilcox (FXPAL)

Copyright © 2005 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM05: 2005 13th Annual ACM International Conference on Multimedia

November 6 - 11, 2005

Hilton, Singapore

Acceptance Rates

MULTIMEDIA '05 paper acceptance rate: 49 of 312 submissions (16%)

Overall acceptance rate: 2,145 of 8,556 submissions (25%)
