DOI: 10.1145/1315184.1315187

Real-time tracking of visually attended objects in interactive virtual environments

Published: 05 November 2007

Abstract

This paper presents a real-time framework for computationally tracking the objects a user visually attends to while navigating interactive virtual environments. In addition to conventional bottom-up (stimulus-driven) features, the framework uses top-down (goal-directed) contexts to predict the human gaze. The framework first builds feature maps from preattentive features such as luminance, hue, depth, size, and motion. The feature maps are then integrated into a single saliency map using the center-surround difference operation. This pixel-level, bottom-up saliency map is converted into an object-level saliency map using the item buffer. Finally, top-down contexts, inferred from the user's spatial and temporal behaviors during interactive navigation, are used to select the most plausibly attended object among the candidates produced by the object saliency map. The framework was implemented on the GPU and exhibited very fast performance (5.68 msec for a 256×256 saliency map), substantiating its adequacy for interactive virtual environments. A user experiment was also conducted to evaluate the prediction accuracy of the visual attention tracking framework against actual human gaze data. The attained accuracy, particularly with the addition of top-down contextual information, agreed well with theories of human cognition on visually identifying single and multiple attentive targets. The framework can be used for perceptually based rendering without an expensive eye tracker, for example to provide depth-of-field effects or to manage level of detail in virtual environments.
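
To make the bottom-up stage concrete, below is a minimal CPU sketch in Python/NumPy of the two steps the abstract describes: per-feature center-surround differences fused into a pixel-level saliency map, and object-level pooling via an item buffer (a per-pixel object-ID image). All function names, filter sizes, and the box-filter approximation of the center and surround responses are illustrative assumptions; the paper's actual implementation runs on the GPU and is not reproduced here.

```python
# Minimal sketch of the bottom-up saliency stage described in the
# abstract. Filter sizes, normalization, and the use of box filters
# for the center/surround responses are assumptions for illustration,
# not the paper's GPU implementation.
import numpy as np
from scipy.ndimage import uniform_filter

def center_surround(feature, center=3, surround=15):
    """Conspicuity of one feature map: absolute difference between a
    fine (center) and a coarse (surround) local average."""
    return np.abs(uniform_filter(feature, size=center) -
                  uniform_filter(feature, size=surround))

def pixel_saliency(feature_maps):
    """Fuse per-feature conspicuity maps (e.g. luminance, hue, depth,
    size, motion) into one normalized pixel-level saliency map."""
    sal = sum(center_surround(f) for f in feature_maps) / len(feature_maps)
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)

def object_saliency(saliency, item_buffer):
    """Pool pixel saliency per object using the item buffer, which
    stores an object ID at every pixel (0 = background)."""
    return {int(i): float(saliency[item_buffer == i].mean())
            for i in np.unique(item_buffer) if i != 0}

# Toy usage: two random "feature maps" over a 256x256 frame and a
# hand-made item buffer containing two objects.
rng = np.random.default_rng(0)
features = [rng.random((256, 256)) for _ in range(2)]
item_buffer = np.zeros((256, 256), dtype=np.int32)
item_buffer[40:100, 40:100] = 1      # object 1
item_buffer[150:220, 150:220] = 2    # object 2
scores = object_saliency(pixel_saliency(features), item_buffer)
print("most salient object:", max(scores, key=scores.get))
```

In the paper, a top-down stage then re-weights these object-level scores using contexts inferred from the user's navigation behavior before selecting the attended object; that inference step is not sketched here.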


Published In

VRST '07: Proceedings of the 2007 ACM Symposium on Virtual Reality Software and Technology
November 2007
259 pages
ISBN: 9781595938633
DOI: 10.1145/1315184

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. attention tracking
  2. bottom-up feature
  3. saliency map
  4. top-down context
  5. virtual environment
  6. visual attention

Qualifiers

  • Article

Conference

VRST07

Acceptance Rates

Overall Acceptance Rate 66 of 254 submissions, 26%

Cited By

  • (2014) GPU-Based Selective Sparse Sampling for Interactive High-Fidelity Rendering. 2014 6th International Conference on Games and Virtual Worlds for Serious Applications (VS-GAMES), 1-8. DOI: 10.1109/VS-Games.2014.7012159
  • (2012) Gazing at Games: An Introduction to Eye Tracking Control. Synthesis Lectures on Computer Graphics and Animation 5(1), 1-113. DOI: 10.2200/S00395ED1V01Y201111CGR014
  • (2012) Real-time tracking of humans and visualization of their future footsteps in public indoor environments. Multimedia Tools and Applications 59(1), 65-88. DOI: 10.1007/s11042-010-0691-z
  • (2011) Directing attention and influencing memory with visual saliency modulation. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1471-1480. DOI: 10.1145/1978942.1979158
  • (2011) Parallel implementation of a spatio-temporal visual saliency model. Journal of Real-Time Image Processing 6(1), 3-14. DOI: 10.1007/s11554-010-0164-7
  • (2010) Focus and context in mixed reality by modulating first order salient features. Proceedings of the 10th International Conference on Smart Graphics, 232-243. DOI: 10.5555/1894345.1894374
  • (2010) An empirical pipeline to derive gaze prediction heuristics for 3D action games. ACM Transactions on Applied Perception 8(1), 1-30. DOI: 10.1145/1857893.1857897
  • (2010) Visual attention & multi-cue fusion based human motion tracking method. 2010 Sixth International Conference on Natural Computation, 2044-2054. DOI: 10.1109/ICNC.2010.5584296
  • (2010) Focus and Context in Mixed Reality by Modulating First Order Salient Features. Smart Graphics, 232-243. DOI: 10.1007/978-3-642-13544-6_22
  • (2009) The whys, how tos, and pitfalls of user studies. ACM SIGGRAPH 2009 Courses, 1-205. DOI: 10.1145/1667239.1667264
