ABSTRACT
We present a recognition-based user tracking and augmented reality system that works in extremely large-scale areas. A user who captures an image of a building facade receives the precise location of the building and augmented information about it. Although GPS cannot provide the camera pose, it is needed to narrow the search range in the image database. A patch-retrieval method is used for efficient computation and real-time camera pose recovery. With the patch matches as prior information, whole-image matching can be done efficiently through propagation, so that a more stable camera pose can be generated. Augmented information such as building names and locations is then delivered to the user. The proposed system contains two main parts: offline database building and online user tracking. The database is composed of images of different locations of interest. The locations are clustered into groups according to their UTM coordinates. An overlapped clustering method is used in order to restrict the retrieval range and avoid ping-pong effects. For each cluster, a vocabulary tree is built for finding the most similar view. In the tracking part, the rough location of the user is obtained from GPS, and the exact location and camera pose are calculated by querying patches of the captured image. Because matching is patch-based, tracking is robust to occlusions and dynamic elements in the scene. Moreover, due to the overlapped clusters, the system simulates the "soft handoff" feature and avoids frequent swaps of memory resources. Experiments show that the proposed tracking and augmented reality system is efficient and robust in many cases.
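The overlapped clustering and "soft handoff" idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the circular cluster shape, the `radius` field, and the names `Cluster`, `select_cluster`, `track`, and `count_swaps` are all assumptions made for illustration. The sketch keeps the currently loaded cluster for as long as it still covers the GPS fix and switches only when the user leaves it, which is one simple way to avoid ping-pong effects near a boundary.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Cluster:
    """Hypothetical circular cluster of database locations (illustration only)."""
    name: str
    easting: float   # UTM easting of the cluster centre, in metres
    northing: float  # UTM northing of the cluster centre, in metres
    radius: float    # coverage radius in metres; neighbouring circles overlap

    def distance(self, easting: float, northing: float) -> float:
        return math.hypot(easting - self.easting, northing - self.northing)

    def covers(self, easting: float, northing: float) -> bool:
        return self.distance(easting, northing) <= self.radius


def select_cluster(clusters: Sequence[Cluster], current: Optional[Cluster],
                   easting: float, northing: float) -> Optional[Cluster]:
    """Pick the cluster whose data (e.g. its vocabulary tree) should be loaded."""
    # Soft handoff: stay with the current cluster while it still covers the fix,
    # even if another cluster's centre is closer.
    if current is not None and current.covers(easting, northing):
        return current
    # Otherwise switch to the nearest cluster that covers the fix, if any.
    covering = [c for c in clusters if c.covers(easting, northing)]
    if not covering:
        return None
    return min(covering, key=lambda c: c.distance(easting, northing))


def track(clusters: Sequence[Cluster],
          fixes: Sequence[Tuple[float, float]]) -> List[Optional[Cluster]]:
    """Apply select_cluster along a sequence of (easting, northing) GPS fixes."""
    selected: List[Optional[Cluster]] = []
    current: Optional[Cluster] = None
    for easting, northing in fixes:
        current = select_cluster(clusters, current, easting, northing)
        selected.append(current)
    return selected


def count_swaps(selected: Sequence[Optional[Cluster]]) -> int:
    """Number of times the selected cluster changes between consecutive fixes."""
    return sum(1 for a, b in zip(selected, selected[1:]) if a != b)
```

With two clusters whose circles overlap, a user wandering inside the overlap zone triggers no reload under this rule, whereas always choosing the nearest centre would switch every time the user crosses the midpoint. How the paper actually forms and sizes its overlapped clusters is not stated in the abstract, so the geometry here should be read as a placeholder.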
Index Terms
- GPS-aided recognition-based user tracking system with augmented reality in extreme large-scale areas