ABSTRACT
Previous research on content-based geolocation has generally built prediction methods by pre-partitioning and then applying classification methods. The input to these methods is the concatenation of individual tweets over a period of time. These methods have two drawbacks: they discard the natural real-valued properties of latitude and longitude, and they fail to capture geolocation in near real time. In this work, we develop a novel generative content-based regression model, built on a matrix factorization technique, to tackle the near real-time geolocation prediction problem. With this model, we aim to address two unanswered questions. First, we show that near real-time geolocation prediction can be accomplished if the concatenation step is left out. Second, we account for the real-valued properties of physical coordinates within a regression solution. We apply our model to Twitter datasets as an example to demonstrate its effectiveness and generality. Our experimental results show that the proposed model, in the best scenario, outperforms a set of state-of-the-art regression models, including Support Vector Machines and Factorization Machines, reducing the median localization error by up to 79%.
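As an illustration of the evaluation measure named above, the following is a minimal sketch of how a median localization error could be computed for real-valued coordinate predictions. The abstract does not state which distance measure is used; the great-circle (haversine) distance, the Earth radius constant, and the function names `haversine_km` and `median_localization_error_km` are all assumptions made for this example, not details taken from the paper.

```python
import math
from statistics import median

# Mean Earth radius in kilometres (an assumption for this sketch).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def median_localization_error_km(predicted, actual):
    """Median distance between predicted and true (lat, lon) pairs."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal in length")
    errors = [haversine_km(p[0], p[1], t[0], t[1])
              for p, t in zip(predicted, actual)]
    return median(errors)
```

Under these assumptions, a regression model that outputs one (latitude, longitude) pair per tweet can be scored directly with `median_localization_error_km`, without mapping predictions back from grid cells as a classification approach would require.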
Index Terms
- Near Real-time Geolocation Prediction in Twitter Streams via Matrix Factorization Based Regression