short-paper

Near Real-time Geolocation Prediction in Twitter Streams via Matrix Factorization Based Regression

Authors:
Nghia Duong-Trung, University of Hildesheim, Hildesheim, Germany
Nicolas Schilling, University of Hildesheim, Hildesheim, Germany
Lars Schmidt-Thieme, University of Hildesheim, Hildesheim, Germany
CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, October 2016, Pages 1973–1976. https://doi.org/10.1145/2983323.2983887

Published: 24 October 2016

ABSTRACT

Previous research on content-based geolocation has generally built prediction methods by pre-partitioning and then applying classification methods. The input to these methods is the concatenation of individual tweets over a period of time. These methods have two drawbacks: they discard the natural real-valued properties of latitude and longitude, and they fail to capture geolocation in near real-time. In this work, we develop a novel generative content-based regression model via a matrix factorization technique to tackle the near real-time geolocation prediction problem. With this model, we aim to address two unanswered questions. First, we show that near real-time geolocation prediction can be accomplished if the concatenation is left out. Second, we account for the real-valued properties of physical coordinates within a regression solution. We apply our model to Twitter datasets as an example to demonstrate its effectiveness and generality. Our experimental results show that the proposed model, in the best scenario, outperforms a set of state-of-the-art regression models, including Support Vector Machines and Factorization Machines, reducing the median localization error by up to 79%.
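The evaluation measure named in the abstract, median localization error, can be illustrated with a minimal sketch. The abstract does not say how distances are computed, so this example assumes great-circle (haversine) distance in kilometres on a spherical Earth; the function names `haversine_km`, `median_localization_error`, and `relative_reduction` are hypothetical and not taken from the paper.

```python
import math
from statistics import median

# Assumption of this sketch: a spherical Earth with the mean radius in km.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def median_localization_error(predicted, actual):
    """Median distance between predicted and true (lat, lon) pairs."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [haversine_km(p[0], p[1], t[0], t[1])
              for p, t in zip(predicted, actual)]
    return median(errors)


def relative_reduction(baseline_error, model_error):
    """Fraction by which model_error improves on baseline_error."""
    return (baseline_error - model_error) / baseline_error
```

Under these assumptions, a "reduction of up to 79%" corresponds to `relative_reduction(baseline, model)` reaching 0.79, where both arguments are median errors computed on the same test tweets. The median is less sensitive than the mean to a few predictions that land very far from the true location.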


Index Terms

Near Real-time Geolocation Prediction in Twitter Streams via Matrix Factorization Based Regression
1. Information systems
  1. Information systems applications
    1. Collaborative and social computing systems and tools
      1. Blogs
      2. Social networking sites
    2. Spatial-temporal systems
      1. Geographic information systems
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Machine learning theory
      1. Models of learning

Published in
CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
October 2016, 2566 pages
ISBN: 9781450340731
DOI: 10.1145/2983323
General Chairs: Snehasis Mukhopadhyay (Indiana University Purdue University Indianapolis, USA), ChengXiang Zhai (University of Illinois at Urbana-Champaign, USA)
Program Chairs: Elisa Bertino (Purdue University), Fabio Crestani (University of Lugano), Javed Mostafa (University of North Carolina), Jie Tang (Tsinghua University), Luo Si (Alibaba Group Inc & Purdue University), Xiaofang Zhou (University of Queensland), Yi Chang (Yahoo Research), Yunyao Li (IBM Research - Almaden), Parikshit Sondhi (WalmartLabs)
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
geolocation
matrix factorization
regression
twitter
Acceptance Rates
CIKM '16 Paper Acceptance Rate: 160 of 701 submissions, 23%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%