poster

What blogs tell us about websites: a demographics study

Published: 09 February 2011

Abstract

One challenge for content providers on the Web is determining who consumes their content. For instance, online newspapers want to know who is reading their articles. Previous approaches have tried to determine such audience demographics by placing cookies on users' systems, or by directly asking consumers (e.g., through surveys). The first approach may make users uncomfortable, and the second is not scalable. In this paper we focus on determining the demographics of a Website's audience by analyzing the blogs that link to the Website. We analyze both the text of the blogs and the connectivity of the blog network to determine demographics such as whether a person "is married" or "has pets." Presumably, bloggers linking to sites also consume the content of those sites. Therefore, the discovered demographics for the bloggers can serve as a proxy set of demographics for a subset of the Website's consumers. We demonstrate that in many cases we can infer sub-audiences for a site from these demographics. Further, this feasibility result demonstrates that very specific demographics for sites can be generated as we improve the methods for determining them (e.g., finding people who play video games). In our study we analyze blogs from more than 590,000 bloggers, collected over a six-month period, that link to more than 488,000 distinct external websites.
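The proxy idea in the abstract, using the demographics of linking bloggers to stand in for part of a site's audience, can be illustrated with a minimal sketch. It is not the authors' method: it assumes the per-blogger attributes (e.g., "is married") have already been inferred by some classifier, and the function name `site_demographics` and both input mappings are hypothetical names chosen for illustration only.

```python
from collections import defaultdict


def site_demographics(blogger_attrs, blogger_links):
    """Aggregate per-blogger attributes into per-site proportions.

    blogger_attrs: blogger id -> set of inferred attribute labels (assumed given).
    blogger_links: blogger id -> iterable of external sites the blogger links to.
    Returns: site -> {attribute: fraction of linking bloggers with that attribute}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for blogger, sites in blogger_links.items():
        attrs = blogger_attrs.get(blogger, set())
        # Count each blogger once per site, even if they link to it repeatedly.
        for site in set(sites):
            totals[site] += 1  # bloggers with no known attributes still count
            for attr in attrs:
                counts[site][attr] += 1
    return {
        site: {attr: n / totals[site] for attr, n in counts[site].items()}
        for site in totals
    }


# Hypothetical toy data, not taken from the study.
attrs = {"a": {"is married"}, "b": {"is married", "has pets"}, "c": set()}
links = {
    "a": ["news.example"],
    "b": ["news.example", "pets.example", "pets.example"],
    "c": ["news.example"],
}
profile = site_demographics(attrs, links)
```

Comparing a site's proportions against the proportions over all bloggers would be one simple way to spot an over-represented sub-audience; the study itself may use a different procedure.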

Cited By

  • (2017) User Modeling on Demographic Attributes in Big Mobile Social Networks. ACM Transactions on Information Systems 35:4 (1-33). DOI: 10.1145/3057278. Online publication date: 11-Jul-2017

Published In

WSDM '11: Proceedings of the fourth ACM international conference on Web search and data mining
February 2011
870 pages
ISBN:9781450304931
DOI:10.1145/1935826

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. audience discovery
  2. discovery of demographics

Qualifiers

  • Poster

Acceptance Rates

WSDM '11 Paper Acceptance Rate 83 of 372 submissions, 22%;
Overall Acceptance Rate 498 of 2,863 submissions, 17%
