ABSTRACT
The graph structure of the Web has long been a topic of wide interest. Two shapes, the Bow Tie and the Daisy, stand out in previous research. In this work, we take a different approach and view the Web as a hierarchy of three levels: page level, host level, and domain level. We analyze and compare the structures at these levels using a snapshot of the Chinese Web from early 2006, comprising 830 million pages, 17 million hosts, and 0.8 million domains. Some interesting results emerge. For example, at page level the Chinese Web looks more like a teapot (a large SCC, a medium-sized IN, and a small OUT) than the classic bow tie or daisy. Some puzzling phenomena are also observed. For example, the INs become much smaller than the OUTs at host and domain levels. Future work will tackle these puzzles.
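The SCC, IN, and OUT components named above can be illustrated on a toy graph. The following is a minimal sketch and not the method used in this work: the function names `reachable` and `bow_tie` and the sample edges are assumptions made for illustration, and the quadratic SCC search is only suitable for very small graphs (a linear-time algorithm such as Tarjan's would be needed at Web scale).

```python
from collections import defaultdict


def reachable(start, adj):
    """Return the set of nodes reachable from `start` by following `adj`."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def bow_tie(edges):
    """Split a directed graph into SCC, IN, OUT and OTHER node sets.

    Hypothetical helper for illustration only.
    """
    fwd = defaultdict(set)  # forward adjacency: src -> {dst}
    rev = defaultdict(set)  # reverse adjacency: dst -> {src}
    nodes = set()
    for src, dst in edges:
        fwd[src].add(dst)
        rev[dst].add(src)
        nodes.update((src, dst))

    # The SCC of a node is the intersection of what it can reach and
    # what can reach it. Keep the largest one as the core.
    core = set()
    assigned = set()
    for node in nodes:
        if node in assigned:
            continue
        scc = reachable(node, fwd) & reachable(node, rev)
        assigned |= scc
        if len(scc) > len(core):
            core = scc

    if not core:
        return {"SCC": set(), "IN": set(), "OUT": set(), "OTHER": set()}

    seed = next(iter(core))
    in_part = reachable(seed, rev) - core   # nodes that can reach the core
    out_part = reachable(seed, fwd) - core  # nodes reachable from the core
    other = nodes - core - in_part - out_part
    return {"SCC": core, "IN": in_part, "OUT": out_part, "OTHER": other}


# Toy graph: a 3-cycle core, one node linking in, one linked out,
# and a disconnected pair.
toy_edges = [
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("i", "a"),
    ("c", "o"),
    ("x", "y"),
]
parts = bow_tie(toy_edges)
```

In these terms, the teapot shape described in the abstract corresponds to a large `SCC` set, a medium-sized `IN` set, and a small `OUT` set. The same decomposition could be applied at host or domain level after merging pages that share a host or domain into a single node.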
- Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A., & Wiener, J. (2000). Graph structure in the web. Computer Networks, 33(1–6), 309–320.
- Donato, D., Leonardi, S., Millozzi, S., & Tsaparas, P. (2005). Mining the inner structure of the Web graph. In Eighth International Workshop on the Web and Databases (WebDB 2005), June 16–17, 2005, Baltimore, Maryland.
- Liu, G., Yu, H., Han, J., & Xue, G. (2005). China web graph measurements and evolution. In Y. Zhang et al. (Eds.), APWeb 2005, LNCS 3399, 668–679.