Abstract
The unabated growth and increasing significance of the World Wide Web have resulted in a flurry of research activity to improve its capacity for serving information more effectively. But at the heart of these efforts lie implicit assumptions about the "quality" and "usefulness" of Web resources and services. This observation points towards measurements and models that quantify various attributes of Web sites. Informetrics, the science of measuring all aspects of information, especially its storage and retrieval, interested information scientists for decades before the Web existed. Is Web informetrics any different, or is it just an application of classical informetrics to a new medium? In this article, we examine this issue by classifying and discussing a wide-ranging set of Web metrics. We present the origins, measurement functions, formulations, and comparisons of well-known Web metrics for quantifying Web graph properties, Web page significance, Web page similarity, search and retrieval, usage characterization, and information-theoretic properties. We also discuss how these metrics can be applied to improve Web information access and use.