
A survey of Web metrics

Published: 01 December 2002

Abstract

The unabated growth and increasing significance of the World Wide Web have resulted in a flurry of research activity to improve its capacity for serving information more effectively. At the heart of these efforts, however, lie implicit assumptions about the "quality" and "usefulness" of Web resources and services. This observation points towards measurements and models that quantify various attributes of Web sites. Informetrics, the science of measuring all aspects of information, especially its storage and retrieval, interested information scientists for decades before the Web existed. Is Web informetrics any different, or is it just an application of classical informetrics to a new medium? In this article, we examine this issue by classifying and discussing a wide-ranging set of Web metrics. We present the origins, measurement functions, formulations, and comparisons of well-known Web metrics for quantifying Web graph properties, Web page significance, Web page similarity, search and retrieval, usage characterization, and information-theoretic properties. We also discuss how these metrics can be applied to improve Web information access and use.

