DOI: 10.1145/2684822.2685314
research-article

Will This Paper Increase Your h-index?: Scientific Impact Prediction

Published: 02 February 2015

ABSTRACT

Scientific impact plays a central role in the evaluation of the output of scholars, departments, and institutions. A widely used measure of scientific impact is citations, with a growing body of literature focused on predicting the number of citations obtained by any given publication. The effectiveness of such predictions, however, is fundamentally limited by the power-law distribution of citations, whereby publications with few citations are extremely common and publications with many citations are relatively rare. Given this limitation, in this work we instead address a related question asked by many academic researchers in the course of writing a paper, namely: "Will this paper increase my h-index?" Using a real academic dataset with over 1.7 million authors, 2 million papers, and 8 million citation relationships from the premier online academic service ArnetMiner, we formalize a novel scientific impact prediction problem to examine several factors that can drive a paper to increase the primary author's h-index. We find that the researcher's authority on the publication topic and the venue in which the paper is published are crucial factors to the increase of the primary author's h-index, while the topic popularity and the co-authors' h-indices are of surprisingly little relevance. By leveraging relevant factors, we find a greater than 87.5% potential predictability for whether a paper will contribute to an author's h-index within five years. As a further experiment, we generate a self-prediction for this paper, estimating that there is a 76% probability that it will contribute to the h-index of the co-author with the highest current h-index in five years. We conclude that our findings on the quantification of scientific impact can help researchers to expand their influence and more effectively leverage their position of "standing on the shoulders of giants."
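
To make the prediction target concrete, the following is a minimal Python sketch of the h-index (Hirsch's definition: the largest h such that h of an author's papers each have at least h citations) and of one plausible labeling rule for "this paper contributes to the author's h-index." The labeling rule and the function names `h_index` and `contributes_to_h_index` are assumptions for illustration; the abstract does not spell out the exact formalization used in the paper.

```python
from typing import Sequence


def h_index(citations: Sequence[int]) -> int:
    """Return the largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def contributes_to_h_index(paper_citations: int, all_citations: Sequence[int]) -> bool:
    """Assumed labeling rule (not confirmed by the abstract).

    A paper is treated as contributing if its citation count reaches the
    author's h-index, where `all_citations` holds the citation counts of all
    of the author's papers (target paper included) at evaluation time.
    """
    h = h_index(all_citations)
    return h > 0 and paper_citations >= h


if __name__ == "__main__":
    counts = [10, 8, 5, 4, 3]  # hypothetical citation counts for one author
    print(h_index(counts))
    print(contributes_to_h_index(4, counts))
    print(contributes_to_h_index(3, counts))
```

Framing the target as a yes/no label, rather than a raw citation count, avoids predicting values along the heavy tail of the citation distribution, which is the limitation the abstract points to.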

Published in

WSDM '15: Proceedings of the Eighth ACM International Conference on Web Search and Data Mining
February 2015
482 pages
ISBN: 9781450333177
DOI: 10.1145/2684822

Copyright © 2015 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

WSDM '15 Paper Acceptance Rate: 39 of 238 submissions, 16%
Overall Acceptance Rate: 498 of 2,863 submissions, 17%
