DOI: 10.1145/1141753.1141765
Article

Bibliometric impact measures leveraging topic analysis

Published: 11 June 2006

Abstract

Measurements of the impact and history of research literature provide a useful complement to scientific digital library collections. Bibliometric indicators have been extensively studied, mostly in the context of journals. However, journal-based metrics poorly capture topical distinctions in fast-moving fields, and are increasingly problematic with the rise of open-access publishing. Recent developments in latent topic models have produced promising results for automatic sub-field discovery. The fine-grained, faceted topics produced by such models provide a clearer view of the topical divisions of a body of research literature and the interactions between those divisions. We demonstrate the usefulness of topic models in measuring impact by applying a new phrase-based topic discovery model to a collection of 300,000 Computer Science publications, collected by the Rexa automatic citation indexing system.



Published In

JCDL '06: Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries
June 2006
402 pages
ISBN: 1595933549
DOI: 10.1145/1141753
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

JCDL06: Joint Conference on Digital Libraries 2006
June 11 - 15, 2006
Chapel Hill, NC, USA

Acceptance Rates

Overall Acceptance Rate 415 of 1,482 submissions, 28%

Article Metrics

  • Downloads (Last 12 months): 48
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 07 Mar 2025

Cited By

  • (2025) Topic modelling through the bibliometrics lens and its technique. Artificial Intelligence Review, 58:3. DOI: 10.1007/s10462-024-11011-x. Online publication date: 6-Jan-2025
  • (2024) Identification of Emerging Technological Hotspots from a Multi-Source Information Perspective: Case Study on Blockchain Financial Technology. Information, 15:9 (581). DOI: 10.3390/info15090581. Online publication date: 19-Sep-2024
  • (2024) Analysis of Hot Topics and Evolution of Research in World-class Agricultural Universities Based on BERTopic. Applied Mathematics and Nonlinear Sciences, 9:1. DOI: 10.2478/amns-2024-0327. Online publication date: 26-Feb-2024
  • (2024) W stronę nowej metodologii analizy treści. Podobieństwa i różnice pomiędzy modelowaniem tematycznym i jakościową analizą treści [Toward a New Methodology for Content Analysis: Similarities and Differences Between Topic Modeling and Qualitative Content Analysis]. Przegląd Socjologii Jakościowej, 20:4 (118-143). DOI: 10.18778/1733-8069.20.4.06. Online publication date: 30-Nov-2024
  • (2024) COVID-19 and teachers’ digital competencies: a comprehensive bibliometric and topic modeling analysis. Humanities and Social Sciences Communications, 11:1. DOI: 10.1057/s41599-024-04335-0. Online publication date: 27-Dec-2024
  • (2024) Bibliometric Analysis of European Integration and Globalization: New Challenges in the Energy Sector. In Europe in the New World Economy: Opportunities and Challenges (557-573). DOI: 10.1007/978-3-031-71329-3_34. Online publication date: 26-Nov-2024
  • (2023) Text Sentiment Classification Based on BERT Embedding and Sliced Multi-Head Self-Attention Bi-GRU. Sensors, 23:3 (1481). DOI: 10.3390/s23031481. Online publication date: 28-Jan-2023
  • (2023) Analysis of research dynamics in sport management using topic modelling. Managing Sport and Leisure (1-22). DOI: 10.1080/23750472.2023.2200449. Online publication date: 15-Apr-2023
  • (2023) Freight last mile delivery: a literature review. Transportation Planning and Technology, 47:3 (323-369). DOI: 10.1080/03081060.2023.2268601. Online publication date: 11-Oct-2023
  • (2022) Transportation data visualization with a focus on freight: a literature review. Transportation Planning and Technology, 45:4 (358-401). DOI: 10.1080/03081060.2022.2111430. Online publication date: 11-Aug-2022
