Learning from the past: answering new questions with past answers

Authors:

Idan Szpektor

WWW '12: Proceedings of the 21st international conference on World Wide Web

Pages 759 - 768

https://doi.org/10.1145/2187836.2187939

Published: 16 April 2012

Abstract

Community-based Question Answering sites, such as Yahoo! Answers or Baidu Zhidao, allow users to get answers to complex, detailed and personal questions from other users. However, since answering a question depends on the ability and willingness of users to address the asker's needs, a significant fraction of the questions remain unanswered. We measured that in Yahoo! Answers, this fraction represents 15% of all incoming English questions. At the same time, we discovered that around 25% of questions in certain categories are recurrent, at least at the question-title level, over a period of one year.

We attempt to reduce the rate of unanswered questions in Yahoo! Answers by reusing the large repository of past resolved questions, openly available on the site. More specifically, we estimate the probability that a given new question can be satisfactorily answered by a best answer from the past, using a statistical model trained specifically for this task. We leverage concepts and methods from query-performance prediction and natural language processing to extract a wide range of features for our model. The key challenge is to achieve a level of quality similar to that provided by the best human answerers.
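The reuse idea described above can be illustrated with a minimal sketch. It is not the authors' model: the abstract describes a trained statistical model over many features, whereas this sketch substitutes plain bag-of-words cosine similarity between question titles as a stand-in confidence score. The function names, the archive layout, and the threshold value of 0.8 are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def propose_past_answer(new_title, archive, threshold=0.8):
    """Return (answer, score) for the most similar past question.

    archive is a list of (past_title, best_answer) pairs (hypothetical layout).
    The answer is None when the best score is below the threshold, i.e. the
    question is left for human answerers.
    """
    new_vec = Counter(tokenize(new_title))
    best_score, best_answer = 0.0, None
    for past_title, answer in archive:
        score = cosine(new_vec, Counter(tokenize(past_title)))
        if score > best_score:
            best_score, best_answer = score, answer
    if best_score >= threshold:
        return best_answer, best_score
    return None, best_score
```

Raising the threshold in such a scheme trades fewer automatic answers for a lower risk of serving an irrelevant one; a trained model, as in the paper, would replace the similarity score with an estimated probability of asker satisfaction.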

We evaluated our algorithm on offline data extracted from Yahoo! Answers and, more interestingly, on online data, using three "live" answering robots that automatically provide past answers to new questions when a certain degree of confidence is reached. We report the success rate of these robots in three active Yahoo! Answers categories in terms of accuracy, coverage, and askers' satisfaction. To the best of our knowledge, this work is a first attempt at automatically answering questions of a social nature by reusing high-quality past answers.
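As a rough illustration of two of the reported measures, the sketch below computes accuracy and coverage for an answering robot. The definitions are assumptions, since the abstract does not spell them out: coverage is taken as the fraction of incoming questions the robot chose to answer, and accuracy as the fraction of those answers judged satisfactory. The function name and input layout are hypothetical.

```python
def robot_metrics(outcomes):
    """Compute (accuracy, coverage) for an answering robot.

    outcomes is a list of (answered, satisfactory) boolean pairs, one per
    incoming question. Assumed definitions, not taken from the paper:
      coverage = answered questions / all questions
      accuracy = satisfactory answers / answered questions
    """
    total = len(outcomes)
    judged = [satisfactory for answered, satisfactory in outcomes if answered]
    coverage = len(judged) / total if total else 0.0
    accuracy = sum(judged) / len(judged) if judged else 0.0
    return accuracy, coverage
```

Under these definitions the two measures pull against each other: a robot that answers only when very confident tends to gain accuracy at the cost of coverage.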

Index Terms

Learning from the past: answering new questions with past answers
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Question answering

Published In

WWW '12: Proceedings of the 21st international conference on World Wide Web

April 2012

1078 pages

ISBN:9781450312295

DOI:10.1145/2187836

General Chairs: Alain Mille (Université de Lyon, France), Fabien Gandon (INRIA, France), Jacques Misselis (HP, France)

Program Chairs: Michael Rabinovich (Case Western Reserve University, USA), Steffen Staab (University of Koblenz-Landau, Germany)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Univ. de Lyon: Universite de Lyon

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

WWW 2012

Sponsor:

Univ. de Lyon

WWW 2012: 21st World Wide Web Conference 2012

April 16 - 20, 2012

Lyon, France

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
