Learning random walk models for inducing word dependency distributions
Source ACM International Conference Proceeding Series; Vol. 69 archive
Proceedings of the twenty-first international conference on Machine learning table of contents
Banff, Alberta, Canada
Page: 103  
Year of Publication: 2004
ISBN:1-58113-828-5
Authors
Kristina Toutanova  Stanford University, Stanford, CA
Christopher D. Manning  Stanford University, Stanford, CA
Andrew Y. Ng  Stanford University, Stanford, CA
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 47,   Citation Count: 12
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1015330.1015442

ABSTRACT

Many NLP tasks rely on accurately estimating word dependency probabilities P(w1|w2), where the words w1 and w2 have a particular relationship (such as verb-object). Because counts of such dependencies are sparse, smoothing and the ability to use multiple sources of knowledge are important challenges. For example, if the probability P(N|V) of noun N being the subject of verb V is high, and V takes similar objects to V', and V' is synonymous with V", then we want to conclude that P(N|V") should also be reasonably high, even when those words did not co-occur in the training data.

To capture these higher-order relationships, we propose a Markov chain model whose stationary distribution is used to give word probability estimates. Unlike the manually defined random walks used in some link analysis algorithms, we show how to automatically learn a rich set of parameters for the Markov chain's transition probabilities. We apply this model to the task of prepositional phrase attachment, obtaining an accuracy of 87.54%.
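The idea of reading word probabilities off a Markov chain's stationary distribution can be illustrated with a toy sketch. This is not the model from the paper: the four-word vocabulary, the two link types, the hand-set mixing weights, and the restart step (added here so the chain has a unique stationary distribution tied to one verb) are all assumptions made for illustration. The paper learns its transition parameters; this sketch fixes them by hand.

```python
# Toy illustration only: words, links, weights and the restart trick are
# hypothetical and are NOT taken from the paper.
WORDS = ["devour", "eat", "pizza", "idea"]  # two verbs, two nouns

# Link type 1 (assumed): synonymy between the verbs; nouns self-loop.
T_SYN = [[0, 1, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]

# Link type 2 (assumed): observed verb-to-noun counts, normalized.
# "devour" has no observed nouns, so it self-loops; nouns self-loop.
T_EMP = [[1, 0, 0.0, 0.0],
         [0, 0, 0.9, 0.1],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]


def mix(matrices, weights):
    """Convex combination of row-stochastic matrices."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights))
             for j in range(n)] for i in range(n)]


def with_restart(T, start, restart=0.2):
    """From every state, jump back to `start` with probability `restart`."""
    n = len(T)
    return [[(1 - restart) * T[i][j] + (restart if j == start else 0.0)
             for j in range(n)] for i in range(n)]


def stationary(P, iters=10000, tol=1e-12):
    """Power iteration: repeat pi <- pi P until the change is below tol."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hand-set weights stand in for the parameters the paper learns.
chain = with_restart(mix([T_SYN, T_EMP], [0.5, 0.5]), WORDS.index("devour"))
pi = stationary(chain)

# Renormalize the stationary mass over noun states to get a noun distribution.
nouns = ["pizza", "idea"]
mass = sum(pi[WORDS.index(n)] for n in nouns)
noun_probs = {n: pi[WORDS.index(n)] / mass for n in nouns}
print(noun_probs)
```

Although "devour" never co-occurs with "pizza" in the toy counts, the walk reaches "pizza" through the synonym link to "eat", so "pizza" receives most of the noun mass. In this sketch the choice of mixing weights and restart probability changes how much mass reaches the nouns but not their ratio, since both nouns are reached only through "eat".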

