High-quality speech-to-speech translation for computer-aided language learning
Source: ACM Transactions on Speech and Language Processing (TSLP)
Volume 3, Issue 2 (July 2006)
Pages: 1 - 21  
Year of Publication: 2006
ISSN:1550-4875
Authors
Chao Wang, MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA
Stephanie Seneff, MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1149290.1149291

ABSTRACT

This article describes our research on spoken language translation for computer-aided second language acquisition. The translation framework is incorporated into a multilingual dialogue system in which a student can engage in natural spoken interaction with the system in the foreign language and, at any time, speak a query in their native tongue to obtain a spoken translation for language assistance. The translation quality must therefore be extremely high, although the restricted domain makes this goal more attainable. Experiments were conducted in the weather information domain with the scenario of a native English speaker learning Mandarin Chinese. We were able to utilize a large corpus of English weather-domain queries to explore and compare a variety of translation strategies: formal, example-based, and statistical. Translation quality was manually evaluated on a test set of 695 spontaneous utterances. The best speech translation performance (89.9% correct, 6.1% incorrect, and 4.0% rejected) was achieved by a system that combines the formal and example-based methods, using parsability by a domain-specific Chinese grammar as a rejection criterion.
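
The combination described above, several translation methods with target-grammar parsability as the rejection test, can be illustrated with a minimal sketch. The abstract does not say how the formal and example-based methods are ordered or combined, so the fallback order below is an assumption. Every name here (translate_with_rejection, toy_formal, toy_example_based, toy_is_parsable) is hypothetical, and the toy functions are stand-ins rather than the authors' components.

```python
from typing import Callable, Optional, Sequence

# A translator maps an English query to a Chinese candidate, or None if it has no output.
Translator = Callable[[str], Optional[str]]


def translate_with_rejection(
    query: str,
    translators: Sequence[Translator],
    is_parsable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate the target-language check accepts.

    Returns None when no candidate passes, i.e. the utterance is rejected
    rather than answered with a possibly wrong translation.
    """
    for translate in translators:
        candidate = translate(query)
        if candidate is not None and is_parsable(candidate):
            return candidate
    return None


# Toy stand-ins (assumptions for illustration only).
def toy_formal(query: str) -> Optional[str]:
    table = {
        "will it rain tomorrow": "明天 会 下雨 吗",
        "is it hot in boston": "波士顿 热",  # deliberately malformed output
    }
    return table.get(query)


def toy_example_based(query: str) -> Optional[str]:
    examples = {"is it hot in boston": "波士顿 热 吗"}
    return examples.get(query)


def toy_is_parsable(sentence: str) -> bool:
    # Stand-in for parsing with a domain-specific Chinese grammar:
    # here a "parse" succeeds only if the question particle is present.
    return sentence.endswith("吗")


if __name__ == "__main__":
    methods = [toy_formal, toy_example_based]
    for q in ["will it rain tomorrow", "is it hot in boston", "what time is it"]:
        print(q, "->", translate_with_rejection(q, methods, toy_is_parsable))
```

In this sketch a rejected query returns None, which mirrors the trade-off reported in the abstract: some utterances go unanswered so that fewer incorrect translations reach the student.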

