Two-Stage Hypotheses Generation for Spoken Language Translation

Published: 01 March 2009

Abstract

Spoken Language Translation (SLT) is the research area that focuses on the translation of speech or text between two spoken languages. Phrase-based and syntax-based methods represent the state of the art in statistical machine translation (SMT). The phrase-based method specializes in modeling local reorderings and translations of multiword expressions. The syntax-based method draws on syntactic knowledge, which lets it better model long-distance word reorderings, discontinuous phrases, and syntactic structure. In this article, we leverage the strengths of these two methods and propose a strategy based on multiple hypotheses generation in a two-stage framework for spoken language translation. The hypotheses are generated in two stages, namely, decoding and regeneration. In the decoding stage, we apply state-of-the-art phrase-based and syntax-based methods to generate basic translation hypotheses. Then, in the regeneration stage, many more hypotheses that cannot be captured by the decoding algorithms are produced from the basic hypotheses. In this second stage we study three regeneration methods: redecoding, n-gram expansion, and confusion networks. Finally, an additional reranking pass selects the translation outputs using a linear combination of rescoring models. Experimental results on the Chinese-to-English IWSLT-2006 challenge task of translating transcriptions of spontaneous speech show that the proposed mechanism improves significantly on the baseline, by about 2.80 BLEU points.
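The final reranking pass described in the abstract can be illustrated with a minimal sketch. The abstract does not name the rescoring models or their weights, so the two toy features, the weights, and the function names below are all hypothetical; only the overall shape (pool the decoded and regenerated hypotheses, score each with a weighted sum of models, sort) follows the text.

```python
from typing import Callable, List, Sequence

# A rescoring model maps a hypothesis to a score (higher is better).
# Both models below are toy stand-ins, NOT the models used in the article.
Scorer = Callable[[str], float]


def length_score(hyp: str) -> float:
    """Toy feature: number of tokens in the hypothesis."""
    return float(len(hyp.split()))


def repetition_penalty(hyp: str) -> float:
    """Toy feature: minus the count of immediately repeated tokens."""
    tokens = hyp.split()
    return -float(sum(a == b for a, b in zip(tokens, tokens[1:])))


def rerank(hypotheses: Sequence[str],
           scorers: Sequence[Scorer],
           weights: Sequence[float]) -> List[str]:
    """Sort hypotheses by a linear combination of rescoring models."""
    if len(scorers) != len(weights):
        raise ValueError("need exactly one weight per rescoring model")

    def combined(hyp: str) -> float:
        return sum(w * f(hyp) for f, w in zip(scorers, weights))

    # Drop duplicates that appear in both stages, keeping first-seen order.
    unique = list(dict.fromkeys(hypotheses))
    return sorted(unique, key=combined, reverse=True)


# Stage 1 (decoding) and stage 2 (regeneration) outputs, invented for illustration.
decoded = ["i want to to go", "i want go"]
regenerated = ["i want to go", "i want to to go"]

ranked = rerank(decoded + regenerated,
                [length_score, repetition_penalty],
                [0.5, 2.0])
print(ranked[0])
```

In a real system the weights would be tuned on held-out data rather than set by hand, and the scorers would be trained models rather than these surface heuristics.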


    • Published in

      ACM Transactions on Asian Language Information Processing, Volume 8, Issue 1
      March 2009
      75 pages
      ISSN: 1530-0226
      EISSN: 1558-3430
      DOI: 10.1145/1482343

      Copyright © 2009 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 March 2009
      • Accepted: 1 November 2008
      • Revised: 1 August 2008
      • Received: 1 February 2008

      Qualifiers

      • research-article
      • Research
      • Refereed
