research-article

Robust speech recognition based on binaural speech enhancement system as a preprocessing step

Authors:
Cuong Nguyen Quoc, Dung Tran Tien, Khoa Nguyen Dang, Binh Nguyen Huu

Hanoi University of Science and Technology, Hanoi, Vietnam

SoICT '12: Proceedings of the 3rd Symposium on Information and Communication Technology, August 2012, Pages 91–96
https://doi.org/10.1145/2350716.2350732

Published: 23 August 2012

ABSTRACT

In this paper, we present a robust speech recognition system that uses a binaural speech enhancement system as a preprocessing step. The enhancement system applies an existing dereverberation technique, followed by a spatial masking-based noise removal algorithm that retains only signals arriving from the desired directions, as determined by a threshold angle. While state-of-the-art approaches fix the threshold angle heuristically over all time frames, we propose an adaptive computation in which the threshold angle is first learned from several noise-only frames and then updated frame by frame. Speech recognition results in a real environment show the effectiveness of the proposed speech enhancement approach.
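
The abstract does not state how the threshold angle is learned or updated, so the following Python sketch is only one possible reading and should not be taken as the authors' algorithm. It assumes that a per-bin direction estimate (absolute angle in radians, shape frequency by frame) is already available, that the first few frames are noise-only, that a two-cluster 1-D k-means split of the angles gives a per-frame threshold, and that the running threshold is updated by exponential smoothing. The names `two_means_threshold`, `adaptive_spatial_mask`, `n_noise_frames`, and `alpha` are hypothetical.

```python
import numpy as np


def two_means_threshold(values, n_iter=20):
    """Split 1-D values into two clusters (k-means with k=2) and
    return the midpoint between the two centroids."""
    values = np.asarray(values, dtype=float).ravel()
    lo, hi = values.min(), values.max()  # initial centroids
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        near = values[values <= mid]  # cluster closer to the low centroid
        far = values[values > mid]    # cluster closer to the high centroid
        if near.size == 0 or far.size == 0:
            break  # degenerate split, keep the current centroids
        lo, hi = near.mean(), far.mean()
    return 0.5 * (lo + hi)


def adaptive_spatial_mask(abs_angles, n_noise_frames=5, alpha=0.9):
    """Binary mask that keeps time-frequency bins whose estimated
    direction lies within an adaptive threshold angle.

    abs_angles: array of shape (n_freqs, n_frames), radians.
    Returns (mask, thresholds), one threshold per frame.
    """
    abs_angles = np.asarray(abs_angles, dtype=float)
    n_frames = abs_angles.shape[1]
    # Assumption: the leading frames contain noise only and are used
    # to initialise the threshold.
    threshold = two_means_threshold(abs_angles[:, :n_noise_frames])
    mask = np.zeros(abs_angles.shape, dtype=bool)
    thresholds = np.empty(n_frames)
    for t in range(n_frames):
        if t >= n_noise_frames:
            # Assumption: smooth the running threshold toward the
            # per-frame estimate.
            frame_thr = two_means_threshold(abs_angles[:, t])
            threshold = alpha * threshold + (1.0 - alpha) * frame_thr
        thresholds[t] = threshold
        mask[:, t] = abs_angles[:, t] <= threshold
    return mask, thresholds
```

Multiplying the returned mask with the dereverberated spectrogram would retain only the bins judged to come from the desired direction. A fixed-threshold baseline corresponds to skipping the per-frame update, that is, setting `alpha` to 1.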

Published in
SoICT '12: Proceedings of the 3rd Symposium on Information and Communication Technology
August 2012, 290 pages
ISBN: 9781450312325
DOI: 10.1145/2350716
Conference Chair: Giang Nguyen Trong (HUST, Vietnam)
General Chairs: Ladislave Hluchy (Slovak Academy of Sciences, Slovakia), Thang Huynh Quyet (HUST, Vietnam)
Program Chairs: Eric Castelli (MICA, France-Vietnam), Khanh Tran Duc (HUST, Vietnam), Mai Luong Chi (IoIT, VAST, Vietnam), Viet Tran (Slovak Academy of Sciences, Slovakia)
Copyright © 2012 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
dereverberation
k-mean
robust speech recognition
spatial masking
speech enhancement
Acceptance Rates
Overall Acceptance Rate: 147 of 318 submissions, 46%