short-paper

Speaker Clustering Based on Non-Negative Matrix Factorization Using Gaussian Mixture Model in Complementary Subspace

Authors:
Masafumi Nishida

Department of Informatics, Shizuoka University, Shizuoka, Japan

Seiichi Yamamoto

Department of Information and Computer Science, Doshisha University, Kyoto, Japan


CBMI '17: Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing, June 2017, Article No.: 7, Pages 1–5
https://doi.org/10.1145/3095713.3095721

Published: 19 June 2017

ABSTRACT

Speech feature variations are mainly attributed to variations in the phonetic and speaker information contained in speech data. If these two types of information can be separated from each other, more robust speaker clustering can be achieved. A principal component analysis transformation can separate speaker information from phonetic information, under the assumption that a space with large within-speaker variance is a "phonetic subspace" and a space with small within-speaker variance is a "speaker subspace". We propose a speaker clustering method based on non-negative matrix factorization using a Gaussian mixture model trained in the speaker subspace. We compared the proposed method with conventional methods based on the Bayesian information criterion and on a Gaussian mixture model in the observation space. The experimental results showed that the proposed method achieves higher clustering accuracy than the conventional methods.
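
The subspace split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each frame carries a hypothetical segment or speaker label, pools the within-label covariance, and treats the top `n_phonetic` principal axes (an assumed hyperparameter) as the phonetic subspace and the remaining axes as the complementary speaker subspace. The function names `split_subspaces` and `project` are illustrative only, and the later GMM and non-negative matrix factorization stages are not shown.

```python
import numpy as np


def split_subspaces(frames, labels, n_phonetic):
    """Split the feature space into phonetic and complementary speaker subspaces.

    frames:     (n_frames, dim) array of feature vectors.
    labels:     (n_frames,) array of segment/speaker ids (assumed to be available).
    n_phonetic: number of leading within-label principal axes treated as
                phonetic (assumed hyperparameter, not taken from the paper).
    """
    frames = np.asarray(frames, dtype=float)
    labels = np.asarray(labels)

    # Remove each label's mean so only within-label variation remains.
    centered = np.empty_like(frames)
    for lab in np.unique(labels):
        mask = labels == lab
        centered[mask] = frames[mask] - frames[mask].mean(axis=0)

    # Pooled within-label covariance and its eigen-decomposition (PCA).
    cov = centered.T @ centered / len(frames)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    eigvecs = eigvecs[:, order]

    phonetic_basis = eigvecs[:, :n_phonetic]  # large within-label variance
    speaker_basis = eigvecs[:, n_phonetic:]   # small within-label variance
    return phonetic_basis, speaker_basis


def project(frames, basis):
    """Project frames onto the columns of an orthonormal basis."""
    return np.asarray(frames, dtype=float) @ basis


# Synthetic usage: axis 0 varies strongly within each speaker (phonetic-like),
# axis 2 carries a speaker-dependent offset with little within-speaker spread.
rng = np.random.default_rng(0)
demo_frames, demo_labels = [], []
for speaker_id, offset in enumerate([-2.0, 2.0]):
    block = np.column_stack([
        rng.normal(0.0, 5.0, 200),
        rng.normal(0.0, 0.3, 200),
        rng.normal(offset, 0.3, 200),
    ])
    demo_frames.append(block)
    demo_labels.append(np.full(200, speaker_id))
demo_frames = np.vstack(demo_frames)
demo_labels = np.concatenate(demo_labels)

phonetic_basis, speaker_basis = split_subspaces(demo_frames, demo_labels, n_phonetic=1)
speaker_features = project(demo_frames, speaker_basis)
```

In this toy setting the projected `speaker_features` discard the high-variance phonetic-like axis, so a model trained on them would see mostly the speaker-dependent offset. How the paper chooses the subspace dimension, and how it obtains labels in an unsupervised clustering setting, is not stated in the abstract.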

Published in

CBMI '17: Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing
June 2017
237 pages
ISBN: 9781450353335
DOI: 10.1145/3095713

Copyright © 2017 ACM
Publisher
Association for Computing Machinery
New York, NY, United States