Phone clustering using the Bhattacharyya distance

Brian Mak*, Etienne Barnard

*Corresponding author for this work

Research output: Contribution to conference › Conference Paper › peer-review

94 Citations (Scopus)

Abstract

In this paper, we study the use of the classification-based Bhattacharyya distance measure to guide biphone clustering. The Bhattacharyya distance is a theoretical distance measure between two Gaussian distributions which is equivalent to an upper bound on the optimal Bayesian classification error probability. It also has the desirable properties of being computationally simple and extensible to more Gaussian mixtures. Using the Bhattacharyya distance measure in a data-driven approach together with a novel 2-Level Agglomerative Hierarchical Biphone Clustering algorithm, generalized left/right biphones (BGBs) are derived. A neural-net based phone recognizer trained on the BGBs is found to have better frame-level phone recognition than one trained on generalized biphones (BCGBs) derived from a set of commonly used broad categories. We further evaluate the new BGBs on an isolated-word recognition task of perplexity 40 and obtain a 16.2% error reduction over the BCGBs and a 41.8% error reduction over the monophones.
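As an illustration of the distance measure the abstract refers to, the following is a minimal sketch of the standard closed-form Bhattacharyya distance between two Gaussians, together with the associated upper bound on the two-class Bayes error. It assumes diagonal covariance matrices for simplicity; that restriction, the function names, and the parameter names are illustrative choices and are not taken from the paper. The sketch does not reproduce the paper's 2-Level Agglomerative Hierarchical Biphone Clustering algorithm.

```python
import math


def bhattacharyya_distance(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    mean1, mean2: per-dimension means.
    var1, var2:   per-dimension variances (must be positive).
    The diagonal-covariance restriction is an assumption of this sketch.
    """
    if not (len(mean1) == len(var1) == len(mean2) == len(var2)):
        raise ValueError("all inputs must have the same dimension")

    mean_term = 0.0  # separation of the means, scaled by the averaged variance
    cov_term = 0.0   # mismatch between the two covariances
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        avg = 0.5 * (v1 + v2)
        mean_term += (m1 - m2) ** 2 / avg
        cov_term += math.log(avg / math.sqrt(v1 * v2))

    return 0.125 * mean_term + 0.5 * cov_term


def bayes_error_bound(distance, prior1=0.5, prior2=0.5):
    """Upper bound on the two-class Bayes error: sqrt(P1 * P2) * exp(-D_B)."""
    return math.sqrt(prior1 * prior2) * math.exp(-distance)


if __name__ == "__main__":
    # Two hypothetical 2-dimensional phone models (values are made up).
    d = bhattacharyya_distance([0.0, 1.0], [1.0, 2.0], [2.0, 1.5], [1.5, 1.0])
    print("distance:", d)
    print("error bound:", bayes_error_bound(d))
```

Identical distributions give a distance of zero, for which the bound with equal priors is 0.5; the bound tightens as the distance grows. In a data-driven clustering setting, a small distance between two models suggests they are hard to tell apart and are therefore natural candidates for merging.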

Original language: English
Pages: 2005-2008
Number of pages: 4
Publication status: Published - 1996
Externally published: Yes
Event: Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4) - Philadelphia, PA, USA
Duration: 3 Oct 1996 to 6 Oct 1996

Conference

Conference: Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4)
City: Philadelphia, PA, USA
Period: 3/10/96 to 6/10/96
