Speech confusion index (Phi): A confusion-based speech quality indicator and recognition rate prediction for dysarthria
Metadata
Document Title
Speech confusion index (Phi): A confusion-based speech quality indicator and recognition rate prediction for dysarthria
Author
Kayasith P, Theeramunkong T
Affiliations
Thammasat University; National Science & Technology Development Agency - Thailand; National Electronics & Computer Technology Center (NECTEC)
Type
Article
Source Title
COMPUTERS & MATHEMATICS WITH APPLICATIONS
ISSN
0898-1221
Year
2009
Volume
58
Issue
7
Open Access
Bronze
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI
10.1016/j.camwa.2009.06.051
Format
Abstract
This paper presents an automated method for assessing the speech quality of a dysarthric speaker, in place of laborious and subjective manual methods. The assessment result can be used as an indicator for predicting the accuracy of speech recognition. The proposed speech confusion index (Phi) measures the severity of a speaker's speech disorder in terms of how easily his/her speech signal may be misrecognized as other, unintended words. Based on signal processing without any high-level information, a dynamic-time-warping technique incorporating an adaptive slope constraint and an accumulative mismatch score is used to measure the distance between any two speech signals of the same word or of two different words. Compared with the articulatory and intelligibility tests, the proposed indicator was shown to better predict the recognition rates obtained from the Hidden Markov Model (HMM) and Artificial Neural Networks (ANN). Based on three evaluation criteria, namely root-mean-square difference, correlation coefficient and rank-order inconsistency, the experimental results on a phoneme-balanced set showed that Phi achieved better prediction than both the articulatory and intelligibility tests. Another experiment on a reduced training set investigates the robustness of the proposed indicator. Finally, a detailed analysis of speech confusion is carried out at the phoneme level. (C) 2009 Elsevier Ltd. All rights reserved.
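For readers unfamiliar with the distance measure named in the abstract, the following is a minimal sketch of plain dynamic time warping between two sequences of feature vectors. It is an illustration only, not the authors' method: the adaptive slope constraint and accumulative mismatch score are not specified in this record and are omitted, and the function name `dtw_distance`, the Euclidean frame distance, and the toy one-dimensional frames are all assumptions.

```python
import math


def dtw_distance(a, b):
    """Plain DTW distance between two sequences of equal-dimension feature vectors.

    Assumption: Euclidean frame distance and the basic three-way step pattern;
    no slope constraint or mismatch score is applied.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] holds the minimal accumulated distance aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_dist = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = frame_dist + min(
                cost[i - 1][j],      # a advances, b repeats its frame
                cost[i][j - 1],      # b advances, a repeats its frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Toy example: the second sequence is a time-stretched copy of the first,
# so warping aligns them with zero accumulated distance.
reference = [[0.0], [1.0], [2.0]]
stretched = [[0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(reference, stretched))  # 0.0
```

In a setting like the one the abstract describes, each inner list would be an acoustic feature frame of an utterance, and a small distance between utterances of different words would suggest that those words are easily confused.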
Funding Sponsor
Royal Golden Jubilee Ph.D. Program; Thailand Research Fund (TRF) [PHD/0267/2545]
Publication Source
WOS