TY - GEN
T1 - Enhanced output-based perceptual measure for predicting subjective quality of speech
AU - Mahdi, A. E.
AU - Picovici, D.
PY - 2005
Y1 - 2005
AB - This paper presents an enhanced version of a non-intrusive measure for assessing the speech quality of voice communication systems and evaluates its performance. The new measure, which uses only the output of the system, is based on measuring perception-based objective auditory distances between voiced parts of the output (processed) speech whose quality is to be evaluated and appropriately matching references extracted from one of four pre-formulated codebooks, depending on their estimated pitch values. The codebooks are formed by optimally clustering a large number of parametric speech vectors extracted from a database of clean speech records. The measured auditory distances are then mapped into equivalent subjective Mean Opinion Scores (MOS). The required clustering and matching processes are performed using an efficient data-mining tool known as the Self-Organizing Map (SOM). Short-time Bark spectrum analysis is used to obtain a perception-based, speaker-independent parametric representation of the speech. The reported evaluation results show that the proposed enhanced speech quality assessment method provides quality scores that are highly correlated with MOS obtained from formal subjective listening tests.
UR - http://www.scopus.com/inward/record.url?scp=84863709173&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84863709173
SN - 1604238216
SN - 9781604238211
T3 - 13th European Signal Processing Conference, EUSIPCO 2005
SP - 700
EP - 703
BT - 13th European Signal Processing Conference, EUSIPCO 2005
T2 - 13th European Signal Processing Conference, EUSIPCO 2005
Y2 - 4 September 2005 through 8 September 2005
ER -