Abstract
This paper proposes a new output-based system for predicting subjective speech quality and evaluates its performance. The system computes objective distance measures, such as the median minimum distance, between perceptually based parameter vectors representing the voiced parts of the speech signal and appropriately matched reference vectors drawn from a pre-formulated codebook. The distance measures are then mapped into equivalent Mean Opinion Scores (MOS) using regression. The codebook is formed by optimally clustering a large number of speech parameter vectors extracted from a database of undistorted source speech. The required clustering and matching are performed with an efficient data mining technique known as the Self-Organising Map. The perceptually based speech parameters are derived using Perceptual Linear Prediction (PLP) and Bark spectrum analyses. The reported evaluation results show that the proposed system is robust against speaker, utterance and distortion variations.
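The distance-and-regression stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`median_min_distance`, `fit_mos_mapping`, `predict_mos`) are hypothetical, the Euclidean metric and first-order polynomial regression are assumptions, and the codebook is taken as given rather than trained with a Self-Organising Map. PLP or Bark spectrum feature extraction is also out of scope here, so the feature vectors are treated as plain arrays.

```python
import numpy as np


def median_min_distance(features, codebook):
    """Median, over frames, of each frame's distance to its nearest codeword.

    features: array of shape (n_frames, n_dims), one vector per voiced frame.
    codebook: array of shape (n_codewords, n_dims), reference vectors.
    Euclidean distance is an assumption made for this sketch.
    """
    features = np.asarray(features, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Pairwise differences, shape (n_frames, n_codewords, n_dims).
    diffs = features[:, None, :] - codebook[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    # Nearest-codeword distance per frame, then the median across frames.
    return float(np.median(dists.min(axis=1)))


def fit_mos_mapping(distances, mos_scores, degree=1):
    """Least-squares polynomial mapping from distance to MOS (assumed form)."""
    return np.polyfit(distances, mos_scores, degree)


def predict_mos(distance, coeffs):
    """Apply the fitted mapping and clip to the 1-5 MOS scale (assumed step)."""
    return float(np.clip(np.polyval(coeffs, distance), 1.0, 5.0))


if __name__ == "__main__":
    # Synthetic stand-ins for a codebook and two utterances (not real data).
    rng = np.random.default_rng(0)
    codebook = rng.normal(size=(16, 5))
    clean = codebook[rng.integers(0, 16, size=50)] + 0.05 * rng.normal(size=(50, 5))
    noisy = clean + 0.8 * rng.normal(size=(50, 5))
    print(median_min_distance(clean, codebook))
    print(median_min_distance(noisy, codebook))
```

In this sketch a larger median minimum distance means the utterance lies further from the undistorted reference space, so a fitted mapping would normally assign it a lower predicted MOS.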
Original language | English |
---|---|
Pages (from-to) | V-633-V-636 |
Journal | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
Volume | 5 |
Publication status | Published - 2004 |
Event | Proceedings - IEEE International Conference on Acoustics, Speech, and Signal Processing - Montreal, Que, Canada; Duration: 17 May 2004 → 21 May 2004 |