A Wavelet-Based Front end for Robust ASR-Edición Única
Abstract
In this work we present the results of noisy speech recognition experiments that use wavelet analysis schemes. We show that, for signals corrupted by white noise, the wavelet-derived parameters outperform the Mel-Frequency Cepstral Coefficients (MFCC). The main difference between the wavelet-derived coefficients and the traditional MFCC lies in the computation of the spectrum: the proposed parameters apply a wavelet packet transform instead of a discrete Fourier transform.
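As a rough illustration of this idea, the sketch below replaces the DFT stage of an MFCC-style pipeline with a wavelet packet decomposition and then takes the log of the subband energies followed by a DCT. It is only a minimal sketch under stated assumptions, not the front end evaluated in this work: it uses the 4-tap Daubechies filter (whose closed form is short enough to write out) as a stand-in for the longer filters, periodic signal extension, and hypothetical names and parameters (`analyze`, `wavelet_packet_energies`, `cepstral_features`, `depth`, `n_coeffs`). The log and DCT stages are assumed by analogy with MFCC.

```python
import math

# Assumption: 4-tap Daubechies low-pass filter (closed form), used here as a
# short stand-in for the longer filters discussed in the text.
SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
       (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Quadrature mirror high-pass filter: g[k] = (-1)^k * h[N-1-k].
HIGH = [((-1) ** k) * LOW[len(LOW) - 1 - k] for k in range(len(LOW))]


def analyze(signal, taps):
    """Filter `signal` with `taps` (periodic extension), keep every 2nd sample."""
    n = len(signal)
    return [sum(t * signal[(2 * i + k) % n] for k, t in enumerate(taps))
            for i in range(n // 2)]


def wavelet_packet_energies(frame, depth):
    """Energy of each of the 2**depth wavelet packet subbands of `frame`.

    Both the low-pass and the high-pass branch are split at every level,
    which is what distinguishes a wavelet packet tree from a plain wavelet
    transform. Subbands are returned in natural tree order, which is not
    the same as increasing-frequency order. The frame length should be
    divisible by 2**depth.
    """
    nodes = [list(frame)]
    for _ in range(depth):
        nodes = [band for node in nodes
                 for band in (analyze(node, LOW), analyze(node, HIGH))]
    return [sum(c * c for c in node) for node in nodes]


def cepstral_features(frame, depth=3, n_coeffs=6):
    """Log subband energies followed by a DCT-II (assumed MFCC-like stage)."""
    energies = wavelet_packet_energies(frame, depth)
    logs = [math.log(e + 1e-12) for e in energies]  # floor avoids log(0)
    m = len(logs)
    return [sum(v * math.cos(math.pi * q * (j + 0.5) / m)
                for j, v in enumerate(logs))
            for q in range(n_coeffs)]


# Usage: one 64-sample frame of a synthetic tone.
frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
features = cepstral_features(frame)
```

Because the filter pair is orthonormal, the subband energies sum to the energy of the input frame, which gives a quick sanity check on the decomposition.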
The filters used in the wavelet packet transforms in this work are Daubechies 20, Beylkin 18 and Vaidyanathan 24. The Daubechies filters maximize the smoothness of the associated scaling function by maximizing the rate of decay of its Fourier transform [Daubechies]. Beylkin's filter was designed by placing the roots of the frequency response polynomial close to the Nyquist frequency on the real axis, thus concentrating power spectrum energy in the desired band. Vaidyanathan's filter was optimized for its length to satisfy standard requirements for effective speech coding [Wickerhauser].
The experimental work shows that, on a noisy continuous spoken digits task, word accuracy improves by up to 32% compared with the MFCC results.