Identification of Pronunciation Errors in L2 English Speech by Spanish Speaking Natives for S-Impure Sounds

Author
Valdiviezo Mora, José Aristh
https://orcid.org/0000-0001-6333-4663
Abstract
In the field of Computer-Aided Pronunciation Training (CAPT) systems, there are several approaches to detecting pronunciation errors. Among these, the state of the art in recent years has been Deep Neural Networks (DNN), but this approach is generally feasible only when a large amount of high-quality data is available. No large database was available for this project, so a dataset of 1953 audio files was sampled and collected; a dataset of this size is nevertheless considered small and not entirely suitable for a DNN model. Classical supervised learning techniques were therefore reviewed, applied and tested in this work. The main goal was to identify the pronunciation errors that native Spanish speakers make when pronouncing S-impure words. The database built from the 1953 tagged audio files was labeled with a binary classification by three independent judges to identify the recordings that contain an S-impure error. The recordings were then processed to extract key features, namely Mel Frequency Cepstral Coefficients (MFCC), Spectral-Flux (SF), Root Mean Square Energy (RMSE) and Zero-Crossing (ZR), which were used to train K-Nearest Neighbors (KNN), Random Forest (RF) and Support Vector Machine (SVM) classifiers. The SVM model, with hyperparameters tuned using the grid-search technique, provided the best solution. The results show that the model detected errors with an accuracy of 85%, which is a solid result given that the dataset had high noise levels.
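To make the feature-then-classifier idea concrete, the following is a minimal sketch, not the method used in this work. It computes two of the features named above (RMS energy and zero-crossing rate) per frame and feeds their means to a small K-Nearest Neighbors vote. Everything in it is an assumption made for illustration: the function names, the frame length and hop size, the choice of k, and the synthetic signals (a low-frequency tone and uniform noise standing in for real recordings). The labels 0 and 1 are placeholders and do not correspond to the judges' annotations.

```python
import math
import random


def frames(signal, frame_len, hop):
    """Split a list of samples into overlapping frames (hypothetical helper)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def rms_energy(frame):
    """Root mean square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def features(signal, frame_len=256, hop=128):
    """Mean RMS energy and mean zero-crossing rate over all frames.

    Frame length and hop are illustrative assumptions.
    """
    fs = frames(signal, frame_len, hop)
    mean_rms = sum(rms_energy(f) for f in fs) / len(fs)
    mean_zcr = sum(zero_crossing_rate(f) for f in fs) / len(fs)
    return (mean_rms, mean_zcr)


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


# Synthetic stand-ins for recordings (assumption: no real audio is used here).
rng = random.Random(0)


def tone(n=2048, freq=200, sr=16000):
    """Low-frequency sine wave: few zero crossings."""
    return [math.sin(2 * math.pi * freq * t / sr) for t in range(n)]


def noise(n=2048):
    """Uniform noise: many zero crossings."""
    return [rng.uniform(-1, 1) for _ in range(n)]


# Placeholder labels: 0 for tone-like signals, 1 for noise-like signals.
train = [(features(tone(freq=f)), 0) for f in (150, 200, 250)]
train += [(features(noise()), 1) for _ in range(3)]

pred_tone = knn_predict(train, features(tone(freq=180)))
pred_noise = knn_predict(train, features(noise()))
print(pred_tone, pred_noise)
```

In a real pipeline the synthetic signals would be replaced by decoded audio samples, the feature vector would be extended with MFCC and spectral flux, and the classifier would be compared against alternatives such as RF and a grid-searched SVM, as the abstract describes.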