Identification of Pronunciation Errors in L2 English Speech by Spanish Speaking Natives for S-Impure Sounds

Author
Valdiviezo Mora, José Aristh
https://orcid.org/0000-0001-6333-4663
Abstract
In Computer-Aided Pronunciation Training (CAPT) systems, there are several approaches to detecting pronunciation errors. In recent years the state of the art has been Deep Neural Networks (DNN), but this approach is generally feasible only when a large quantity of high-quality data is available. No large database was available for this project, so a dataset of 1953 audio files was collected; a dataset of this size is considered small and not entirely suitable for a DNN model. Therefore, classical supervised learning techniques were reviewed, applied, and tested in this work. The main goal was to identify the pronunciation errors that native Spanish speakers make when pronouncing S-impure words. Three independent judges labeled each of the 1953 recordings as either containing or not containing an S-impure error (a binary classification). Key features were then extracted from the recordings, namely Mel Frequency Cepstral Coefficients (MFCC), Spectral Flux (SF), Root Mean Square Energy (RMSE), and Zero-Crossing (ZR), and used to train K-Nearest Neighbors (KNN), Random Forest (RF), and Support Vector Machine (SVM) algorithms. The SVM model, with hyperparameters tuned by grid search, provided the best solution. The results show that the model detected errors with an accuracy of 85%, a solid result given that the dataset had high noise levels.
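As a rough illustration of two of the features named in the abstract, the sketch below computes frame-level Root Mean Square Energy and zero-crossing rate using only the Python standard library. It is not the author's implementation: the function names, the 16 kHz sampling rate, the 400-sample frame length, the 160-sample hop, and the synthetic test signals are all assumptions chosen for the example. MFCC and Spectral Flux are omitted because they require a Fourier transform.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def rms_energy(frame):
    """Root mean square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def extract_features(samples, frame_len=400, hop=160):
    """Return [mean RMS energy, mean zero-crossing rate] over all frames.

    frame_len and hop are assumed values (25 ms and 10 ms at 16 kHz),
    not settings taken from the thesis.
    """
    frames = frame_signal(samples, frame_len, hop)
    rms = [rms_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]
    return [sum(rms) / len(rms), sum(zcr) / len(zcr)]

# Synthetic signals standing in for real recordings (assumption: 16 kHz, 1 s).
SAMPLE_RATE = 16000
# A 200 Hz sine: vowel-like, high energy, few zero crossings.
voiced = [math.sin(2 * math.pi * 200 * k / SAMPLE_RATE) for k in range(SAMPLE_RATE)]
# A low-amplitude signal that flips sign every sample: hiss-like, many crossings.
hissy = [0.1 * (-1) ** k for k in range(SAMPLE_RATE)]

voiced_features = extract_features(voiced)
hissy_features = extract_features(hissy)
```

In a sketch like this, a fricative such as /s/ would be expected to show a higher zero-crossing rate and lower energy than a vowel, which is one plausible reason such features help separate recordings with and without an inserted vowel before the S-impure cluster. The resulting feature vectors could then be passed to a classifier such as KNN, RF, or SVM.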