Identification of Pronunciation Errors in L2 English Speech by Spanish Speaking Natives for S-Impure Sounds
Valdiviezo Mora, José Aristh
In the field of Computer-Aided Pronunciation Training (CAPT) systems, there are several approaches to detecting pronunciation errors. Among these, the state of the art in recent years has been Deep Neural Networks (DNN), but this approach is generally feasible only when a large quantity of high-quality data is available. No large database was available for this project. For that reason, a dataset of 1953 audio files was sampled and collected; however, a database of this size is considered small and not entirely suitable for a DNN model. Therefore, classical supervised learning techniques were reviewed, applied and tested in this work. The main goal was to identify the pronunciation errors that native Spanish speakers make when pronouncing S-impure words. Each of the 1953 tagged recordings in the database was given a binary label by three independent judges to identify those containing an S-impure error. The recordings were then processed to extract key features, namely Mel Frequency Cepstral Coefficients (MFCC), Spectral Flux (SF), Root Mean Square Energy (RMSE) and Zero-Crossing (ZR), which were used to train K-Nearest Neighbors (KNN), Random Forest (RF) and Support Vector Machine (SVM) classifiers. The SVM model, with hyperparameters tuned by grid search, provided the best solution. The results show that the model was able to detect errors with an accuracy of 85%, which is a solid result given that the dataset had high noise levels.
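As an illustration of two of the features named above, the following is a minimal sketch of frame-level RMS energy and zero-crossing rate, averaged into one feature vector per recording. It is not the thesis's implementation: the function names, the 400-sample frame, the 160-sample hop, the 16 kHz sample rate and the mean aggregation are all assumptions made for this example, and the input is synthetic rather than recorded speech.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def rms_energy(frame):
    """Root mean square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def extract_features(samples, frame_len=400, hop=160):
    """Return [mean RMS energy, mean zero-crossing rate] for one recording.

    frame_len and hop are assumed values (25 ms and 10 ms at 16 kHz).
    """
    frames = frame_signal(samples, frame_len, hop)
    n = len(frames)
    mean_rms = sum(rms_energy(f) for f in frames) / n
    mean_zcr = sum(zero_crossing_rate(f) for f in frames) / n
    return [mean_rms, mean_zcr]

# Synthetic stand-ins for audio (assumed 16 kHz, 0.1 s each):
SR = 16000
# A low-frequency tone: high energy, few zero crossings.
tone = [0.5 * math.sin(2 * math.pi * 200 * t / SR) for t in range(SR // 10)]
# A sign-alternating signal: low energy, a crossing at every sample.
hiss = [0.1 if t % 2 == 0 else -0.1 for t in range(SR // 10)]

tone_features = extract_features(tone)
hiss_features = extract_features(hiss)
print("tone:", tone_features)
print("hiss:", hiss_features)
```

The two synthetic signals separate cleanly on both features, which is the property a classifier such as KNN, RF or SVM would rely on. MFCC and Spectral Flux require a spectral transform and are omitted from this sketch.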