TY - JOUR
T1 - The Study of Malay's Prosodic Features Impact on Classical Arabic Accents Recognition
AU - Ibrahim, Noor Jamaliah
AU - Idris, Mohd Yamani Idna
AU - Yusoff, M. Y. Zulkifli Mohd
AU - Ramli, Roziana
AU - Raja Yusof, Raja Jamilah
PY - 2023/7/28
Y1 - 2023/7/28
N2 - Modeling individual variation in speech patterns can be challenging in Automatic Speech Recognition (ASR). In the Classical Arabic (CA) language, 20 Quranic accents are permitted for Quranic recitation. An ASR system for CA with accent detection requires a modeling method that can capture changes in speech patterns. Here, we study the accentual influences on Malay speakers' pronunciation and their prosodic impact on an ASR system for CA that identifies seven Quranic accents. The proposed ASR system was developed in three stages. First, a dataset of Surah Al-Fatihah recitations was recorded from 14 Malay speakers in seven Quranic accents, forming a total of 5,684 words. Second, various spectral and prosodic features were extracted from the dataset for classification. The final stage comprised training and testing the classification model. Existing ASR systems are often built on Gaussian Mixture Models (GMM) because of their capability to represent a wide range of sample distributions. However, a GMM is susceptible to overfitting when the model complexity is high, due to the presence of singularities. To support identification of the seven Quranic accents, a Universal Background Model (UBM) is adapted to the GMM using the Maximum A Posteriori (MAP) estimation method. UBMs were trained for each of the Quranic accents and combined to establish a final UBM with 512 mixture components. The proposed ASR system utilizing the GMM-UBM outperformed k-NN, GMM, and GMM-iVector in matching Al-Fatihah recitations to the corresponding Quranic accents. The GMM-UBM yields a testing accuracy of 86.148%, an increase of 4.435% over using GMM alone.
AB - Modeling individual variation in speech patterns can be challenging in Automatic Speech Recognition (ASR). In the Classical Arabic (CA) language, 20 Quranic accents are permitted for Quranic recitation. An ASR system for CA with accent detection requires a modeling method that can capture changes in speech patterns. Here, we study the accentual influences on Malay speakers' pronunciation and their prosodic impact on an ASR system for CA that identifies seven Quranic accents. The proposed ASR system was developed in three stages. First, a dataset of Surah Al-Fatihah recitations was recorded from 14 Malay speakers in seven Quranic accents, forming a total of 5,684 words. Second, various spectral and prosodic features were extracted from the dataset for classification. The final stage comprised training and testing the classification model. Existing ASR systems are often built on Gaussian Mixture Models (GMM) because of their capability to represent a wide range of sample distributions. However, a GMM is susceptible to overfitting when the model complexity is high, due to the presence of singularities. To support identification of the seven Quranic accents, a Universal Background Model (UBM) is adapted to the GMM using the Maximum A Posteriori (MAP) estimation method. UBMs were trained for each of the Quranic accents and combined to establish a final UBM with 512 mixture components. The proposed ASR system utilizing the GMM-UBM outperformed k-NN, GMM, and GMM-iVector in matching Al-Fatihah recitations to the corresponding Quranic accents. The GMM-UBM yields a testing accuracy of 86.148%, an increase of 4.435% over using GMM alone.
KW - Automatic speech recognition (ASR)
KW - Gaussian mixture model-universal background model (GMM-UBM)
KW - Malay speakers
KW - Quranic accents
UR - http://www.scopus.com/inward/record.url?scp=85166321810&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3299814
DO - 10.1109/ACCESS.2023.3299814
M3 - Article
AN - SCOPUS:85166321810
SN - 2169-3536
VL - 11
SP - 94589
EP - 94612
JO - IEEE Access
JF - IEEE Access
ER -