TY - JOUR
T1 - Arabic sentiment analysis using GCL-based architectures and a customized regularization function
AU - Mhamed, Mustafa
AU - Sutcliffe, Richard
AU - Sun, Xia
AU - Feng, Jun
AU - Retta, Ephrem Afele
N1 - Publisher Copyright: © 2023 Karabuk University
PY - 2023/7
Y1 - 2023/7
N2 - Sentiment analysis aims to extract emotions from textual data. With the proliferation of social media platforms and the flow of data, particularly in the Arabic language, significant challenges have arisen, necessitating the development of frameworks to handle them. In this paper, we first design an architecture called Gated Convolution Long (GCL) to perform Arabic sentiment analysis. GCL can overcome difficulties with long-sequence training samples, extracting the optimal features that help improve Arabic sentiment analysis performance for binary and multi-class classification. The proposed method is trained and tested on various Arabic datasets; the results are better than the baselines in all cases. GCL includes a Custom Regularization Function (CRF), which improves performance and optimizes the validation loss. We carry out an ablation study to investigate the effect of removing CRF. CRF is shown to make a difference of up to 5.10% (2C) and 4.12% (3C). Furthermore, we study the relationship between Modern Standard Arabic and five Arabic dialects via a cross-dialect training study. Finally, we apply GCL with standard regularization (GCL+L1, GCL+L2, and GCL+LElasticNet) and with our Lnew on two large Arabic sentiment datasets; GCL+Lnew gave the highest result (92.53%) in less time.
AB - Sentiment analysis aims to extract emotions from textual data. With the proliferation of social media platforms and the flow of data, particularly in the Arabic language, significant challenges have arisen, necessitating the development of frameworks to handle them. In this paper, we first design an architecture called Gated Convolution Long (GCL) to perform Arabic sentiment analysis. GCL can overcome difficulties with long-sequence training samples, extracting the optimal features that help improve Arabic sentiment analysis performance for binary and multi-class classification. The proposed method is trained and tested on various Arabic datasets; the results are better than the baselines in all cases. GCL includes a Custom Regularization Function (CRF), which improves performance and optimizes the validation loss. We carry out an ablation study to investigate the effect of removing CRF. CRF is shown to make a difference of up to 5.10% (2C) and 4.12% (3C). Furthermore, we study the relationship between Modern Standard Arabic and five Arabic dialects via a cross-dialect training study. Finally, we apply GCL with standard regularization (GCL+L1, GCL+L2, and GCL+LElasticNet) and with our Lnew on two large Arabic sentiment datasets; GCL+Lnew gave the highest result (92.53%) in less time.
KW - Arabic sentiment analysis (ASA)
KW - Custom regularization function (CRF)
KW - Gated convolution long (GCL)
KW - Natural language processing (NLP)
UR - http://www.scopus.com/inward/record.url?scp=85160550825&partnerID=8YFLogxK
U2 - 10.1016/j.jestch.2023.101433
DO - 10.1016/j.jestch.2023.101433
M3 - Article
AN - SCOPUS:85160550825
SN - 2215-0986
VL - 43
JO - Engineering Science and Technology, an International Journal
JF - Engineering Science and Technology, an International Journal
M1 - 101433
ER -