Detecting Offensive Language on Arabic Social Media Using Deep Learning

Hanane Mohaouchane, Asmaa Mourhir, Nikola S. Nikolov

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Offensive content on social media, such as verbal attacks, demeaning comments, or hate speech, has many negative effects on users. Automatically detecting offensive language on Arabic social media is an important step towards regulating such content for Arabic-speaking users. This paper evaluates the performance of four neural network architectures on this task: a Convolutional Neural Network (CNN), a Bidirectional Long Short-Term Memory network (Bi-LSTM), a Bi-LSTM with an attention mechanism, and a combined CNN-LSTM architecture. The networks are trained and tested on a labeled dataset of Arabic YouTube comments. We run the dataset through a series of pre-processing steps and use Arabic word embeddings to represent the comments. We also apply Bayesian optimization to tune the hyperparameters of the models. Each network is trained and tested using 5-fold cross-validation. The CNN-LSTM achieves the highest recall (83.46%), followed by the CNN (82.24%), the Bi-LSTM with attention (81.51%), and the Bi-LSTM (80.97%).
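The evaluation protocol named in the abstract (5-fold cross-validation scored by recall) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the helper names, the unstratified shuffled split, the averaging of per-fold recall, and the keyword baseline standing in for a neural network are all assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the positive (offensive) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def cross_validated_recall(X, y, train_fn, k=5, seed=0):
    """Mean recall over k folds (averaging is an assumption of this sketch).

    train_fn(X_train, y_train) must return a function mapping one comment
    to a predicted label.
    """
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        predict = train_fn([X[j] for j in train_idx], [y[j] for j in train_idx])
        y_pred = [predict(X[j]) for j in test_idx]
        scores.append(recall([y[j] for j in test_idx], y_pred))
    return sum(scores) / len(scores)


def train_keyword_baseline(X_train, y_train):
    """Hypothetical stand-in for a neural model: flag a comment if it
    contains a word seen only in offensive training comments."""
    offensive, clean = set(), set()
    for text, label in zip(X_train, y_train):
        (offensive if label == 1 else clean).update(text.split())
    markers = offensive - clean
    return lambda text: 1 if any(w in markers for w in text.split()) else 0
```

Any of the four architectures could in principle be plugged in through `train_fn`, provided it returns a per-comment prediction function; a stratified split would likely be preferable when the offensive class is rare.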

Original language: English
Title of host publication: 2019 6th International Conference on Social Networks Analysis, Management and Security, SNAMS 2019
Editors: Mohammad Alsmirat, Yaser Jararweh
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 466-471
Number of pages: 6
ISBN (Electronic): 9781728129464
Publication status: Published - Oct 2019
Event: 6th International Conference on Social Networks Analysis, Management and Security, SNAMS 2019 - Granada, Spain
Duration: 22 Oct 2019 to 25 Oct 2019

Publication series

Name: 2019 6th International Conference on Social Networks Analysis, Management and Security, SNAMS 2019

Conference

Conference: 6th International Conference on Social Networks Analysis, Management and Security, SNAMS 2019
Country/Territory: Spain
City: Granada
Period: 22/10/19 to 25/10/19

Keywords

  • Arabic language
  • attention model
  • convolutional neural network
  • deep learning
  • long short-term memory
  • offensive language detection
  • social media
