Detection of Racism on Multilingual Social Media: An NLP Approach

Ikram El Miqdadi, Jamal Kharroubi, Nikola S. Nikolov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper compares combinations of text vectorization methods and machine learning classifiers for detecting racism on multilingual social media. We train classification models on Facebook comments and tweets in three languages: English, French and Arabic. For the English-language comments, k-nearest neighbors (KNN) with TF-IDF works best, with an accuracy of 78.34%. For French, a support vector machine (SVM) classifier with bag-of-words (BOW) features reaches an accuracy of 82.56%. For Arabic, KNN coupled with BOW reaches an accuracy of 91.13%. Overall, our results suggest that the combination of SVM and TF-IDF is the best choice for detecting racism on social media that contains English, French and Arabic content at the same time. As part of this work, we also present a new annotated dataset of social media comments in the three languages.
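The kind of comparison the abstract describes can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: the comments are synthetic placeholders (not drawn from the paper's dataset), the smoothed IDF formula, cosine similarity and `k=3` are assumptions chosen for illustration, and the SVM classifier is left out for brevity. All function names (`fit_weights`, `vectorize`, `knn_predict`, `evaluate`) are hypothetical.

```python
import math
from collections import Counter

# Synthetic placeholder comments (NOT the paper's data); 1 = flagged, 0 = not flagged.
TRAIN = [
    ("group_x people are inferior and should leave", 1),
    ("i hate group_x they ruin everything", 1),
    ("group_y are inferior and not welcome here", 1),
    ("lovely weather in the city today", 0),
    ("the match last night was great", 0),
    ("i enjoy cooking with friends today", 0),
]
TEST = [
    ("group_y people should leave", 1),
    ("great weather for the match", 0),
]

def tokenize(text):
    # Whitespace tokenization; real preprocessing is language-specific.
    return text.lower().split()

def fit_weights(docs, scheme):
    """Learn per-term weights from training docs: 1.0 for BOW, smoothed IDF for TF-IDF."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))  # document frequency
    if scheme == "bow":
        return {t: 1.0 for t in df}
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}

def vectorize(tokens, weights):
    """Sparse vector: term count times weight, restricted to the training vocabulary."""
    counts = Counter(tokens)
    return {t: c * weights[t] for t, c in counts.items() if t in weights}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(x, train_vecs, train_labels, k=3):
    """Majority vote among the k training vectors most similar to x."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(x, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def evaluate(scheme, k=3):
    """Accuracy of KNN on TEST using the given vectorization scheme."""
    docs = [tokenize(text) for text, _ in TRAIN]
    labels = [label for _, label in TRAIN]
    weights = fit_weights(docs, scheme)
    train_vecs = [vectorize(doc, weights) for doc in docs]
    correct = sum(
        knn_predict(vectorize(tokenize(text), weights), train_vecs, labels, k) == label
        for text, label in TEST
    )
    return correct / len(TEST)

results = {scheme: evaluate(scheme) for scheme in ("bow", "tfidf")}
print(results)
```

On a toy set this small the two accuracies say nothing about the paper's reported numbers; the sketch only shows how a vectorizer and a classifier are swapped independently and scored with the same metric.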

Original language: English
Title of host publication: Information Systems and Technologies - WorldCIST 2023
Editors: Alvaro Rocha, Hojjat Adeli, Gintautas Dzemyda, Fernando Moreira, Valentina Colla
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 436-445
Number of pages: 10
ISBN (Print): 9783031456411
Publication status: Published - 2024
Event: 11th World Conference on Information Systems and Technologies, WorldCIST 2023 - Pisa, Italy
Duration: 4 Apr 2023 to 6 Apr 2023

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 799 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: 11th World Conference on Information Systems and Technologies, WorldCIST 2023
Country/Territory: Italy
City: Pisa
Period: 4/04/23 to 6/04/23

Keywords

  • detection of racism
  • machine learning
  • NLP
