An explainable artificial intelligence model for detecting xenophobic tweets
Abstract
Xenophobia is hate speech characterized by hatred, fear, or rejection of people from other communities. The worldwide growth of the internet has driven a rapid expansion in the use of social networks, and their excessive use has fostered hate speech, primarily because of the pseudo-anonymity they provide. On occasion, the violent behavior present in social network discourse breaks the barriers of the internet and becomes physical violence in real life. Research on classifying xenophobia in social networks is very recent, so few databases are currently available for this task. We therefore created a new Twitter xenophobia database whose main feature is that it was labeled by experts in international relations, psychology, and sociology. The database contains 10,073 manually tagged tweets, of which 2,017 belong to the xenophobia class. An extensive effort is currently under way to migrate from unexplained, black-box machine learning classifiers to explainable artificial intelligence (XAI) models that allow the classification to be interpreted and understood. We introduce an XAI model based on contrast patterns, jointly with a new interpretable feature representation based on syntactic, semantic, and sentiment analysis, to understand the characteristics of xenophobic posts on social networks. The new representation comprises 38 features, including information on sentiments, emotions, intentions, syntactic characteristics, and keywords related to xenophobia. Our results show that this feature representation, combined with a contrast pattern-based classifier, obtained average AUC and F1 scores of 0.86 and 0.77, respectively. These experiments show that XAI models can achieve classification results equal to or better than those of unexplained models.
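The AUC and F1 metrics reported above can be made concrete with a small, self-contained sketch. This is not the paper's code; the labels and scores below are hypothetical, and the metrics are implemented directly from their standard definitions.

```python
# Illustrative sketch (not the paper's code): the AUC and F1 metrics
# reported in the abstract, computed in plain Python on hypothetical
# labels and scores for the xenophobia (positive) class.

def auc_score(y_true, y_score):
    """Probability that a random positive outranks a random negative
    (ties count as half), i.e., the area under the ROC curve."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical ground truth (1 = xenophobic) and classifier scores.
y_true = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.3, 0.1, 0.4, 0.7, 0.2, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

auc = auc_score(y_true, y_score)
f1 = f1_score(y_true, y_pred)
```

AUC is threshold-free (it ranks scores), while F1 depends on the decision threshold used to binarize the scores, which is why both are commonly reported together.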
Furthermore, the new interpretable feature representation based on emotions, sentiments, intentions, and keywords related to xenophobia allowed us to extract a set of the words most used in xenophobic posts. Combined with the XAI contrast pattern-based model, it also allowed us to extract a set of patterns describing the xenophobic and non-xenophobic classes. These patterns are presented in a language close to that of the experts and contextualize words associated with xenophobia through emotions, intentions, and sentiments.
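The idea of a contrast pattern can be illustrated with a minimal sketch, under assumptions not taken from the paper: a pattern is a conjunction of conditions over the interpretable features, and its support (fraction of matching instances) should differ sharply between the two classes. The feature names and values below are hypothetical.

```python
# Illustrative sketch (not the paper's algorithm): a contrast pattern is a
# conjunction of feature conditions whose support differs sharply between
# the xenophobic and non-xenophobic classes.

def support(pattern, instances):
    """Fraction of instances satisfying every condition in the pattern."""
    hits = sum(1 for x in instances if all(cond(x) for cond in pattern))
    return hits / len(instances)

# Hypothetical interpretable feature vectors, one dict per tweet.
xenophobic = [
    {"anger": 0.8, "has_xeno_keyword": True},
    {"anger": 0.7, "has_xeno_keyword": True},
    {"anger": 0.9, "has_xeno_keyword": False},
]
non_xenophobic = [
    {"anger": 0.1, "has_xeno_keyword": False},
    {"anger": 0.6, "has_xeno_keyword": False},
    {"anger": 0.2, "has_xeno_keyword": True},
]

# Hypothetical pattern: [anger > 0.5] AND [has_xeno_keyword].
pattern = [lambda x: x["anger"] > 0.5, lambda x: x["has_xeno_keyword"]]

supp_pos = support(pattern, xenophobic)       # high in the target class
supp_neg = support(pattern, non_xenophobic)   # low (or zero) in the other
growth = supp_pos / supp_neg if supp_neg else float("inf")
```

Patterns of this form are readable as rules ("high anger AND a xenophobia-related keyword"), which is what makes a contrast pattern-based classifier explainable: each prediction can be traced back to the patterns that fired.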