Providing a robust sentiment analysis model evaluation based on extrinsic and intrinsic metrics
Abstract
In recent years, the world has faced the COVID-19 pandemic. Its repercussions extend beyond mortality rates and declining physical health to social life, finances, and practically every other area of life. Countries have responded with different mitigation measures, including both clinical and non-clinical interventions, each of which presents significant challenges to mental, emotional, and physical health and their respective programs. It is difficult for governmental or public organizations to incorporate mental and emotional health feedback into their decision-making processes, because even taking the measurements is complex. To this end, a collection of 760,064,879 public domain tweets was analyzed using several open sentiment analysis tools to investigate the collective emotional state during the epidemic's development and news cycles, and the impact of government statements and actions, and to offer extrinsic measurements of the success of these sentiment analysis techniques. This research aims to evaluate several language models robustly by using both intrinsic and extrinsic evaluation metrics. The extrinsic evaluation is a large-scale sentiment analysis study of COVID-19-related tweets generated in Mexico during 2020, the first year of the pandemic. Time series analysis and other descriptive statistics are then used to better understand the emotional response to the pandemic, applying state-of-the-art language models and comparing their performance with one another.
Description
https://orcid.org/0000-0001-6623-1758