Transformer-based Text Classification on Unified Bangla Multi-class Emotion Corpus

Md Sakib Ullah Sourav, Mohammad Sultan Mahmud, Hua Zheng, Mohammad Aljaidi, Md. Simul Hasan Talukder, Rejwan Bin Sulaiman, Abdullah Hafez Nur, Ahmad Al-Qerem

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Emotion classification is a critical task because of its importance in studying people’s opinions on various Web 2.0 services. Most existing research focuses on English, with little work on low-resource languages. Although sentiment analysis, and emotion classification in particular, has received increasing attention for English in recent years, little work has been done for Bangla, one of the world’s most widely spoken languages. In this research, we propose a complete set of approaches for identifying and extracting emotions from Bangla texts. We provide a Bangla emotion classifier for six classes (anger, disgust, fear, joy, sadness, and surprise) built on transformer-based models, which have recently shown strong results, especially for high-resource languages. The Unified Bangla Multi-class Emotion Corpus (UBMEC) is used to assess the performance of our models. UBMEC was created by combining two previously released, manually labelled datasets of Bangla comments covering six emotion classes with new manually labelled Bangla comments that we contributed. The corpus and the code used in this work are publicly available.
Original language: English
Title of host publication: 2024 25th International Arab Conference on Information Technology (ACIT)
Place of Publication: Piscataway, US
Publisher: IEEE
Pages: 1-7
Number of pages: 7
ISBN (Electronic): 9798331540012
ISBN (Print): 9798331540029
Publication status: Published - 10 Dec 2024
Externally published: Yes
Event: 2024 25th International Arab Conference on Information Technology (ACIT) - Zarqa University, Zarqa, Jordan
Duration: 10 Dec 2024 to 12 Dec 2024
https://acit2k.org/ACIT/

Publication series

Name: International Arab Conference on Information Technology (ACIT)
Publisher: IEEE
ISSN (Print): 2831-493X
ISSN (Electronic): 2831-4948

Conference

Conference: 2024 25th International Arab Conference on Information Technology (ACIT)
Abbreviated title: ACIT'2024
Country/Territory: Jordan
City: Zarqa
Period: 10/12/24 to 12/12/24

Keywords

  • Bangla corpus
  • Bangla emotion analysis
  • Text classification
  • Multi-class emotion classification
  • Natural language processing
