Ok-NB: An Enhanced OPTICS and k-Naive Bayes Classifier for Imbalance Classification with Overlapping

Zahid Ahmed, Biju Issac, Sufal Das*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Class imbalance has received considerable attention in recent years. It poses substantial hurdles for conventional classifiers, especially when combined with overlapping instances, which further increase the complexity of the classification task. In this study, we propose a novel density-based method, OPTICS-based k-Naive Bayes (Ok-NB), which combines the Ordering Points To Identify the Clustering Structure (OPTICS) algorithm with the Naive-Bayes approach to handle overlapping and imbalanced data at the same time. Ok-NB organises the training data into overlapping and non-overlapping groups based on their density and reachability, while the Naive-Bayes technique maps each test sample to the appropriate class. The method offers adaptability and reliability in classifying complex datasets with overlapping and imbalanced properties. Cluster-based proximity assessment and probabilistic classification are combined to improve classification accuracy and to ensure that the opinions of the most reliable neighbours are given the greatest weight in the decision-making process. Extensive experiments on 21 benchmark datasets show that the proposed approach achieves high classification accuracy. These results demonstrate the effectiveness of the proposed approach, and its superiority over existing state-of-the-art methods, in tackling overlapping and imbalance challenges in classification tasks.
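
As a rough illustration of the two ingredients named in the abstract, and not the authors' Ok-NB implementation, the sketch below computes the core distance and reachability distance that OPTICS is built on, and fits a small Gaussian Naive-Bayes classifier. It assumes an unbounded neighbourhood radius, counts a point as its own first neighbour, and uses Gaussian likelihoods. All function names and the toy data are hypothetical.

```python
import math

def core_distance(points, i, min_pts):
    """Distance from points[i] to its min_pts-th nearest neighbour.
    Assumption: the point itself counts as the first neighbour."""
    dists = sorted(math.dist(points[i], q) for q in points)
    return dists[min_pts - 1]

def reachability_distance(points, p, o, min_pts):
    """Reachability of points[p] from points[o]:
    the larger of o's core distance and the distance between o and p."""
    return max(core_distance(points, o, min_pts),
               math.dist(points[o], points[p]))

def fit_gaussian_nb(X, y):
    """Store a log prior plus per-feature mean and variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(cols, means)]
        model[label] = (math.log(n / len(y)), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for sample x."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=log_posterior)

# Toy imbalanced data: four majority-class points and two minority-class points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 6.0)]
labels = [0, 0, 0, 0, 1, 1]

model = fit_gaussian_nb(points, labels)
near = reachability_distance(points, 1, 0, min_pts=3)  # point in the same dense group
far = reachability_distance(points, 4, 0, min_pts=3)   # point in the distant group
```

In this toy data, `far` is much larger than `near`, which is the kind of density signal a reachability-based grouping relies on. How Ok-NB actually forms its overlapping and non-overlapping groups and weights neighbours is not specified in the abstract and is not modelled here.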

Original language: English
Pages (from-to): 57458-57477
Number of pages: 20
Journal: IEEE Access
Volume: 12
DOIs
Publication status: Published - 22 Apr 2024

Keywords

  • Bayes methods
  • Classification
  • Classification algorithms
  • Clustering algorithms
  • Imbalanced Data
  • Naive-Bayes
  • Noise measurement
  • OPTICS
  • Overlapped Data
  • Sensitivity
  • Support vector machines
  • Training
