Clustering asymmetrical data with outliers: Parsimonious mixtures of contaminated mean-mixture of normal distributions

Mehrdad Naderi, Mehdi Jabbari Nooghabi*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Mixture modeling has emerged as a statistical tool for unsupervised model-based clustering of heterogeneous data. A framework that uses contaminated mean-mixture of normal distributions as the components of the mixture model is designed to accommodate asymmetric data with outliers. Fourteen parsimonious variants of the postulated model are introduced by employing an eigenvalue decomposition of the component scale matrices. Simultaneous clustering and outlier detection is a key advantage of the proposed model in analyzing non-normally distributed data. A computationally feasible and flexible EM-type algorithm is outlined for obtaining maximum likelihood parameter estimates. Moreover, the score vector and empirical information matrix for calculating asymptotic standard errors of the parameter estimates are derived using an information-based approach. The applicability of the proposed method is demonstrated through the analysis of simulated and real datasets with varying proportions of outliers.

Original language: English
Article number: 115433
Number of pages: 18
Journal: Journal of Computational and Applied Mathematics
Volume: 437
Early online date: 13 Jul 2023
Publication status: Published - 1 Feb 2024
Externally published: Yes

Keywords

  • Contaminated mean-mixture of normal distributions
  • Eigenvalue decomposition
  • EM-type algorithm
  • Finite mixture model
  • Outliers detection
