To Remove or not to Remove: the Impact of Outlier Handling on Significance Testing in Testosterone Data

Thomas Pollet, Leander van der Meij

Research output: Contribution to journal › Article › peer-review

Abstract

Outlier removal is common in hormonal research. Here we investigated to what extent removing outliers in hormonal data leads to divergent statistical conclusions. We first show that the most common outlier detection rule is based on a number of standard deviations (SD) from the mean. Next, we used simulations to examine how often statistical conclusions diverge, that is, how often a test with outlier exclusion yields a statistically significant result whereas the test with outlier inclusion does not, or vice versa (at p = .05). Simulations were run in duplicate for independent samples t-tests and repeated measures ANOVA designs, and were based on real testosterone (T) data and a theoretical gamma distribution of T data. We ran simulations for different sample sizes (30 to 100) and outlier removal rules (2.5 SD and 3 SD). For significant t-tests, we found that in 14% to 55% of the significant cases a test with outlier exclusion yielded a statistically significant result whereas the test with outlier inclusion did not, or vice versa (median p difference: .03–.06). For significant repeated measures ANOVAs, this was the case in 7% to 28% of the significant cases (median p difference: .01–.03). When reporting any test that would lead to a statistically significant result (the test with inclusion of outliers, the test with exclusion, or both), between 5.15% and 6.89% of the independent samples t-tests were statistically significant; for the repeated measures ANOVA design this was between 6.32% and 7.62% of the tests. Our results suggest that outlier handling can have a substantial impact on significance testing. We suggest several potential solutions for handling outliers and we argue for a careful assessment of handling outliers in hormonal data.
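The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' simulation code. It rests on several assumptions that the abstract does not state: outliers are removed within each group separately, both groups are drawn from the same gamma distribution (shape 2, scale 1, chosen arbitrarily), and the p value comes from a normal approximation to a Welch-type t statistic so that the example needs only the Python standard library. All function names are hypothetical.

```python
import random
import statistics
from math import sqrt


def remove_sd_outliers(values, k=3.0):
    """Drop values lying more than k sample SDs from the sample mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= k * sd]


def approx_p_value(a, b):
    """Two-sided p value for a difference in means.

    Assumption: uses a normal approximation to the Welch t statistic,
    not an exact t distribution.
    """
    se = sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.fmean(a) - statistics.fmean(b)) / se
    return 2.0 * (1.0 - statistics.NormalDist().cdf(abs(z)))


def divergence_rate(n=30, k=3.0, runs=2000, alpha=0.05, seed=1):
    """Among runs where at least one test is significant, return the share
    in which the tests with and without outlier exclusion disagree."""
    rng = random.Random(seed)
    significant = 0
    divergent = 0
    for _ in range(runs):
        # Both groups share one distribution, so any significant result
        # is a false positive (illustrative gamma parameters).
        a = [rng.gammavariate(2.0, 1.0) for _ in range(n)]
        b = [rng.gammavariate(2.0, 1.0) for _ in range(n)]
        sig_all = approx_p_value(a, b) < alpha
        sig_trim = approx_p_value(
            remove_sd_outliers(a, k), remove_sd_outliers(b, k)
        ) < alpha
        if sig_all or sig_trim:
            significant += 1
            divergent += sig_all != sig_trim
    return divergent / significant if significant else 0.0
```

Calling `divergence_rate(n=30, k=2.5)` and `divergence_rate(n=100, k=3.0)` gives a rough sense of how the disagreement rate depends on sample size and removal rule. The numbers from this sketch are not expected to reproduce the figures reported in the abstract, which rest on real testosterone data and exact tests.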
Original language: English
Pages (from-to): 43-60
Journal: Adaptive Human Behavior and Physiology
Volume: 3
Issue number: 1
Early online date: 29 Aug 2016
Publication status: Published - Mar 2017

Keywords

  • Sex hormones
  • Statistical design
  • p value
  • Outlier handling
  • Statistical simulation
