Researcher degrees of freedom in statistical software contribute to unreliable results: A comparison of nonparametric analyses conducted in SPSS, SAS, Stata, and R
Document Type
Article
Publication Date
9-1-2023
Keywords
Nonparametric procedures, Reproducibility, Researcher degrees of freedom, Statistical conclusion validity, Statistical software
Abstract
Researcher degrees of freedom can affect the results of hypothesis tests and, consequently, the conclusions drawn from the data. Previous research has documented variability in accuracy, speed, and documentation of output across various statistical software packages. In the current investigation, we conducted Pearson’s chi-square test of independence, Spearman’s rank-ordered correlation, Kruskal–Wallis one-way analysis of variance, Wilcoxon Mann–Whitney U rank-sum tests, and Wilcoxon signed-rank tests, along with estimates of skewness and kurtosis, on large, medium, and small samples of real and simulated data in SPSS, SAS, Stata, and R and compared the results with those obtained through hand calculation using the raw computational formulas. Multiple inconsistencies were found in the results produced by the statistical packages due to algorithmic variation, computational error, and statistical output. The most notable inconsistencies were due to algorithmic variations in the computation of Pearson’s chi-square test conducted on 2 × 2 tables, where differences in p-values reported by different software packages ranged from .005 to .162, largely as a function of sample size. We discuss how such inconsistencies may influence the conclusions drawn from the results of statistical analyses depending on the statistical software used, and we urge researchers to analyze their data across multiple packages to check for inconsistencies and report details regarding the statistical procedure used for data analysis.
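As an illustration of how an algorithmic choice can shift a 2 × 2 chi-square p-value, the sketch below computes Pearson's chi-square with and without Yates' continuity correction. The abstract does not name the specific variations involved, so treating the continuity correction as the source of the difference is an assumption made here for illustration (R's `chisq.test`, for instance, applies it to 2 × 2 tables by default). The function name `chi2_2x2` and the cell counts are hypothetical and are not taken from the article.

```python
import math


def chi2_2x2(a, b, c, d, yates=False):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With yates=True, Yates' continuity
    correction is applied before squaring the deviation.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    diff = abs(a * d - b * c)
    if yates:
        # Shrink the deviation by n/2, but never below zero.
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper-tail chi-square probability
    # equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical small-sample table (illustrative counts only).
table = (11, 4, 5, 10)

stat_plain, p_plain = chi2_2x2(*table, yates=False)
stat_yates, p_yates = chi2_2x2(*table, yates=True)

print(f"uncorrected: chi2 = {stat_plain:.4f}, p = {p_plain:.4f}")
print(f"Yates:       chi2 = {stat_yates:.4f}, p = {p_yates:.4f}")
print(f"difference in p = {p_yates - p_plain:.4f}")
```

With these hypothetical counts, the uncorrected test falls below the conventional .05 threshold while the corrected test does not, so the same data could support different conclusions depending on which variant a package reports by default. This is why reporting the exact procedure and options used matters.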
Journal Title
Behavior Research Methods
Volume
55
Issue
6
First Page
2813
Last Page
2837
DOI
https://doi.org/10.3758/s13428-022-01932-2
Recommended Citation
Hodges, Cooper B.; Stone, Bryant M.; Johnson, Paula K.; Carter, James H.; Sawyers, Chelsea K.; Roby, Patricia R.; and Lindsey, Hannah M., "Researcher degrees of freedom in statistical software contribute to unreliable results: A comparison of nonparametric analyses conducted in SPSS, SAS, Stata, and R" (2023). Faculty Publications. 4936.
https://digitalcommons.andrews.edu/pubs/4936