Skip to main content

Data from: Using multiple imputation to estimate missing data in meta-regression

Cite this dataset

Ellington, E. Hance et al. (2015). Data from: Using multiple imputation to estimate missing data in meta-regression [Dataset]. Dryad.


1. There is a growing need for scientific synthesis in ecology and evolution. In many cases, meta-analytic techniques can be used to complement such synthesis. However, missing data is a serious problem for any synthetic efforts and can compromise the integrity of meta-analyses in these and other disciplines. Currently, the prevalence of missing data in meta-analytic datasets in ecology and the efficacy of different remedies for this problem have not been adequately quantified. 2. We generated meta-analytic datasets based on literature reviews of experimental and observational data and found that missing data were prevalent in meta-analytic ecological datasets. We then tested the performance of complete case removal (a widely used method when data are missing) and multiple imputation (an alternative method for data recovery) and assessed model bias, precision, and multi-model rankings under a variety of simulated conditions using published meta-regression datasets. 3. We found that complete case removal led to biased and imprecise coefficient estimates and yielded poorly specified models. In contrast, multiple imputation provided unbiased parameter estimates with only a small loss in precision. The performance of multiple imputation, however, was dependent on the type of data missing. It performed best when missing values were weighting variables, but performance was mixed when missing values were predictor variables. Multiple imputation performed poorly when imputing raw data which was then used to calculate effect size and the weighting variable. 4. We conclude that complete case removal should not be used in meta-regression, and that multiple imputation has the potential to be an indispensable tool for meta-regression in ecology and evolution. However, we recommend that users assess the performance of multiple imputation by simulating missing data on a subset of their data before implementing it to recover actual missing data.

Usage notes