Dryad

Data from: Incomplete specimens in geometric morphometric analyses

Data files

Oct 11, 2014 version files 58.84 KB

Abstract

1. The analysis of morphological diversity frequently relies on the use of multivariate methods for characterizing biological shape. However, many of these methods are intolerant of missing data, which can limit the use of rare taxa and hinder the study of broad patterns of ecological diversity and morphological evolution. This study applied a multi-dataset approach to compare variation in missing data estimation and its effect on geometric morphometric analysis across taxonomically variable groups, landmark positions and sample sizes.
2. Missing morphometric landmark data was simulated from five real, complete datasets, including modern fish, primates and extinct theropod dinosaurs. Missing landmarks were then estimated using several standard approaches and a geometric-morphometric-specific method. The accuracy of missing data estimation was determined for each estimation method, landmark position, and morphological dataset. Procrustes superimposition was used to compare the eigenvectors and principal component scores from a geometric morphometric analysis of the original landmark data with those from datasets in which (A) missing values were estimated or (B) simulated incomplete specimens were excluded, at varying levels of specimen incompleteness and sample size.
3. Standard estimation techniques were more reliable estimators and had less impact on morphometric analysis than a geometric-morphometric-specific estimator. For most datasets and estimation techniques, estimating missing data produced a better fit to the structure of the original data than exclusion of incomplete specimens, and this was maintained even at considerably reduced sample sizes. The impact of missing data on geometric morphometric analysis was driven disproportionately by the most fragmentary specimens.
4. Missing data estimation was influenced by variability of specific anatomical features, and may be improved by a better understanding of shape variation present in a dataset.
5. Our results suggest that including incomplete specimens through the use of effective missing data estimators better reflects the patterns of shape variation within a dataset than using only complete specimens; however, the effectiveness of missing data estimation can be maximized by excluding only the most incomplete specimens. It is advised that missing data estimators be evaluated for each dataset and landmark independently, as the effectiveness of estimators can vary strongly and unpredictably between different taxa and structures.
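
The simulate-then-estimate procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not name the "standard approaches" it tested, so mean substitution is used here purely as an assumed stand-in, and the function names (simulate_missing, estimate_by_mean, estimation_error) and the synthetic 2D landmark data are hypothetical. A real analysis would normally work on Procrustes-aligned coordinates, a step omitted here for brevity.

```python
import math
import random


def simulate_missing(specimens, frac_specimens, n_missing, rng):
    """Copy `specimens` and remove landmarks from a random subset of them.

    `specimens` is a list of specimens, each a list of (x, y) tuples.
    Removed landmarks are marked as None.
    """
    data = [list(spec) for spec in specimens]
    n_incomplete = round(frac_specimens * len(data))
    for i in rng.sample(range(len(data)), n_incomplete):
        for j in rng.sample(range(len(data[i])), n_missing):
            data[i][j] = None
    return data


def estimate_by_mean(data):
    """Mean substitution (an assumed example of a 'standard' estimator).

    Each missing landmark is replaced by the mean position of that landmark
    across the specimens that preserve it. Assumes every landmark is
    preserved in at least one specimen.
    """
    filled = [list(spec) for spec in data]
    for j in range(len(data[0])):
        known = [spec[j] for spec in data if spec[j] is not None]
        mean = tuple(sum(coord) / len(known) for coord in zip(*known))
        for spec in filled:
            if spec[j] is None:
                spec[j] = mean
    return filled


def estimation_error(original, incomplete, estimated):
    """Mean Euclidean distance between true and estimated positions,
    taken only over the landmarks that were removed."""
    dists = [
        math.dist(original[i][j], estimated[i][j])
        for i, spec in enumerate(incomplete)
        for j, lm in enumerate(spec)
        if lm is None
    ]
    return sum(dists) / len(dists)


rng = random.Random(0)
# Synthetic stand-in for a complete dataset: 30 specimens, 10 landmarks, 2D.
complete = [
    [(rng.gauss(j, 0.1), rng.gauss(-j, 0.1)) for j in range(10)]
    for _ in range(30)
]
incomplete = simulate_missing(complete, frac_specimens=0.5, n_missing=3, rng=rng)
estimated = estimate_by_mean(incomplete)
print(estimation_error(complete, incomplete, estimated))
```

Because the removed landmarks come from complete specimens, their true positions are known and the error can be measured directly. Repeating the run over different estimators, landmarks and proportions of incomplete specimens would give the kind of per-dataset, per-landmark evaluation the abstract recommends.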