Are geometric morphometric analyses replicable? Evaluating landmark measurement error and its impact on extant and fossil Microtus classification

Fox, Nathaniel¹; Veneracion, Joseph¹; Blois, Jessica¹

Published Mar 26, 2020 on Dryad. https://doi.org/10.6071/M3KD40

Data files

Mar 17, 2020 version files (4.02 GB)

MicrotusSupplemental.zip (4.02 GB)

Mar 26, 2020 version files (4.02 GB)

MicrotusSupplemental.zip (4.02 GB)

Supplemental_NoImages.zip (1.75 MB)

Abstract

Geometric morphometric analyses are frequently employed to quantify biological shape and shape variation. Despite the popularity of this technique, measurement error in geometric morphometric datasets and its impact on statistical results are seldom assessed in the literature. Here, we evaluate error on 2D landmark coordinate configurations of the lower first molar of five North American Microtus (vole) species. We acquired data from the same specimens several times to quantify error from four data acquisition sources: specimen presentation, imaging device, interobserver variation, and intraobserver variation. We then evaluated the impact of those errors on linear discriminant analysis-based classifications of the five species using recent specimens of known species affinity and fossil specimens of unknown species affinity. Results indicate that data acquisition error can be substantial, sometimes explaining >30% of the total variation among datasets. Comparisons of datasets digitized by different individuals exhibit the greatest discrepancies in landmark precision, and comparisons of datasets photographed from different presentation angles yield the greatest discrepancies in species classification results. All error sources impact statistical classification to some extent. For example, no two landmark dataset replicates exhibit the same predicted group memberships of recent or fossil specimens. Our findings emphasize the need to mitigate error as much as possible during geometric morphometric data collection. Though the impact of measurement error on statistical fidelity is likely analysis-specific, we recommend that all geometric morphometric studies standardize specimen imaging equipment, specimen presentations (if analyses are 2D), and landmark digitizers to reduce error and subsequent analytical misinterpretations.

Methods

The following methodological description is adapted from the "Study design", "Data preparation", and "Quantifying measurement error impacts on classification statistics" methods subsections of the associated manuscript (Fox et al., in press):

We replicated the 2D digital specimen images (n=247) and m1 landmark configurations (n=21 landmarks) of McGuire (2011) to quantify measurement error from four data acquisition sources and its impact on Microtus species classification. All photographed specimens are from the University of California Museum of Vertebrate Zoology (MVZ); see Appendix I of McGuire (2011) for a list of the recent Microtus specimens included. We were unable to acquire four of the 251 original specimens from McGuire (2011) (MVZ: 68521, 83519, 96735, 99283), so the final number of individuals analyzed per species is as follows: M. californicus (n=49), M. longicaudus (n=49), M. montanus (n=48), M. oregoni (n=50), M. townsendii (n=51). Each phase of landmark data acquisition (i.e., specimen presentation, specimen imaging, and inter/intraobserver digitization) was repeated to quantify error from those sources. Our study design for quantifying error from each source was as follows:

Imaging device - We assembled two datasets using specimen images obtained from two different cameras to evaluate inter-instrument variation (hereafter “imaging device” or simply “device” variation). The first image set included the original Microtus dentary images from McGuire (2011), photographed with a Nikon D70s (hereafter Nikon). The second image set included the same specimens photographed with a Dino-Lite Edge AM4815ZTL Digital Microscope (hereafter Dino-Lite). Efforts were made to replicate the original Nikon specimen orientations, especially projected angles of occlusal tooth surfaces and specimen distances from the camera lens, to minimize presentation error during this iteration. However, presentation error is necessarily a residual component of imaging device error in 2D systems no matter what measures are taken to control it.

Specimen presentation - After an initial Dino-Lite photograph was taken, each Microtus specimen was tilted haphazardly along its anteroposterior and/or labiolingual axis and re-photographed with all landmark loci still visible. This was done to simulate specimen orientation changes that may occur when comparing dissimilar specimens such as in situ teeth and isolated teeth. That scenario is not uncommon when comparing fossil specimens to recent specimens since complete preservation of fossilized craniodental remains is rare. When loose m1s were available from recent Microtus specimens, those teeth were photographed in isolation rather than in situ during this iteration. We note, however, that intentionally tilting specimens potentially exacerbates presentation error relative to the amount of error typically introduced when specimen orientations are standardized. The intent of this modification is to quantify potential presentation error rather than expected error since presentation error will vary by study (Fruciano, 2016).

Inter/intraobserver error - To quantify observer variation, the original Nikon Microtus m1 images and Dino-Lite resampled images were digitized by two observers using the 21-landmark protocol of Wallace (2006) and McGuire (2011). Using two observers also allowed us to evaluate the effect of methodological experience: one observer, hereafter referred to as the experienced observer (EO), had previous experience conducting 2D landmark analyses at the time this study was initiated, while the other observer, hereafter referred to as the new observer (NO), did not. Each image set was then digitized a second time by the EO and NO, with at least one week between iterations, to evaluate intraobserver variation in landmark placement.

Nine unique landmark datasets were assembled in total to evaluate measurement error from the four focal data acquisition sources. First, Nikon and Dino-Lite image sets were assembled to quantify imaging device variation. Those image sets were digitized twice by each observer to evaluate inter- and intraobserver error (two image sets x two observers x two digitizing iterations per observer = eight datasets). A “tilted” Dino-Lite image set was then assembled and digitized by the EO to quantify data variation due to changes in specimen presentation, resulting in a total of nine datasets. All image sets were assembled and digitized using TpsUtil 32 (Rohlf, 2018a) and TpsDig 2.32 (Rohlf, 2018b) software, respectively. Each landmark dataset was superimposed via Generalized Procrustes Analysis (GPA) to standardize effects of position, orientation, and scale among specimens using the gpagen function in the R package “geomorph” (version 3.1.3, Adams et al., 2019). During GPA, all specimens are translated to the origin, scaled to unit centroid size, and optimally rotated via a generalized least-squares algorithm to align them along a common coordinate system (Rohlf and Slice, 1990).
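The three GPA steps described above (translation, scaling to unit centroid size, least-squares rotation) can be illustrated with a minimal Python sketch. This is not the authors' code and not the geomorph implementation: the function names `center_and_scale`, `rotate_onto`, and `gpa` are hypothetical, the sketch handles only fixed 2D landmarks, and it omits refinements such as tangent-space projection. It represents each 2D landmark as a complex number, which makes the optimal rotation a single complex multiplication.

```python
def center_and_scale(config):
    """Translate a 2D landmark configuration to the origin and
    scale it to unit centroid size. Landmarks become complex numbers."""
    z = [complex(x, y) for x, y in config]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]
    centroid_size = sum(abs(p) ** 2 for p in z) ** 0.5
    return [p / centroid_size for p in z]


def rotate_onto(z, ref):
    """Rotate centered configuration z onto ref by the angle that
    minimizes the summed squared distances between matching landmarks."""
    s = sum(r * p.conjugate() for p, r in zip(z, ref))
    if abs(s) == 0:
        return list(z)  # degenerate case: no preferred rotation
    rotation = s / abs(s)  # unit complex number = pure rotation
    return [p * rotation for p in z]


def gpa(configs, max_iter=100, tol=1e-12):
    """Illustrative generalized Procrustes loop (assumed, simplified):
    align every specimen to the current mean shape, recompute the mean,
    and repeat until the mean stops changing."""
    shapes = [center_and_scale(c) for c in configs]
    mean = shapes[0]  # arbitrary starting reference
    for _ in range(max_iter):
        shapes = [rotate_onto(s, mean) for s in shapes]
        new_mean = [sum(col) / len(shapes) for col in zip(*shapes)]
        norm = sum(abs(p) ** 2 for p in new_mean) ** 0.5
        new_mean = [p / norm for p in new_mean]
        change = sum(abs(a - b) ** 2 for a, b in zip(new_mean, mean))
        mean = new_mean
        if change < tol:
            break
    aligned = [[(p.real, p.imag) for p in s] for s in shapes]
    return aligned, [(p.real, p.imag) for p in mean]
```

Passing a list of specimens, each a list of (x, y) landmark pairs in the same landmark order, returns the aligned coordinates and the mean shape. Two copies of the same configuration that differ only in position, size, and rotation should come out identical after alignment.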

To determine how source-specific measurement error impacts Microtus species classification, we ran linear discriminant analyses (LDA) on each of the nine GPA-transformed landmark datasets using the lda function in the R package “MASS” (version 7.3, Venables and Ripley, 2002). The forty-two x and y coordinates of the 21 digitized landmarks were used as predictor variables to classify each specimen into a predicted species group. We used leave-one-out cross-validation to determine the percentage of specimens correctly classified within their respective species groups since it reduces the overfitting of standard LDA (Kovarovic et al., 2011). Prior probabilities of group membership were assigned using the default lda argument, which is based on the proportion of samples in each group; in this case, the priors are nearly equal due to similar sample sizes among species. Linear discriminant analysis predicted group membership (PGM) error percentages were calculated for each landmark dataset by dividing the number of misclassified individuals across all five species by the total number of individuals (n=247) and multiplying by 100.
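The leave-one-out procedure and the PGM error percentage can be sketched as follows. This is an illustrative Python sketch, not the authors' R workflow. The classifier here is a simple nearest-centroid rule used as a stand-in for LDA, and all function names and the toy data are hypothetical.

```python
def nearest_centroid(train_X, train_y, x):
    """Stand-in classifier (NOT LDA): return the label whose group
    mean is closest to x in squared Euclidean distance."""
    groups = {}
    for row, label in zip(train_X, train_y):
        groups.setdefault(label, []).append(row)
    best_label, best_dist = None, float("inf")
    for label, rows in groups.items():
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_predictions(X, y, classify=nearest_centroid):
    """Predict each specimen from a model trained on all the others."""
    predictions = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        predictions.append(classify(train_X, train_y, X[i]))
    return predictions


def pgm_error_percent(true_labels, predicted_labels):
    """Misclassified specimens divided by all specimens, times 100."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and equal in length")
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * wrong / len(true_labels)


# Toy data (hypothetical, two variables per specimen instead of 42).
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [5.0, 5.0], [5.2, 4.9], [2.4, 2.4]]
y = ["A", "A", "A", "B", "B", "B"]
predicted = leave_one_out_predictions(X, y)
error = pgm_error_percent(y, predicted)  # one of six specimens is misclassified
```

Each specimen is classified by a model that never saw it, so the resulting error percentage avoids the optimism of classifying the training specimens with their own discriminant functions.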

Next, a set of 31 fossil Microtus m1 images of unknown species identity was digitized by the EO, using the same 21-landmark protocol, and appended to each dataset of recent Microtus specimens to evaluate error impacts on the PGM of unknown specimens. The fossil specimens were mostly isolated m1s and were photographed with the same Dino-Lite camera as the recent Microtus specimens. Each of the nine recent Microtus landmark datasets served as a unique discriminant function training set to classify the unknown fossils into one or more of the five extant species groups. All fossil specimens are from Project 23, Deposit 1, at Rancho La Brea in Los Angeles, CA and are late Pleistocene in age (~46,000 to ~31,000 radiocarbon years before present; Fox et al., 2019; Fuller et al., 2020). Given their geographic location and age, it is unlikely that the fossils belong to a species of Microtus other than the five included in our LDA training sets. Linear discriminant analyses were run on the landmark coordinate variables of each dataset with fossils entered as unknowns.
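Because each training set yields its own predicted group membership for every fossil, the replicates can be compared specimen by specimen. The sketch below shows one possible way to tabulate such comparisons in Python. The function names, dataset labels, and toy predictions are hypothetical and are not taken from the dataset.

```python
from itertools import combinations


def pairwise_disagreement(predictions):
    """predictions maps a training-set name to its list of predicted
    species, one entry per fossil, in the same fossil order.
    Returns the number of fossils classified differently by each pair."""
    names = sorted(predictions)
    if len({len(predictions[n]) for n in names}) != 1:
        raise ValueError("every training set must classify the same fossils")
    return {
        (a, b): sum(p != q for p, q in zip(predictions[a], predictions[b]))
        for a, b in combinations(names, 2)
    }


def unanimous_specimens(predictions):
    """Indices of fossils assigned to the same species by all training sets."""
    per_fossil = zip(*predictions.values())
    return [i for i, labels in enumerate(per_fossil) if len(set(labels)) == 1]


# Hypothetical predictions for three fossils from three training sets.
toy = {
    "Nikon_EO1": ["californicus", "longicaudus", "montanus"],
    "DinoLite_EO1": ["californicus", "montanus", "montanus"],
    "Tilted_EO": ["oregoni", "montanus", "montanus"],
}
disagreements = pairwise_disagreement(toy)
stable = unanimous_specimens(toy)
```

A pair of training sets with zero disagreements would have produced identical predicted group memberships for every fossil.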

References:

Adams, D.C., Collyer, M.L., Kaliontzopoulou, A., 2019. Geomorph: Software for geometric morphometric analyses. R package version 3.1.0.

Fox, N.S., Takeuchi, G.T., Farrell, A.B., Blois, J.L., 2019. A protocol for differentiating late Quaternary leporids in southern California with remarks on Project 23 lagomorphs at Rancho La Brea, Los Angeles, California, USA. PaleoBios 36, 1–20.

Fox, N.S., Veneracion, J.J., Blois, J.L. (in press). Are geometric morphometric analyses replicable? Evaluating landmark measurement error and its impact on extant and fossil Microtus classification. Ecology and Evolution.

Fruciano, C., 2016. Measurement error in geometric morphometrics. Development Genes and Evolution 226, 139–158. https://doi.org/10.1007/s00427-016-0537-4

Fuller, B.T., Southon, J.R., Fahrni, S.M., Farrell, A.B., Takeuchi, G.T., Nehlich, O., Guiry, E.J., Richards, M.P., Lindsey, E.L., Harris, J.M., 2020. Pleistocene paleoecology and feeding behavior of terrestrial vertebrates recorded in a pre-LGM asphaltic deposit at Rancho La Brea, California. Palaeogeography, Palaeoclimatology, Palaeoecology 537, 109383.

Kovarovic, K., Aiello, L.C., Cardini, A., Lockwood, C.A., 2011. Discriminant function analyses in archaeology: are classification rates too good to be true? Journal of Archaeological Science 38, 3006–3018. https://doi.org/10.1016/j.jas.2011.06.028

McGuire, J.L., 2011. Identifying California Microtus species using geometric morphometrics documents Quaternary geographic range contractions. Journal of Mammalogy 92, 1383–1394. https://doi.org/10.1644/10-MAMM-A-280.1

Rohlf, F.J., 2018a. TpsUtil version 1.76. Department of Ecology and Evolution, State University of New York at Stony Brook (software).

Rohlf, F.J., 2018b. TpsDig version 2.31. Department of Ecology and Evolution, State University of New York at Stony Brook (software).

Rohlf, F.J., Slice, D., 1990. Extensions of the Procrustes method for the optimal superimposition of landmarks. Systematic Zoology 39, 40–59. https://doi.org/10.2307/2992207

Venables, W.N., Ripley, B.D., 2002. Modern Applied Statistics with S, fourth ed. Springer, New York.

Wallace, S.C., 2006. Differentiating Microtus xanthognathus and Microtus pennsylvanicus Lower First Molars Using Discriminant Analysis of Landmark Data. Journal of Mammalogy 87, 1261–1269.
