R code and example data for using genogeographic clustering approach

Citation

Arranz, Vanessa (2022), R code and example data for using genogeographic clustering approach, Dryad, Dataset, https://doi.org/10.5061/dryad.b2rbnzsbr

Abstract

While in recent years there have been considerable advances in discerning spatial genetic patterns within species, the task of identifying common patterns across species is still challenging. Approaches using new data from co-sampled species permit rigorous statistical analysis but are often limited to a small number of species; meta-analyses of published data can encompass a much broader range of species, but are usually restricted by uneven data properties. There is a need for new approaches that bring greater statistical rigour to meta-analyses, and are also able to discern more than a single spatial pattern among species.

We propose a new approach for comparative multi-species meta-analyses of published population genetic data that addresses many existing limitations. This analysis takes a three-stage approach: (i) use common genetic metrics to measure location-specific diversity across the sampled range of each species, (ii) use an innovative graphing technique to describe spatial patterns within each species, and (iii) quantitatively cluster species by their similarity in pattern. We apply this technique to 21 species of intertidal invertebrate from the New Zealand coastline, to resolve common spatial patterns from disparate profiles of genetic diversity.

The genogeographic curves are shown to successfully capture the known spatial patterns within each intertidal species, and readily permit statistical comparison of those patterns, regardless of sampling and marker inconsistencies. The species clustering technique is shown to discern groups of species that clearly share spatial patterns within groups but differ significantly among groups. The species groups defined were not identifiable a priori from their taxonomy or life history, but their spatial genetic patterns appear biologically relevant.

Genogeographic species clustering provides a novel approach to discerning multiple common spatial patterns of diversity among a large number of species. It will permit more rigorous comparative studies from diverse published data, and can be easily extended to a wide variety of alternative measures of genetic diversity or divergence. We see the approach best used as an exploratory method, to uncover the patterns often hidden in multi-species communities, likely to be followed by more targeted model-testing analyses.

Methods

This R code and example data file are provided for applying the genogeographic clustering approach to discern multiple common spatial patterns of diversity among a large number of species. It will permit more rigorous comparative studies from diverse published data, and can be easily extended to a wide variety of alternative measures of genetic diversity or divergence. We see the approach best used as an exploratory method, to uncover the patterns often hidden in multi-species communities, likely to be followed by more targeted model-testing analyses.

The dataset was built from previously published single-species population genetics studies. For each species, a genetic diversity measure was computed within each geographic location. The measure of genetic diversity used depended on the genetic marker available: haplotype diversity “H” was used for mitochondrial DNA, and the analogous allelic expected heterozygosity “He” for nuclear DNA microsatellites. Collectively, these values are henceforth referred to simply as allelic diversity, “H”. Values were either taken directly from the publications used, or calculated with ARLEQUIN v3.5 (Excoffier & Lischer 2010) and GenAlEx (Peakall & Smouse 2012) from the genetic data associated with the publication. Adjustments for small sample sizes were included in the calculation of all parameters, using unbiased estimators (Nei 1972).
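
As an illustration of the location-specific diversity measure described above, the following is a minimal sketch in Python (not the R code distributed with this dataset). It assumes the widely used sample-size-corrected form of gene diversity, H = n/(n-1) × (1 - Σ p_i²), where n is the number of sampled gene copies at a location and p_i is the frequency of the i-th haplotype or allele. The function name `unbiased_diversity` and the example site names and haplotype labels are hypothetical and chosen only for demonstration.

```python
from collections import Counter


def unbiased_diversity(alleles):
    """Sample-size-corrected gene diversity for one location.

    `alleles` is a sequence of haplotype (or allele) labels, one entry
    per sampled gene copy. Assumes the correction factor n / (n - 1).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled gene copies")
    counts = Counter(alleles)
    # Sum of squared haplotype/allele frequencies at this location.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical example data: haplotype labels sampled at two locations.
sites = {
    "north": ["A", "A", "B", "C"],
    "south": ["A", "A", "A", "A"],
}

# One diversity value per location, as in the per-species profiles.
diversity_by_site = {site: unbiased_diversity(h) for site, h in sites.items()}
```

A location where every sampled copy carries the same haplotype yields a diversity of zero, and diversity rises as haplotypes become more numerous and more evenly represented. Values reported in publications or produced by ARLEQUIN or GenAlEx may differ in detail, for example in how diploid sample sizes are counted.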

Usage Notes

This dataset contains all the R code and instructions needed to perform the analyses described in the following paper: Arranz, Vanessa, Rachel M. Fewster, and Shane D. Lavery. "Genogeographic clustering to identify cross‐species concordance of spatial genetic patterns." Diversity and Distributions (2022). https://doi.org/10.1111/ddi.13474

Information and instructions can be found in the README.txt file.

Funding

Beate Schular Doctoral Scholarship