Data from: Spatially structured statistical network models for landscape genetics

Peterson, Erin E.; Hanks, Ephraim M.; Hooten, Mevin B.; Ver Hoef, Jay M.; Fortin, Marie-Josée

Published Dec 12, 2018 on Dryad. https://doi.org/10.5061/dryad.m0h05rt

Data files

Dec 12, 2018 version files (229.44 KB total):

AppendixS2.pdf: 180.70 KB
DataS1.zip: 45.03 KB
DataS2.zip: 3.71 KB

Abstract

A basic understanding of how the landscape impedes, or creates resistance to, the dispersal of organisms and hence gene flow is paramount for successful conservation science and management. Spatially structured ecological networks are often used to represent spatial landscape-genetic relationships, where nodes represent individuals or populations and resistance to movement is represented using non-binary edge weights. Weights are typically assigned or estimated by the user, rather than observed, and validating such weights is challenging. We provide a synthesis of current methods used to estimate edge weights and an overview of common model types, stressing the advantages and disadvantages of each approach and their ability to model landscape-genetic data. We further explore a set of spatial-statistical methods that provide ecologists with alternative approaches for modeling spatially explicit processes that may affect genetic structure. This includes an overview of spatial autoregressive models, with a particular focus on how correlation and partial correlation are used to represent neighborhood structure with the inverse of the covariance matrix (i.e., precision matrix). We then demonstrate how to model resistance by specifying an appropriate statistical model on the nodes, conditioned on the edge weights, through the precision matrix. This integration of network ecology and spatial statistics provides a practical analytical framework for landscape-genetic studies. The results can be used to make statistical inferences about the relative importance of individual landscape characteristics, such as the vegetative cover, hillslope, or the presence of roads or rivers, on gene flow. 
In addition, the R code we include allows readers to explore landscape-genetic structure in their own datasets, which will potentially provide new insights into the evolutionary processes that generated ecological networks, as well as valuable information about the optimal characteristics of conservation corridors.

DataS1: Simulated data and custom modelling functions

The two files in DataS1 are used within Appendix S2 to create the example in the manuscript.

1) modelling_functions.R: Custom functions used to estimate edge weights within an intrinsic conditional autoregressive (ICAR) model by incorporating resistance covariates into the off-diagonal elements of the precision matrix. These functions are used with the rwc package for R.

2) SimData.Rdata: Three simulated genetic datasets based on resistance matrices generated under the isolation by distance (popsIBD), isolation by resistance (popsIBR), and isolation by barrier (popsIBB) hypotheses. The data were simulated using the PopGenReport package. Each simulation was based on 20 loci with 20 alleles, a sex ratio of 0.5, a migration rate of 0.05, a dispersal rate of 0.05, and a mutation rate of 0.001. The maximum dispersal distance (disp.max), however, varied among simulations. The simulations were run for 400 generations, and the carrying capacity of each of the 30 sub-populations was set to 15.

The R code used to generate these datasets and to fit the models is provided in Appendix S2.

DataS1.zip

DataS2: Fitted models

The file in DataS2 is used within Appendix S2 to create the example in the manuscript.

1) DataS2.Rdata: This file contains 9 ICAR models fit to simulated genetic distance matrices and resistance matrices using a generalized Wishart distribution. The simulated genetic distance matrices and resistance matrices were based on the isolation by distance (IBD), isolation by resistance (IBR), and isolation by barrier (IBB) hypotheses, giving 3 simulated genetic distances x 3 resistance types = 9 models.

The naming convention for the models is:

• IBD.IBD = IBD genetic distance matrix and IBD resistance matrix
• IBD.IBR = IBD genetic distance matrix and IBR resistance matrix
• IBD.IBB = IBD genetic distance matrix and IBB resistance matrix
• IBR.IBD = IBR genetic distance matrix and IBD resistance matrix
• IBR.IBR = IBR genetic distance matrix and IBR resistance matrix
• IBR.IBB = IBR genetic distance matrix and IBB resistance matrix
• IBB.IBD = IBB genetic distance matrix and IBD resistance matrix
• IBB.IBR = IBB genetic distance matrix and IBR resistance matrix
• IBB.IBB = IBB genetic distance matrix and IBB resistance matrix

The R code used to fit the models is provided in Appendix S2.
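The naming convention can be reproduced programmatically, which may help when looping over the fitted models. The following is a minimal Python sketch of the convention only; the variable names are hypothetical, and it assumes the names of the objects stored in DataS2.Rdata match these strings exactly, which should be checked against the file.

```python
from itertools import product

# The three hypotheses used for both the genetic distance
# matrices and the resistance matrices.
hypotheses = ["IBD", "IBR", "IBB"]

# First component: genetic distance matrix; second: resistance matrix.
model_names = [f"{genetic}.{resistance}"
               for genetic, resistance in product(hypotheses, repeat=2)]

for name in model_names:
    genetic, resistance = name.split(".")
    print(f"{name}: {genetic} genetic distance, {resistance} resistance")
```

The models in which both components match (IBD.IBD, IBR.IBR, IBB.IBB) are those where the resistance matrix follows the same hypothesis used to simulate the genetic data.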

DataS2.zip

AppendixS2: Modelling tutorial

In this example we demonstrate how edge weights can be estimated within an intrinsic conditional autoregressive (ICAR) model by incorporating resistance covariates into the off-diagonal elements of the precision matrix. The example is based on genetic simulations generated using the PopGenReport package and implemented using R statistical software version 3.4.4. Resistance distances are generated using the gdistance package and the models are fit using the rwc package, all of which are available on the CRAN website (https://cran.r-project.org/). R code is provided in AppendixS2.pdf and in DataS1.zip to make the methods more accessible to readers, who can recreate the example provided in the manuscript and apply these methods to their own landscape-genetic datasets.
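To illustrate what "incorporating resistance covariates into the off-diagonal elements of the precision matrix" can look like, here is a minimal Python sketch on a toy three-node network. It is not the code from Appendix S2 or the rwc package: the function name, the toy covariate values, and the log-linear link w_ij = exp(x_ij · beta) are assumptions made for illustration. The sketch builds an ICAR-style precision matrix Q with Q_ij = -w_ij for connected nodes and Q_ii equal to the sum of the weights on edges touching node i.

```python
import math

def icar_precision(n_nodes, edges, covariates, beta):
    """Build an ICAR-style precision matrix from edge-level covariates.

    edges:      list of (i, j) node index pairs (undirected edges)
    covariates: one covariate vector per edge (hypothetical values)
    beta:       coefficient vector (hypothetical values)
    Assumed link: w_ij = exp(sum_k x_ijk * beta_k), so weights stay positive.
    """
    Q = [[0.0] * n_nodes for _ in range(n_nodes)]
    for (i, j), x in zip(edges, covariates):
        w = math.exp(sum(xk * bk for xk, bk in zip(x, beta)))
        Q[i][j] -= w   # off-diagonal: negative edge weight
        Q[j][i] -= w   # keep Q symmetric
        Q[i][i] += w   # diagonal: sum of weights on incident edges
        Q[j][j] += w
    return Q

# Toy network: nodes 0-1-2 on a line, one covariate per edge
# (e.g. 1.0 if the edge crosses a barrier, 0.0 otherwise).
edges = [(0, 1), (1, 2)]
covariates = [[0.0], [1.0]]
beta = [-math.log(2.0)]  # assumed: the barrier halves the edge weight

Q = icar_precision(3, edges, covariates, beta)
for row in Q:
    print([round(v, 3) for v in row])
```

Each row of Q sums to zero, which is the property that makes the model intrinsic (the precision matrix is singular). Under this assumed link, a negative coefficient lowers the weight of edges that cross the barrier, which corresponds to higher resistance along those edges.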

AppendixS2.pdf