Dryad

Data from: Local adaptation of Pinus leiophylla under climate and land use change models in the Avocado Belt of Michoacán

Cite this dataset

Izaguirre-Toriz, Vanessa; González-Rodríguez, Antonio (2024). Data from: Local adaptation of Pinus leiophylla under climate and land use change models in the Avocado Belt of Michoacán [Dataset]. Dryad. https://doi.org/10.5061/dryad.gxd2547t6

Abstract

Climate change and land use change are two main drivers of global biodiversity decline, decreasing the amount of genetic diversity that populations harbor and altering the patterns of local adaptation. Methods in landscape genomics allow measuring the effect of these anthropogenic disturbances on the adaptation of populations. However, both factors have rarely been considered simultaneously. We modeled the spatial turnover in allele frequencies of 19 localities of Pinus leiophylla across the Avocado Belt in Michoacán state, Mexico, and how it could change under climate change and land use change scenarios. We also evaluated assisted gene flow strategies and connectivity metrics across the landscape to identify priority conservation areas. We found that localities in the center-east region would be more vulnerable to climate change, while localities in the west would be more threatened by land use change. However, assisted gene flow actions could reduce their risk of extinction under both scenarios. Connectivity patterns will also be modified by future habitat loss, with the central and eastern parts having the highest connectivity values. These results show that the areas with the highest priority for conservation are in the eastern zones, which include the Monarch Butterfly Biosphere Reserve. This work is useful as a framework that incorporates distinct layers of information to provide a robust representation of the response of populations to future anthropogenic disturbances.

README: Data from: Local adaptation of Pinus leiophylla under climate and land use change models in the Avocado Belt of Michoacán

Pinus_leiophylla_landscape_genomics

https://doi.org/10.5061/dryad.gxd2547t6

This dataset is a .vcf file of 77 individuals from 19 Pinus leiophylla locations collected throughout the Avocado Belt in the state of Michoacán, México. We used the ddRAD-Seq protocol and sequenced the libraries using Illumina NovaSeq 6000.

Description of the data and file structure

ld_relaxed_90k.vcf.recode: This .vcf file contains 3660 SNP sites from 77 individuals from 19 Pinus leiophylla locations. At each location, we randomly chose 2-5 adult trees. The locations are labeled A-U. Locations "O" and "P" were excluded.
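A quick way to check a downloaded copy against the description above is to count the sample columns and variant records. The following is a minimal sketch that assumes a standard tab-delimited VCF layout (nine fixed columns followed by one column per sample); the function name `summarize_vcf` and the in-memory example are illustrative and are not part of the deposited scripts.

```python
from typing import Iterable, List, Tuple


def summarize_vcf(lines: Iterable[str]) -> Tuple[List[str], int]:
    """Return (sample names, number of variant records) from VCF text lines."""
    samples: List[str] = []
    n_sites = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        if line.startswith("#CHROM"):
            # Sample names follow the nine fixed VCF columns.
            samples = line.split("\t")[9:]
            continue
        n_sites += 1  # every remaining line is one variant record
    return samples, n_sites


# Tiny made-up example (not real data from the dataset).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tA1\tA2\tB1",
    "locus1\t10\t.\tA\tG\t.\t.\t.\tGT\t0/0\t0/1\t1/1",
    "locus2\t25\t.\tC\tT\t.\t.\t.\tGT\t0/1\t./.\t0/0",
]
example_samples, example_sites = summarize_vcf(example)
```

To apply it to the deposited file, pass an open file handle, for example `with open("ld_relaxed_90k.vcf.recode") as fh: samples, n_sites = summarize_vcf(fh)`. If the file matches the description, this should report 77 samples and 3660 sites.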

We used PLINK v1.9 (Purcell et al., 2007) and vcftools v0.1.16 (Danecek et al., 2011) to retain diallelic SNPs that had minor allele frequencies (MAF) ≥ 0.025 and less than 20% missing data. We also removed SNPs in linkage disequilibrium (LD threshold of 0.5) within a window size of 50 bp and a window shift of 5 (Purcell et al., 2007). We used HDplot (McKinney et al., 2017) to remove possible paralogs. We removed all regions that had observed heterozygosities (Ho) greater than 0.5 and a D value outside the range of -15 to 15, which could indicate potentially duplicated loci due to deviation from expected allelic ratios. After these filtering procedures we obtained the 3660 SNPs mentioned above.
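The MAF and missing-data thresholds can be illustrated on a single SNP. This is a minimal sketch of the filtering logic only, not the PLINK or vcftools commands that were actually run; it assumes genotypes are coded as alternate-allele counts (0, 1, 2) with `None` for missing calls, and the name `passes_filters` is hypothetical.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # alternate-allele count (0, 1 or 2); None = missing


def passes_filters(
    genotypes: Sequence[Genotype],
    min_maf: float = 0.025,
    max_missing: float = 0.20,
) -> bool:
    """Keep a SNP if MAF >= min_maf and the missing fraction is < max_missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return False  # no usable genotype calls at this SNP
    missing_rate = (len(genotypes) - len(called)) / len(genotypes)
    if missing_rate >= max_missing:
        return False
    # Each diploid individual contributes two allele copies.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    return maf >= min_maf
```

For example, a SNP with one heterozygote among ten called individuals has a MAF of 0.05 and is retained, whereas a monomorphic SNP or one with three of ten genotypes missing is dropped.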

Code/Software

Genetic_diversity.R: Script to obtain the per-SNP and mean observed heterozygosity (Ho), gene diversity (Hs), the inbreeding coefficient (FIS) per locality, the number of private alleles, and the genetic differentiation among localities using the Weir and Cockerham FST.
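To make the per-SNP quantities concrete, the sketch below computes observed heterozygosity, an uncorrected expected heterozygosity (2pq) and the resulting inbreeding coefficient for one diallelic SNP in one locality. It is a simplified illustration written in Python, not a translation of Genetic_diversity.R; the estimators used in that script (for example, sample-size corrections for gene diversity) may differ, and the name `locus_stats` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def locus_stats(
    genotypes: Sequence[Optional[int]],
) -> Tuple[float, float, Optional[float]]:
    """Return (Ho, He, FIS) for one diallelic SNP.

    Genotypes are alternate-allele counts (0, 1, 2); None marks missing calls.
    He is the uncorrected expectation 2pq, an assumption made for simplicity.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        raise ValueError("no called genotypes at this SNP")
    ho = sum(1 for g in called if g == 1) / n  # fraction of heterozygotes
    p = sum(called) / (2 * n)                  # alternate-allele frequency
    he = 2 * p * (1 - p)                       # expected heterozygosity
    fis = None if he == 0 else 1 - ho / he     # undefined for monomorphic SNPs
    return ho, he, fis
```

With genotypes `[0, 1, 1, 2]`, half of the individuals are heterozygous and the allele frequency is 0.5, so Ho and He are both 0.5 and FIS is 0.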

LFMM.R: Script to perform the LFMM analysis

GF.R: Script to perform the GF and genomic offsets analysis

Methods

Our sampling was conducted in the “Avocado Belt”, in the state of Michoacán, México. We sampled 77 individuals from 19 populations of Pinus leiophylla. Genomic libraries were created using the ddRAD-Seq method. 

Briefly, genomic DNA of each sample was digested using the restriction enzymes PstI and MseI, and an adapter was ligated to the DNA fragments. The adapter-ligated fragments were sequenced using Illumina NovaSeq 6000. The sequenced samples were demultiplexed and trimmed for further analysis.

A de novo RAD reference genome was constructed using the individual that had the highest number of unique RAD sequences. Next, we used custom scripts to cluster identical sequences. The assembly for the reference individual was realigned against itself using BWA (Li & Durbin, 2009). BOWTIE (Langmead et al., 2009) was then used to align the reads of each individual to the RAD reference genome. SAMTOOLS (Li et al., 2009) and custom scripts were used to detect and filter SNPs that had a minimum sequencing depth per sample of 15x, per-individual, per-locus genotype quality scores of at least 20 and a minimum of 10x sequence coverage.

Subsequently, we used PLINK v1.9 (Purcell et al., 2007) and vcftools v0.1.16 (Danecek et al., 2011) to retain diallelic SNPs that were in Hardy-Weinberg equilibrium, had minor allele frequencies (MAF) ≥ 0.025 and had less than 20% missing data. We also removed SNPs in linkage disequilibrium (LD threshold of 0.5) within a window size of 50 bp and a window shift of 5 (Purcell et al., 2007). We used HDplot (McKinney et al., 2017) to remove possible paralogs. We removed all regions that had observed heterozygosities (Ho) greater than 0.5 and a D value outside the range of -15 to 15, which could indicate potentially duplicated loci due to deviation from expected allelic ratios. After these filtering procedures we obtained 3660 SNPs.
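The two paralog statistics used above can be sketched for a single SNP as follows. This assumes the usual HDplot definitions: H is the proportion of heterozygous individuals among those genotyped, and D is a z-score for the deviation of allele read counts in heterozygotes from the expected 1:1 ratio. It is an illustrative Python sketch, not the HDplot code itself; `hdplot_stats` and its inputs (per-individual reference and alternate read depths) are hypothetical names.

```python
import math
from typing import Optional, Sequence, Tuple


def hdplot_stats(
    genotypes: Sequence[Optional[int]],
    ref_depths: Sequence[int],
    alt_depths: Sequence[int],
) -> Tuple[float, Optional[float]]:
    """Return (H, D) for one SNP.

    genotypes: alternate-allele counts (0, 1, 2); None marks missing calls.
    ref_depths / alt_depths: read counts per individual for each allele.
    """
    n_called = sum(1 for g in genotypes if g is not None)
    if n_called == 0:
        raise ValueError("no called genotypes at this SNP")
    het_idx = [i for i, g in enumerate(genotypes) if g == 1]
    h = len(het_idx) / n_called

    # Pool reads across heterozygotes only; a single-copy locus should
    # show roughly equal read counts for the two alleles.
    ref_reads = sum(ref_depths[i] for i in het_idx)
    total = ref_reads + sum(alt_depths[i] for i in het_idx)
    if total == 0:
        return h, None  # D is undefined without heterozygote reads

    # Binomial z-score with success probability 0.5.
    d = (ref_reads - total / 2) / math.sqrt(total * 0.5 * 0.5)
    return h, d
```

Under the thresholds stated above, a SNP would then be inspected for H greater than 0.5 and for D below -15 or above 15.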