Data from: Population genomic diversity and structure at the discontinuous southern range of the Great Gray Owl in North America

Citation

Mendelsohn, Beth; Ernest, Holly (2020), Data from: Population genomic diversity and structure at the discontinuous southern range of the Great Gray Owl in North America, Dryad, Dataset, https://doi.org/10.5061/dryad.1rn8pk0qm

Abstract

Species' distributions are often discontinuous near the edge of the range, where the environment may be more variable than in the core of the range. Range discontinuity can reduce or cut off gene flow to small peripheral populations and lead to genetic drift and subsequent loss of genetic diversity. The southern extent of the Great Gray Owl (Strix nebulosa) range in North America is discontinuous, unlike the species' northern core range across the boreal forests. We sampled owls from five different locations on the periphery of the range across the western US (Wyoming, Idaho, California, northern Oregon, and southern Oregon) to investigate genetic population structure and genetic diversity. Using a reduced-representation genomic sequencing approach to genotype 123 individuals at 4,817 single nucleotide polymorphic loci, we identified four genetically differentiated populations: California, southern Oregon, northern Oregon, and Wyoming and Idaho grouped together as a single Rocky Mountain population. The four genetically differentiated populations of Great Gray Owls identified in this study display high differentiation and low genetic variation, which is suggestive of long-term isolation and lack of connectivity, potentially caused by range discontinuity. The populations that lack habitat connectivity to the rest of the breeding range (i.e., those in California and Oregon) had lower genetic diversity than the Rocky Mountain population, which is connected to the core of the range. These factors and other risks (such as disease and human-caused mortality) heighten the susceptibility of these range-edge populations to future habitat and climate changes, erosion of genetic diversity, and a potential extinction vortex. For these reasons, protecting and monitoring this species on the southern edge of its range is vital.

Methods

See: Mendelsohn B, Bedrosian B, Love Stowell SM, Gagne RB, LaCava MEF, Godwin BL, Hull JM, and Ernest HB. (2020) Population genomic diversity and structure at the discontinuous southern range of the Great Gray Owl in North America. Conservation Genetics.

Usage Notes

File Descriptions:

individual_IDs.xlsx: assigned ID codes for each individual and the state where each sample was collected

sample_information.xlsx: location data (approximate) where each sample was collected, date collected, USGS band number (if banded), age, sex, collector, and identification codes

variants_4817snps_123inds.vcf: Reads from each individual were mapped to the de novo reference. We filtered for sites with minimum base and mapping quality scores (Q-score) of 20, kept a max-depth of 100 reads per site per individual, and omitted insertions and deletions. We called biallelic SNPs with a Q-score of 20 or higher, resulting in 222,753 sites. These sites were thinned to one SNP per 136 bp sequence read. In addition, we removed SNPs with a minor allele frequency less than 0.05, missing data in more than 25% of individuals, or a minimum read depth per site per individual of less than 3.

assembly_GG_rep_comb_0.95.fa: synthetic reference from our data created with the cd-hit-est package in CD-HIT to cluster sequences. We used the sequences from all 123 individuals and a 0.95 sequence identity threshold, resulting in 1,099,773 unique contigs

FST_function.R: function in R to calculate Hudson's FST from allele frequencies for population differentiation
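
For readers unfamiliar with the estimator, the following Python sketch shows one common form of Hudson's FST computed from allele frequencies, combined across loci as a ratio of sums. It is not a translation of FST_function.R: the function name `hudson_fst`, the sample-size correction terms, and the interpretation of `n1` and `n2` as the number of allele copies sampled in each population are assumptions.

```python
import math


def hudson_fst(p1, p2, n1, n2):
    """Hudson's FST for two populations, as a ratio of sums across loci.

    p1, p2: per-locus allele frequencies in population 1 and population 2.
    n1, n2: number of allele copies sampled in each population (assumed
            constant across loci in this sketch).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference minus within-population sampling terms.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        denominator += a * (1 - b) + b * (1 - a)
    if denominator == 0:
        return math.nan  # no informative loci
    return numerator / denominator
```

Summing numerators and denominators before dividing keeps loci with very low diversity from dominating the estimate, which is one reason this form is often preferred over averaging per-locus ratios.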

PCA_function.R: function in R to generate Principal Component Analysis from genotype point estimates

parse_barcodes.pl: Perl script to demultiplex pooled libraries into individual samples by unique barcodes. This script allows and corrects for one mismatch in the barcode and removes the adapter sequences from the reads, leaving only genomic DNA sequences.
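
The idea of tolerating a single barcode mismatch can be sketched in Python as follows. This is an illustration only, not the logic of parse_barcodes.pl: the function names, the barcode sequences, and the sample labels are hypothetical, all barcodes are assumed to have the same length and to sit at the start of the read, and adapter removal is not shown.

```python
def hamming(a, b):
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_read(read, barcodes, max_mismatches=1):
    """Assign a read to a sample by its leading barcode.

    barcodes: dict mapping barcode sequence -> sample name (hypothetical).
    Returns (sample_name, read_without_barcode), or None when no barcode,
    or more than one barcode, lies within max_mismatches of the read prefix.
    """
    length = len(next(iter(barcodes)))
    prefix = read[:length]
    if len(prefix) < length:
        return None
    matches = [bc for bc in barcodes if hamming(prefix, bc) <= max_mismatches]
    if len(matches) != 1:
        return None  # unassignable or ambiguous
    return barcodes[matches[0]], read[length:]
```

Rejecting reads that match more than one barcode avoids assigning a read to the wrong individual when two barcodes differ at only a few positions.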

splitfastq.pl: Perl script to create a separate fastq file for each individual's reads

vcf2mpgl.pl: Perl script to convert a vcf file to 'multiple population genotype likelihoods', outputting, for each individual at each site, the likelihoods of the heterozygote and of each of the two homozygotes (3 likelihoods)

gl2genest.pl: Perl script to convert genotype likelihoods to genotype point estimates, on a scale of 0-2. If all genotypes are equally likely, NA is returned
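
One way to turn three genotype likelihoods into a point estimate on a 0-2 scale is to take the expected alternate-allele count, as in the Python sketch below. This is not a port of gl2genest.pl: the function name is hypothetical, the likelihoods are assumed to be phred-scaled and ordered as homozygous reference, heterozygote, homozygous alternate (as in the VCF PL field), and a uniform prior over genotypes is assumed. Python's `None` stands in for NA.

```python
def genotype_point_estimate(phred_likelihoods):
    """Expected alternate-allele count (0-2) from three phred-scaled likelihoods.

    phred_likelihoods: (hom_ref, het, hom_alt), where smaller values mean
    more likely genotypes. Returns None when all three are equal, since the
    data then carry no information about the genotype.
    """
    if len(set(phred_likelihoods)) == 1:
        return None
    best = min(phred_likelihoods)
    # Convert phred scale to relative likelihoods; subtracting the minimum
    # keeps the largest weight at 1 and avoids underflow.
    weights = [10 ** (-(pl - best) / 10) for pl in phred_likelihoods]
    total = sum(weights)
    # Weighted mean of the allele counts 0, 1, and 2.
    return sum(count * w for count, w in enumerate(weights)) / total
```

A confidently called heterozygote yields a value close to 1, while an uncertain call between two genotypes yields an intermediate value, so the estimate carries genotype uncertainty forward into downstream analyses.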

Funding

Wyoming Game and Fish Department State Wildlife Grant

Raptor Research Foundation

Meg and Burt Raynes Wildlife Fund

Wyoming Wildlife Foundation