Data from: Population genomic diversity and structure at the discontinuous southern range of the Great Gray Owl in North America

Cite this dataset

Mendelsohn, Beth; Ernest, Holly (2020). Data from: Population genomic diversity and structure at the discontinuous southern range of the Great Gray Owl in North America [Dataset]. Dryad.


Species' distributions are often discontinuous near the edge of the range where the environment may be more variable than the core of the range. Range discontinuity can reduce or cut off gene flow to small peripheral populations and lead to genetic drift and subsequent loss of genetic diversity. The southern extent of the Great Gray Owl (Strix nebulosa) range in North America is discontinuous, unlike their northern core range across the boreal forests. We sampled owls from five different locations on the periphery of the range across the western US (Wyoming, Idaho, California, northern Oregon, and southern Oregon) to investigate genetic population structure and genetic diversity. Using a reduced-representation genomic sequencing approach to genotype 123 individuals at 4,817 single nucleotide polymorphic loci, we identified four genetically differentiated populations: California, southern Oregon, northern Oregon, and Wyoming and Idaho grouped together as a single Rocky Mountain population. The four genetically differentiated populations of Great Gray Owls identified in this study display high differentiation and low genetic variation, which is suggestive of long-term isolation and lack of connectivity, potentially caused by range discontinuity. The populations that lack habitat connectivity to the rest of the breeding range (i.e. those in California and Oregon) had lower genetic diversity than the Rocky Mountain population that is connected to the core of the range. These factors and other risks (such as disease and human-caused mortality) heighten susceptibility of these range-edge populations to future habitat and climate changes, genetic diversity erosion, and potential extinction vortex. For these reasons, protecting and monitoring this species on the southern edge of their range is vital.


See: Mendelsohn B, Bedrosian B, Love Stowell SM, Gagne RB, LaCava MEF, Godwin BL, Hull JM, and Ernest HB. (2020) Population genomic diversity and structure at the discontinuous southern range of the Great Gray Owl in North America. Conservation Genetics.

Usage notes

File Descriptions:

individual_IDs.xlsx: assigned ID codes for each individual and the state where each sample was collected

sample_information.xlsx: location data (approximate) where each sample was collected, date collected, USGS band number (if banded), age, sex, collector, and identification codes

variants_4817snps_123inds.vcf: Reads from each individual were mapped to the de novo reference. We filtered for sites with minimum base and mapping quality scores (Q-score) of 20, kept a max-depth of 100 reads per site per individual, and omitted insertions and deletions. We called biallelic SNPs with a Q-score of 20 or higher, resulting in 222,753 sites. These sites were thinned to one SNP per 136 bp sequence read. In addition, we removed SNPs with a minor allele frequency less than 0.05, missing data in more than 25% of individuals, or a minimum read depth per site per individual less than 3.

assembly_GG_rep_comb_0.95.fa: synthetic reference created from our data by clustering sequences with the cd-hit-est program in CD-HIT. We used the sequences from all 123 individuals and a 0.95 sequence identity threshold, resulting in 1,099,773 unique contigs

FST_function.R: function in R to calculate Hudson's FST from allele frequencies for population differentiation

PCA_function.R: function in R to generate Principal Component Analysis from genotype point estimates

Script in Perl to demultiplex pooled libraries into individual samples by unique barcodes. This script allows and corrects for one mismatch in the barcode and removes the adapter sequences from the reads, leaving only genomic DNA sequences.

Script in Perl to create a separate fastq file for each individual's reads

Script in Perl to convert a vcf file to 'multiple population genotype likelihoods', outputting the probability that the individual at that site is a heterozygote and each of the homozygotes (3 likelihoods)

Script in Perl to convert genotype likelihoods to genotype point estimates, on a scale of 0-2. If all genotypes are equally likely, NA is returned
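
The final site filters described for variants_4817snps_123inds.vcf (minor allele frequency, missing data, and read depth) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `keep_site`, the input layout, and the choice to treat a genotype with depth below 3 as missing are assumptions made for illustration only.

```python
from typing import Optional, Sequence

# Thresholds taken from the file description.
MIN_MAF = 0.05        # minor allele frequency floor
MAX_MISSING = 0.25    # maximum fraction of individuals with missing data
MIN_DEPTH = 3         # minimum read depth for a genotype to count as called


def keep_site(genotypes: Sequence[Optional[int]], depths: Sequence[int]) -> bool:
    """Return True if a biallelic SNP passes the three filters.

    genotypes: alternate-allele count per individual (0, 1, 2), or None if uncalled.
    depths: read depth per individual at this site.
    Assumption: a genotype with depth below MIN_DEPTH is treated as missing.
    """
    n_individuals = len(genotypes)
    if n_individuals == 0:
        return False
    called = [g for g, d in zip(genotypes, depths)
              if g is not None and d >= MIN_DEPTH]
    missing_fraction = 1 - len(called) / n_individuals
    if missing_fraction > MAX_MISSING:
        return False
    # Alternate-allele frequency among called diploid genotypes.
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq) >= MIN_MAF
```

A site with four well-covered individuals and genotypes 0, 1, 2, 1 passes, while a monomorphic site or one with half of the individuals uncalled is dropped.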

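For readers unfamiliar with Hudson's FST, the quantity that FST_function.R is described as computing, the following Python sketch shows one common formulation: the sample-size-corrected estimator combined across SNPs as a ratio of averages. The function name `hudson_fst` and its arguments are hypothetical, and the authors' R function may differ in its correction terms or in how it combines loci.

```python
from typing import Sequence


def hudson_fst(p1: Sequence[float], p2: Sequence[float], n1: int, n2: int) -> float:
    """Hudson's FST between two populations from per-SNP allele frequencies.

    p1, p2: frequency of the same allele at each SNP in populations 1 and 2.
    n1, n2: number of sampled allele copies per population
            (assumed here to be 2 x individuals, and constant across SNPs).
    Numerators and denominators are summed over SNPs before dividing.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        denominator += a * (1 - b) + b * (1 - a)
    if denominator == 0:
        return float("nan")  # no variable sites to estimate from
    return numerator / denominator
```

Two populations fixed for different alleles give a value of 1. Populations with identical frequencies give a value at or slightly below 0 because of the sample-size correction.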

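The conversion from three genotype likelihoods to a point estimate on a 0-2 scale can be sketched as an expected allele count. The Python example below assumes Phred-scaled likelihoods in the order homozygous reference, heterozygote, homozygous alternate (as in the PL field of a VCF), and the name `genotype_point_estimate` is hypothetical. The authors' Perl script may use a different input format or rule.

```python
from typing import Optional


def genotype_point_estimate(pl_hom_ref: float, pl_het: float,
                            pl_hom_alt: float) -> Optional[float]:
    """Expected alternate-allele count (0-2) from Phred-scaled likelihoods.

    Returns None (standing in for NA) when all three genotypes are equally likely.
    """
    if pl_hom_ref == pl_het == pl_hom_alt:
        return None
    pls = (pl_hom_ref, pl_het, pl_hom_alt)
    best = min(pls)
    # Convert Phred scale to relative likelihoods; subtracting the minimum
    # leaves the ratios unchanged and keeps the largest weight at 1.
    weights = [10 ** (-(pl - best) / 10) for pl in pls]
    total = sum(weights)
    # Genotypes carry 0, 1, and 2 copies of the alternate allele.
    return (weights[1] + 2 * weights[2]) / total
```

A confidently called heterozygote returns a value near 1, and a confidently called alternate homozygote returns a value near 2. Less certain calls fall in between.
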
Funding

Wyoming Game and Fish Department State Wildlife Grant

Raptor Research Foundation

Meg and Burt Raynes Wildlife Fund

Wyoming Wildlife Foundation