Skip to main content

Pleistocene divergence in the absence of gene flow among populations of a viviparous reptile with intraspecific variation in sex determination

Cite this dataset

Hill, Peta; Wapstra, Erik; Ezaz, Tariq; Burridge, Christopher (2021). Pleistocene divergence in the absence of gene flow among populations of a viviparous reptile with intraspecific variation in sex determination [Dataset]. Dryad.


Polymorphisms can lead to speciation if there is differential mating success among conspecifics divergent for a trait. Polymorphism for sex determining system might be particularly expected to isolate gene pools, given strong selection for the production of viable males and females and the low success of heterogametic hybrids when sex chromosomes differ (Haldane’s rule). We investigated this question using a rare example of a species exhibiting polymorphism for sex determination: the viviparous snow skink Carinascincus ocellatus. While a coparatively high elevation population has entirely genotypic sex determination, in a lower elevation population there is an additional environmental component to sex determination. These systems also exhibit minor differences in sex-linked genotypes. Using ‘Isolation with Migration’ analysis of neutral loci, we estimated that these populations and their sex determining systems diverged in the absence of gene flow, across multiple periods of geographic proximity during Pleistocene glaciations. Our analysis suggests that populations of C. ocellatus with divergent sex determining systems are likely reproductively isolated, even though they are presently underlined by only subtle DNA differences. Given the influence of temperature on sex in one lineage, we also discuss the implications for the persistence of this polymorphism under climate change.


The mitochondrial sequences (NADH2 and NADH4) were obtained from Cliff et al. (2015). Neutral autosomal SNPs were derived from the dataset of Hill et al. (2018), obtained using a high-throughput sequencing approach (Kilian et al. 2012). All sex-linked SNPs from this dataset were removed for this analysis. Secondaries (additional SNPs on the same fragment) were removed from remaining loci using custom R script (R Development Core Team 2017); the SNP with the highest reproducibility and polymorphic information content from each locus was retained. SNP genotypes with an average reproducibility < 0.5, a call rate of < 0.9 and loci monomorphic within populations were also removed using the dartR package (Gruber and Georges 2019) in R. SNPs putatively under selection (Fst in the 5th percentile) and those not in Hardy-Weinberg Equilibrium (HWE; p < 0.05) in either population were filtered from the data using Genepop (Rousset 2008). From the remaining loci, a set of 100 SNPs were chosen at random; linkage amongst these loci was ruled out using the ‘genetics’ package (Warnes et al. 2013) in R. 

The level of gene flow accompanying divergence of GSD and GSD+EE populations, along with their duration of divergence, was assessed under the “Isolation with Migration” Bayesian framework of Hey and Nielsen (2004), employing IMa3 (Hey et al. 2018. Figure 2). Mitochondrial sequences and neutral nuclear SNPs were analysed concurrently to estimate lineage-splitting time and rates of gene flow between lineages in each direction. The HKY mutation model (Hasegawa et al. 1985) was employed for mtDNA sequence data, while the infinite sites model (Kimura 1969) was employed for nuclear SNPs. Uniform priors were employed for divergence time and population size parameters, while exponential priors were employed for gene flow, given an expectation that low rates of were likely (mean of prior distribution 6 x10-06, approximating one individual per generation). Upper limits on uniform priors were initially set broadly, and then based on inspection of posterior distributions, were narrowed in a subsequent run to encompass the range of this posterior plus a margin of error; overly-large priors reduce the precision of estimates given the use of a finite number of bins to represent the posterior distribution.

Isolation with Migration analysis was performed using Markov Chain Monte Carlo sampling with 112 chains distributed across 14 processors, and a geometric chain heating scheme with first and second heating parameters of 0.95 and 0.50, respectively. To reduce overall run-time, an initial analysis was run until stationarity of the sampling distribution was achieved, and this was then used to seed four simultaneous analyses, each run for 24-hours following a 10 min burnin and using unique random number seeds, to ensure independence among runs. In total, 111,643 genealogies were retained for estimation of model parameters.Information on mutation rate was employed to scale output into units of years (divergence time, gene flow).

Mitochondrial mutation rates were employed in the analysis, against which mutation rates at the nuclear loci would be scaled. We followed the rate estimate of 1.52% divergence per million years from Cliff et al. (2015). To account for potential variation in mutation rate (Ansari et al. 2019; Ho et al. 2005; Ho et al. 2007), we explored the consequences of using a faster rate of 2.3% divergence per million years, reported from Canary Islands skinks (Brown and Pestano 1998). Faster rates may be more applicable to reconstructing demographic history over recent (<100,000 yr) timescales (Burridge et al. 2008). Failure to entertain time-dependent rates of molecular change will lead to overestimation of divergence time and underestimation of gene flow (Burridge et al. 2008).


Equity Trustees

Australia and Pacific Science Foundation