Whole genome analyses disentangle reticulate evolution of primroses in a biodiversity hotspot

Cite this dataset

Stubbs, Rebecca; Conti, Elena; Theodoridis, Spyros (2022). Whole genome analyses disentangle reticulate evolution of primroses in a biodiversity hotspot [Dataset]. Dryad.


1. Biodiversity hotspots, such as the Caucasus mountains, provide unprecedented opportunities for understanding the evolutionary processes that shape species diversity and richness. Therefore, we investigated the evolution of Primula sect. Primula, a clade with a high degree of endemism in the Caucasus.

2. We performed phylogenetic and network analyses of whole-genome resequencing data from the entire nuclear genome, the entire chloroplast genome, and the entire heterostyly supergene. The different characteristics of the genomic partitions and the resulting phylogenetic incongruences enabled us to disentangle evolutionary histories resulting from tokogenetic versus cladogenetic processes. We provide the first phylogeny inferred from the heterostyly supergene that includes all species of Primula sect. Primula.

3. Our results identified recurrent admixture at deep nodes between lineages in the Caucasus as the cause of non-monophyly in Primula. Biogeographic analyses support the “out-of-the-Caucasus” hypothesis, emphasizing the importance of this hotspot as a cradle for biodiversity.

4. Our findings provide novel insights into causal processes of phylogenetic discordance, demonstrating that genome-wide analyses from partitions with contrasting genetic characteristics and broad geographic sampling are crucial for disentangling the diversification of species-rich clades in biodiversity hotspots.


Tree files and alignments (FASTA files) for the Primula sect. Primula phylogeny, including complete nuclear, plastid, and S-locus alignments.

All FASTA files were created from filtered VCFs. Variant calling was performed with the BCFtools v.1.8 multiallelic caller, set to output only variant sites. Repetitive sequence regions were masked based on the annotation from Potente et al. (2022). Filtering was performed with VCFtools v.0.1.14, set to remove indels and retain only single nucleotide polymorphisms (SNPs) occurring in at least 80% of individuals, with a minimum site quality of 30 and a mean site depth between 5x and 60x. All FASTA files were output with heterozygous sites randomly resolved.
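
The filtering thresholds and the random resolution of heterozygous sites described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used BCFtools and VCFtools; the `Site` record, the function names `passes_filters` and `resolve_site`, and the choice to write missing genotypes as `N` are all hypothetical and exist only for illustration. The sketch assumes that "occurring in at least 80% of individuals" means a genotype call is present for at least 80% of samples.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A genotype is a pair of allele indices (0 = REF, 1+ = ALT), or None if missing.
Genotype = Optional[Tuple[int, int]]


@dataclass
class Site:
    """Hypothetical in-memory record for one VCF site (illustration only)."""
    ref: str
    alts: List[str]
    qual: float
    depths: List[int]          # read depth per individual
    genotypes: List[Genotype]  # one genotype per individual


def passes_filters(site: Site,
                   min_qual: float = 30.0,
                   min_mean_depth: float = 5.0,
                   max_mean_depth: float = 60.0,
                   min_call_rate: float = 0.8) -> bool:
    """Apply the thresholds stated in the text to a single site."""
    alleles = [site.ref] + site.alts
    # Keep SNPs only: every allele must be a single base (drops indels).
    if any(len(allele) != 1 for allele in alleles):
        return False
    if site.qual < min_qual:
        return False
    mean_depth = sum(site.depths) / len(site.depths)
    if not (min_mean_depth <= mean_depth <= max_mean_depth):
        return False
    # Assumption: the 80% criterion refers to the fraction of called genotypes.
    called = sum(g is not None for g in site.genotypes)
    return called / len(site.genotypes) >= min_call_rate


def resolve_site(site: Site, rng: random.Random) -> List[str]:
    """Return one base per individual, picking one allele at random.

    Homozygous genotypes always yield their single allele; heterozygous
    genotypes yield either allele with equal probability. Missing
    genotypes are written as 'N' (an assumption of this sketch).
    """
    alleles = [site.ref] + site.alts
    bases = []
    for genotype in site.genotypes:
        if genotype is None:
            bases.append("N")
        else:
            bases.append(alleles[rng.choice(genotype)])
    return bases


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the random resolution is repeatable
    example = Site(ref="A", alts=["G"], qual=50.0,
                   depths=[10, 12, 8, 9, 11, 10],
                   genotypes=[(0, 0), (0, 1), (1, 1), None, (0, 1), (0, 0)])
    if passes_filters(example):
        print("".join(resolve_site(example, rng)))
```

Concatenating the resolved bases site by site, one string per individual, would give one sequence per sample of the kind stored in a FASTA alignment. Seeding the random number generator makes a given resolution reproducible; whether the original files were produced with a fixed seed is not stated.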

For information about the phylogenies, please see the manuscript.


Swiss National Science Foundation