Data from: Pervasive phylogenomic incongruence underlies evolutionary relationships in eyebrights (Euphrasia, Orobanchaceae)
Data files
Feb 10, 2025 version files (9.51 MB):
- ConservedScaffolds.tree (2.97 KB)
- mash_NJ.treefile (961 B)
- nrDNA_alignment.fasta (328.94 KB)
- nrDNA.tree (2.78 KB)
- Plastid_alignment.fasta (9.16 MB)
- Plastid.tree (2.92 KB)
- plotTangle.R (5.48 KB)
- README.md (2.12 KB)
Abstract
Disentangling the phylogenetic relationships of taxonomically complex plant groups is often mired by challenges associated with recent speciation, hybridisation, complex mating systems and polyploidy. Here, we perform the first global phylogenomic analysis of eyebrights (Euphrasia), a group renowned for taxonomic complexity, with the aim of understanding the evolutionary processes underlying phylogenetic discordance. We generate whole genome sequencing data and integrate this with prior genomic data to perform a comprehensive analysis of nuclear genomic, nuclear ribosomal (nrDNA), and complete plastid genomes from 57 individuals representing 36 species sampled across the genus. The species tree analysis of 3454 conserved nuclear scaffolds (46 Mb) is structured by geography and ploidy, and partially by taxonomy, and indicates that postglacial colonisation of North Western Europe occurred in multiple waves from discrete source populations. However, most species are not monophyletic, and combine genomic variants from across clades. Comparative analyses confirm the close relationship between Northern Hemisphere allotetraploids and diploids, while Southern Hemisphere tetraploids include a distinct subgenome history indicative of independent polyploidy events. In contrast to the nuclear genome analyses, the plastid genome phylogeny reveals limited geographic structure, while the nrDNA phylogeny is informative of some geographic and taxonomic affinities, but more thorough phylogenetic inference is impeded by the retention of ancestral polymorphism in tetraploids. Overall, our results reveal extensive phylogenetic discordance at both deeper and shallower nodes, with broad-scale geographic structure of genomic variation but a lack of definitive taxonomic signal. This suggests that Euphrasia species either have polytopic origins, or are maintained by narrow genomic regions in the face of extensive homogenising gene flow.
Moreover, these results suggest genome skimming will not be an effective extended barcode to identify species in groups such as Euphrasia or many other postglacial species groups.
https://doi.org/10.5061/dryad.jh9w0vtd3
Description of the data and file structure
Our phylogenomic analyses included a total of 58 samples: 56 samples from 36 Euphrasia species and two outgroup species, Bartsia alpina and Neobartsia chilensis. This included a combination of newly sequenced samples and previously generated data. Short-read Illumina data for each sample were used for de novo plastid genome assembly using NOVOPlasty or GetOrganelle, nuclear ribosomal DNA array assembly using NOVOPlasty, and nuclear genome analysis based on mapping to the Euphrasia arctica genome.
Files and variables
File: ConservedScaffolds.tree
Description: Species tree inferred from conserved nuclear genome scaffolds, used to make Figure 3 in the paper.
File: mash_NJ.treefile
Description: Neighbor-joining trees generated from raw sequence reads using MASH, used to make Figure S1 in the paper.
File: nrDNA.tree
Description: Phylogenetic relationships inferred from nuclear ribosomal DNA sequences for global Euphrasia species, used in Figure 2 in the paper.
File: Plastid.tree
Description: Phylogenetic relationships inferred from complete plastid genome sequences for global Euphrasia species, used in Figure 1 in the paper.
File: plotTangle.R
Description: R script that plots the tanglegram comparing the species tree from the conserved nuclear scaffolds with the maximum likelihood phylogeny for the plastid genome, used in Figure 5 in the paper.
File: nrDNA_alignment.fasta
Description: Alignment of nuclear ribosomal DNA sequences for global Euphrasia species, used in Figure 2 in the paper.
File: Plastid_alignment.fasta
Description: Alignment of complete plastid genome sequences for global Euphrasia species, used in Figure 1 in the paper.
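As a quick sanity check before reusing the two alignment files, the sketch below reads a FASTA file and confirms that every sequence has the same length, as expected for an alignment. It assumes the files are in standard FASTA format (header lines starting with `>`); the function names `read_fasta` and `alignment_length` are hypothetical helpers written for this illustration and are not part of the deposited scripts.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record name: sequence} mapping.

    Assumption: standard FASTA, where each record starts with a '>' header
    and the record name is the first whitespace-delimited token after '>'.
    """
    parts: Dict[str, List[str]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("FASTA header without a record name")
            name = tokens[0]
            if name in parts:
                raise ValueError(f"duplicate record name: {name}")
            parts[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            parts[name].append(line)
    # Join wrapped sequence lines into one string per record.
    return {rec: "".join(chunks) for rec, chunks in parts.items()}


def alignment_length(records: Dict[str, str]) -> int:
    """Return the shared sequence length, or raise if lengths differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"expected one alignment length, found {sorted(lengths)}")
    return lengths.pop()
```

To apply it to a downloaded file, open the file and pass the handle directly, for example `with open("nrDNA_alignment.fasta") as handle: records = read_fasta(handle)`, then call `alignment_length(records)` and `len(records)` to get the number of aligned columns and sequences.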
Access information
Other publicly accessible locations of the data:
- n/a
Data was derived from the following sources:
- n/a
- Garrett, Phen; Becher, Hannes; Gussarova, Galina et al. (2022). Pervasive Phylogenomic Incongruence Underlies Evolutionary Relationships in Eyebrights (Euphrasia, Orobanchaceae). Frontiers in Plant Science. https://doi.org/10.3389/fpls.2022.869583
