Molecular phylogeny and morphological perianth evolution in Corymbia (Myrtaceae), and the implications for generic delimitation: data and tree files
Data files
Dec 18, 2023 version; files total 51.82 MB
- 101concat-29May.nex.run1.best_scheme.nex (5.46 KB)
- 101concat-29May.nex.run1.treefile.tre (28.42 KB)
- 101concat-29May.phy.gz.run2.treefile.tre (28.44 KB)
- concatenated_sequences_dataset_101_loci.phy (51.63 MB)
- README.md (1.01 KB)
- Traits_dataset_Mesquite.nex (127.29 KB)
Abstract
Premise: Eucalypts (Myrtaceae tribe Eucalypteae) are currently placed in seven genera. Traditionally, Eucalyptus was defined by its operculum but when phylogenies placed Angophora, with free sepals and petals, as sister to the operculate bloodwood eucalypts, the latter were segregated into a new genus, Corymbia. Yet generic delimitation in the tribe Eucalypteae remains uncertain. Here we address these problems using phylogenetic analysis with the largest molecular dataset to date.
Methods: We captured 101 low-copy nuclear exons from 392 samples representing 266 species. Our phylogenetic analysis used maximum likelihood (IQtree) and multi-species coalescent (Astral). At two nodes critical to generic delimitation, we tested alternative relationships among Arillastrum, Angophora, Eucalyptus and Corymbia using Shimodaira's AU test. Phylogenetic mapping was used to explore the evolution of perianth traits.
Results: Monophyly of Corymbia relative to Angophora was decisively rejected. All alternative relationships among the seven currently recognised Eucalypteae genera imply homoplasy in evolutionary origins of the operculum. Inferred evolutionary transitions in perianth traits are congruent with divergences between major clades except that expression of separate sepals and petals in Angophora, which is nested within the operculate genus Corymbia, appears to be a reversal to the plesiomorphic perianth structure.
Conclusions: Here we formally raise Corymbia subg. Blakella to genus rank and make the relevant new combinations. We also define and name three sections within Blakella (B. sect. Blakella, B. sect. Naviculares and B. sect. Maculatae), and two series within Blakella sect. Maculatae (B. ser. Maculatae and B. ser. Torellianae). Corymbia is reduced to the red bloodwoods.
README: Perianth Evolution and Implications for Generic Delimitation in the Eucalypts (Myrtaceae): data and tree files
Description of files for Dryad:
1. concatenated_sequences_dataset_101_loci.phy
Aligned sequences from targeted capture of 101 low-copy nuclear exons from 392 samples representing 329 species-level eucalypt taxa. This is a Phylip-formatted file for input to IQtree for phylogenetic analysis.
2. 101concat-29May.nex.run1.best_scheme.nex
This file defines the boundaries of the 101 loci in the .phy file, with a separate model for each locus, for input to IQtree.
3. 101concat-29May.nex.run1.treefile.tre
Tree from the IQtree run1 analysis using data files 1 and 2, with 101 partitions (= individual nuclear loci).
4. 101concat-29May.phy.gz.run2.treefile.tre
Tree from the IQtree run2 analysis using all loci combined into a single partition.
5. Traits_dataset_Mesquite.nex
Mesquite dataset with floral traits mapped on the IQtree from run2, with a single partition (tree 4 above).
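As a rough illustration of how the locus boundaries in the partition file relate to the concatenated alignment, the following Python sketch reads `charset` lines from NEXUS text and checks that they tile the alignment without gaps or overlaps. It assumes each charset is a single contiguous range (`name = start-end;`); the locus names and coordinates in the example are hypothetical and are not taken from the deposited file.

```python
import re

# Matches lines such as "charset toy_locus_1 = 1-300;".
# Assumption: one contiguous range per charset (no codon steps, no multiple ranges).
CHARSET_RE = re.compile(r"charset\s+(\S+)\s*=\s*(\d+)\s*-\s*(\d+)\s*;", re.IGNORECASE)


def read_charsets(nexus_text):
    """Return a list of (name, start, end) tuples; positions are 1-based, inclusive."""
    return [(name, int(start), int(end))
            for name, start, end in CHARSET_RE.findall(nexus_text)]


def check_contiguous(charsets, alignment_length):
    """True if the charsets cover columns 1..alignment_length with no gaps or overlaps."""
    expected_start = 1
    for _, start, end in sorted(charsets, key=lambda c: c[1]):
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == alignment_length + 1


# Hypothetical two-locus example, not the real partition scheme.
example = """#nexus
begin sets;
  charset toy_locus_1 = 1-300;
  charset toy_locus_2 = 301-750;
end;
"""
charsets = read_charsets(example)
```

Applied to the deposited files, one would expect 101 charsets whose ranges together span the full alignment length; that expectation follows from the file descriptions above rather than from running this sketch.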
Methods
Sampling—Hereafter, the term "eucalypts s.l." refers to the clade comprising Angophora, Corymbia and Eucalyptus. "Corymbia s.l." refers to the Corymbia clade, comprising C. subg. Corymbia, C. subg. Blakella and Angophora (Fig. 2). Sampling represented all genera within Myrtaceae tribe Eucalypteae sensu Wilson et al. (2005) except Allosyncarpia, Eucalyptopsis and Stockwellia. Within the eucalypts, both subgenera of Corymbia (Parra-O et al., 2009) and six of eight subgenera of Eucalyptus—except Acerosae (E. curtisii) and Alveolata (E. microcorys), both of which are monotypic—were sampled. There were 392 species-level taxa, including eleven outgroups, which were selected based on whole-of-family phylogenies by Wilson et al. (2005) and Thornhill et al. (2012, 2015) and included species of Osbornia and Melaleuca (tribe Melaleuceae), Backhousia (tribe Backhousieae), Tristaniopsis (tribe Kanieae), Syncarpia (tribe Syncarpieae) and Arillastrum (tribe Eucalypteae). The tree was rooted between Melaleuceae and the rest, based on earlier studies (Wilson et al., 2005; Thornhill et al., 2015).
Most samples were field-collected leaf tissue with vouchers lodged in the Australian National Herbarium (CANB), where the identifications were verified by Andrew Slee. These were supplemented by leaf samples taken directly from CANB herbarium sheets, with permission. Samples from Currency Creek Arboretum (Currency Creek, South Australia) and the Australian National Botanic Gardens (Canberra) were taken with permission from vouchered living trees (details in Thornhill et al., 2015). All taxa and accessions sampled are listed in Suppinfo S1. Nomenclature follows Slee et al. (2020), which largely follows Brooker (2000) for Eucalyptus and Hill & Johnson (1995) for Corymbia. In cases of conflict between authorities, we follow the Australian Plant Census (APC).
Target capture and sequencing—As data for this study were acquired prior to the angiosperm 353 capture set (Johnson et al., 2019) becoming available, we used a target-capture approach aimed at identifying and sequencing up to 200 orthologous low-copy loci from the nuclear genome with potential to resolve species-level relationships across the large family Myrtaceae, as per Choi et al. (2019).
In plates of 48 samples, the pooled DNA library for each specimen was hybridised to the target probes using the SeqCap EZ Developer Library (NimbleGen, Madison, Wisconsin, USA) following the manufacturer’s instructions with minor modifications detailed in Choi et al. (2019). Recovery and wash of hybridised samples were carried out using the SeqCap Hybridisation and Wash Kit (NimbleGen, Mannheim, Germany) following the manufacturer’s instructions. After indexing-PCR and purification, the captured libraries were sequenced on the Illumina MiSeq (one pool of 48 samples) and HiSeq 2000 (all other pools) platforms (100 bp paired-end read protocol) at the Bio-molecular Research Facilities at The Australian National University.
Data handling and mapping of reads—The quality of the raw reads was investigated using FastQC (Andrews, 2010) (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). BBduk within BBTools was used to remove Illumina adapters, low-quality reads and sequences using standard parameters (trimq = 30, minlen = 40, ktrim = r, hdist = 1, tpe tbo; http://jgi.gov/data-and-tools/bb-tools/). The cleaned reads were rechecked using FastQC. After read cleaning, axe demultiplexer was used to sort reads by barcode using standard parameters (https://manpages.debian.org/testing/axe-demultiplexer/axe-demux.1.en.html). The reads were mapped against the E. grandis targets using bwa-mem (Li & Durbin, 2009). SAM files were converted to BAM, sorted and indexed using samtools v1.3.1 (Li et al., 2009). Picard was used to remove duplicates (http://broadinstitute.github.io/picard/). Finally, Platypus was used to call variants with standard parameters (Rimmer et al., 2014).
Sequence alignment and editing—Sequences were imported into Geneious Prime ver. 2020.1.2 (Biomatters Ltd) for assembly, alignment and editing. Initially, each locus was aligned separately across all samples using MAFFT ver. 1.4.0 (Biomatters Ltd). After trimming, alignments were adjusted by eye, which included deleting sites with > 95% missing data. A Neighbor-Joining tree was generated for each locus and inspected for anomalies, such as likely chimeric sequences indicated by long, often misplaced branches. Every locus was assessed for paralogy (multiple gene copies), as indicated by concerted sharing of polymorphisms among distantly related taxa, and such loci were excluded. Randomly scattered polymorphic base calls were assumed to indicate allelic variation, and such loci were retained. Ninety-nine of the 200 targeted genes were discarded, leaving 101 putatively single-copy genes. These were concatenated using Geneious. All samples with > 60% of concatenated sequence missing were culled, leaving 392 of the original 521 eucalypt + outgroup sequences in the final alignment, which totalled 129,354 base pairs, comprising 27,100 parsimony-informative sites, 14,807 singleton sites and 87,447 constant sites. The final set of 101 loci is listed in Suppinfo S2, identified by their labels in the annotated Eucalyptus grandis genome (Myburg et al., 2014).
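The sample-culling rule and the three site categories reported above can be sketched in Python as follows. This is a minimal illustration, not the Geneious or IQtree implementation. It assumes that gaps, `?` and `N` all count as missing data, that any other symbol is treated as a distinct state, and that a site is parsimony-informative when at least two states each occur in at least two sequences. The toy alignment is hypothetical.

```python
from collections import Counter

MISSING = set("-?Nn")  # assumption: gaps, '?' and N all count as missing data


def missing_fraction(seq):
    """Fraction of positions in a sequence that are missing."""
    return sum(ch in MISSING for ch in seq) / len(seq)


def cull_samples(alignment, max_missing=0.60):
    """Keep only samples whose missing fraction does not exceed max_missing."""
    return {name: seq for name, seq in alignment.items()
            if missing_fraction(seq) <= max_missing}


def classify_site(column):
    """Label one alignment column as constant, singleton or informative."""
    states = Counter(ch.upper() for ch in column if ch not in MISSING)
    if len(states) <= 1:
        return "constant"
    # Informative: at least two states, each present in at least two sequences.
    shared = sum(1 for n in states.values() if n >= 2)
    return "informative" if shared >= 2 else "singleton"


def site_summary(alignment):
    """Count columns in each category across equal-length sequences."""
    return Counter(classify_site(col) for col in zip(*alignment.values()))


# Hypothetical toy alignment; s5 is 80% missing and is culled.
toy = {"s1": "ACGTA", "s2": "ACGTC", "s3": "ATGAC", "s4": "ATG-A", "s5": "A----"}
kept = cull_samples(toy)
summary = site_summary(kept)
```

The three categories partition the alignment, which is consistent with the figures reported above: 27,100 + 14,807 + 87,447 = 129,354 sites.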
Phylogenetic analysis—Phylogenies were first estimated from the concatenated sequences of all 101 nuclear loci, initially treated as a single partition, using maximum likelihood (ML) as implemented in RAxML ver. 8.2.12 (Stamatakis, 2014) on the CIPRES Science Gateway (Miller et al., 2010) with a GTR + G model. Additionally, ML analyses were run using IQtree ver. 1.6.10 (Nguyen et al., 2015), first with a single partition and then with the DNA divided into 101 partitions (loci: Suppinfo S2), each with its own model (Chernomor et al., 2016), estimated using ModelFinder (Kalyaanamoorthy et al., 2017). Node support was estimated using Ultrafast bootstrap (UFB) with 1000 replicates (Minh et al., 2013; Hoang et al., 2018), as well as site (sCF) and gene (gCF) concordance factors (Minh et al., 2020). Site concordance was measured using 100 randomly resampled quartets: the four taxa of a quartet spanning an internal branch can be resolved in three possible ways, so the null expectation is 33% of sites. Gene concordance measures the proportion (%) of genes supporting the branch. In a different approach, we estimated the species tree using the 101 individual gene trees from IQtree with the multi-species coalescent model implemented in Astral-II (Mirarab et al., 2015). In an Astral tree, internal branch lengths are in coalescent units and branch support is estimated by local posterior probability (LPP), which is the probability that this is the true branch given the set of gene trees (computed based on the quartet score and assuming incomplete lineage sorting).
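To make the 33% null expectation for site concordance concrete, the sketch below counts, for a single quartet of sequences, the decisive sites that support each of the three possible resolutions. It is a simplified illustration only: IQtree averages such values over many sampled quartets per branch, whereas this code handles one quartet, skips any site containing a non-ACGT symbol (an assumption), and uses hypothetical sequences.

```python
def quartet_site_counts(a, b, c, d):
    """Count decisive sites supporting each of the three quartet resolutions."""
    counts = {"ab|cd": 0, "ac|bd": 0, "ad|bc": 0}
    for w, x, y, z in zip(a, b, c, d):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if any(ch not in "ACGT" for ch in (w, x, y, z)):
            continue
        if w == x and y == z and w != y:
            counts["ab|cd"] += 1
        elif w == y and x == z and w != x:
            counts["ac|bd"] += 1
        elif w == z and x == y and w != x:
            counts["ad|bc"] += 1
    return counts


def site_concordance(counts, resolution="ab|cd"):
    """Percentage of decisive sites that support the chosen resolution."""
    total = sum(counts.values())
    return 100.0 * counts[resolution] / total if total else float("nan")


# Hypothetical quartet: three sites favour ab|cd and one favours ac|bd.
counts = quartet_site_counts("AAGTCA", "AAGTCC", "GCGACA", "GCAACC")
scf = site_concordance(counts)
```

If sites carried no signal for the branch, decisive sites would be spread evenly over the three resolutions, giving roughly 33% for each; values well above that indicate support in the sequence data.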
Approximately Unbiased tests of alternative topologies—Using each IQtree (single- and multi-partition) in turn, we tested the alternative relationships among Arillastrum and the eucalypt genera (Angophora, Corymbia and Eucalyptus) with the Approximately Unbiased (AU) test of Shimodaira (2002) using 1000 replicates. The first question was whether some of the eucalypt genera are more closely related to Arillastrum than to the rest of the eucalypts; in other words, could we rule out that Arillastrum is nested within the eucalypt clade? The second question asked whether Corymbia is monophyletic, particularly in relation to Angophora, given the contrasting findings in earlier studies, reviewed above.
Mapping of perianth traits—The IQtrees with branch lengths were imported to Mesquite ver. 3.61 (Maddison et al., 2019) for trait mapping and hypothesis testing. Relevant trait data from Euclid edition 4 (Slee et al., 2020) were also imported to Mesquite. We defined the following three characters for testing hypotheses about perianth evolution.
Character 1, compound petals (Drinnan & Ladiges, 1988)—1.1: petals simple (Figure 1a). 1.2: petals compound with keel and limb differentiation (Figure 1b). The developmental studies by Drinnan and Ladiges led to their recognition of a homologous keel-and-limb petal structure throughout Angophora and Corymbia. However, interpretation of this character is unclear in Eucalyptus. Drinnan & Ladiges (1989a, c) considered the four adaxial buttresses of the petals in E. subg. Eudesmia, on which the stamens develop, to be homologous with the keel component of the compound petals in Angophora and Corymbia. In other subgenera of Eucalyptus (Drinnan & Ladiges, 1989b, c; Drinnan & Ladiges, 1991b), these authors described the circumfloral buttress that bears the stamens ("stemonophore") as arising between the corolla and gynoecium. Later (Ladiges et al., 1995), they coded the stamen-bearing buttress as a separate (non-homologous) character in eudesmids (E. subg. Eudesmia) in contrast to the stemonophore of E. subg. Eucalyptus and E. subg. Symphyomyrtus. Hill & Johnson (1995) demurred, suggesting a single origin of the stemonophore, followed by losses (reversals) in some eucalypt lineages. Given these differing interpretations, we scored this character as uncertain (?) in all species of Eucalyptus except eudesmids.
Character 2, corolla type—2.1: Free petals (Figure 1a,b). 2.2: Single operculum lacking discernable petaline parts. 2.3: Single operculum of connate, discernable petals with sutures and/or free tips (Figure 1d,f). 2.4: Forming a double operculum with the calyx (Figure 1e,g). 2.5: Petals discernable inside and shed with the sepaline operculum.
In some species of Corymbia sensu stricto (= Corymbia B, sensu Steane et al. 2002), separate petals are reported to be discernable to varying degrees in early development of the bud, e.g. C. arnhemensis, C. calophylla, C. ficifolia, C. gummifera, C. intermedia, C. jacobsiana, C. ptychocarpa and C. trachyphloia (Willis, 1951; Drinnan & Ladiges, 1988; Hill & Johnson, 1995) but are retained inside and adherent to the underside of the sepaline operculum until anthesis (Hill & Johnson, 1995). We examined buds at anthesis of four of these species and it was clear that the separate petals had fused fully with the sepaline operculum. Nevertheless, these species are scored here as having a separate character state (2.5) for this trait.
All eudesmids (E. subg. Eudesmia) have a single (petaline) operculum (character state 2.2), but in some species, the sepals are attached to the apex of the operculum (character state 3.2, Figure 1c). However, the sepals remain separate from each other and no eudesmid forms a sepaline operculum as such (Gibbs et al., 2009).
Character 3, calyx type—3.1: Free sepals, attached to hypanthium rim (Figure 1a,b,d,f). 3.2: Free sepals, attached to apex of petaline operculum (Figure 1c). 3.3: Sepaline operculum, shed before petaline operculum or as a separate structure from petaline (inner) operculum at anthesis (Figure 1e). 3.4: Sepals fused with petals into a double operculum, which is evident from the absence of a scar prior to anthesis, when both opercula are shed as a unit (Figure 1g). 3.5: Calyx nil at any stage of floral development (E. subg. Eucalyptus only).
The calyx was assumed to be absent in E. subg. Eucalyptus based on a comparative developmental study of species from this subgenus and E. cloeziana (E. subg. Idiogenes), which was considered closely related (Drinnan & Ladiges, 1989b). Hill & Johnson (1995) disagreed suggesting, on the basis of their unpublished observations of operculum development in these taxa, that the operculum in E. subg. Eucalyptus is a composite of calyx and corolla, similar to that of E. sect. Miniatae. In these eudesmids, separate sepals are visible at the apex of the mature petaline operculum (Figure 1c). However, the operculum of E. subg. Eucalyptus shows no evidence of sepals at any stage of development (Drinnan & Ladiges, 1989b). Therefore, we scored the corolla type of E. subg. Eucalyptus as a single petaline operculum (character state 2.2) and the calyx type as nil (character state 3.5).
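The three characters and their states can be summarised as a small lookup structure, shown here as a Python sketch. The state descriptions are abbreviated from the definitions above, and the dictionary layout and function name are illustrative choices, not the format of the Mesquite file. The example scoring for E. subg. Eucalyptus follows the text: character 1 uncertain, corolla state 2.2 and calyx state 3.5.

```python
UNCERTAIN = "?"

# State labels abbreviated from the character definitions in the text.
CHARACTERS = {
    1: {"name": "compound petals",
        "states": {1: "petals simple",
                   2: "petals compound with keel and limb differentiation"}},
    2: {"name": "corolla type",
        "states": {1: "free petals",
                   2: "single operculum lacking discernable petaline parts",
                   3: "single operculum of connate, discernable petals",
                   4: "forming a double operculum with the calyx",
                   5: "petals discernable inside and shed with the sepaline operculum"}},
    3: {"name": "calyx type",
        "states": {1: "free sepals, attached to hypanthium rim",
                   2: "free sepals, attached to apex of petaline operculum",
                   3: "sepaline operculum, shed separately from petaline operculum",
                   4: "sepals fused with petals into a double operculum",
                   5: "calyx nil at any stage of floral development"}},
}


def validate_scoring(scoring):
    """Return True if every character has a defined state or '?'; else raise ValueError."""
    for char_id, spec in CHARACTERS.items():
        state = scoring.get(char_id)
        if state != UNCERTAIN and state not in spec["states"]:
            raise ValueError(f"character {char_id} ({spec['name']}): invalid state {state!r}")
    return True


# Scoring of E. subg. Eucalyptus as described in the text.
subg_eucalyptus = {1: UNCERTAIN, 2: 2, 3: 5}
is_valid = validate_scoring(subg_eucalyptus)
```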
The following manipulations were made to the trees to explore the effects of alternative topologies on the inferred evolution of perianth traits: (1) alternative placement of Arillastrum as sister to Corymbia + Angophora, which was not rejected by the AU test; and (2) alternative placement of Angophora as sister to Corymbia subg. Blakella, which was not decisively ruled out by the AU test. Models used for trait reconstruction were parsimony (unordered coding) and Mk1 (Markov k-state one-parameter model: Lewis, 2001; Maddison & Maddison, 2019).
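For readers unfamiliar with unordered parsimony, the following Python sketch applies the Fitch algorithm to count the minimum number of state changes a character requires on a given rooted, bifurcating tree. It is not Mesquite's implementation. The tree shape, tip names and state labels are hypothetical and chosen only to show how a state can be reconstructed as regained within a clade that otherwise shows another state.

```python
def fitch(tree, states):
    """Fitch downpass on a nested-tuple binary tree.

    Returns (candidate state set at this node, minimum number of changes below it).
    Tips are strings that key into `states`; internal nodes are 2-tuples.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed here.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical five-tip tree and character scoring.
toy_tree = ("tipA", (("tipB", "tipC"), ("tipD", "tipE")))
toy_states = {"tipA": "free", "tipB": "operculum", "tipC": "operculum",
              "tipD": "free", "tipE": "operculum"}
root_set, n_changes = fitch(toy_tree, toy_states)
```

Rerunning the same count on a rearranged tree (for example, moving one tip to a different sister group) shows how an alternative topology changes the number of inferred transitions.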