Phylogenomics of the tetraploid Hawaiian lobeliads: Implications for their origin, dispersal history, and adaptive radiation
Data files
Apr 18, 2025 version files 78.51 MB
- BEAST-lobeliads-nuclear.zip (1.23 MB)
- BEAST-lobeliads-plastome.zip (203.55 KB)
- BioGeoBEARS-lobeliads.zip (17.92 KB)
- HybPiper-lobelaids.zip (11.68 MB)
- HyDe-lobeliads.zip (41.82 MB)
- PhyloNetworks-lobeliads.zip (683.95 KB)
- README.md (10.81 KB)
- sequences-lobeliads.zip (22.83 MB)
- TraitData.zip (18.95 KB)
Abstract
Hawaiian lobeliads exhibit extensive adaptive radiations and are considered the largest plant clade (143 species) endemic to any oceanic archipelago. Rapid insular radiations are prone to reticulate evolution, yet detecting hybridization has been limited by species sampling or inadequate nuclear data in previous Hawaiian studies. We analyzed 633 nuclear loci (including tetraploid duplications) and whole plastomes for 89% of extant species to derive phylogenies for the Hawaiian lobeliads. Nuclear data provide strong support for nine major clades in both likelihood and ASTRAL analyses. All genera/sections are monophyletic except Clermontia and Cyanea. Nuclear and plastome phylogenies conflict on short, deep branches; the nuclear tree resolves a fleshy-fruited clade of Hawaiian Clermontia/Cyanea-Brighamia/Delissea, sister to Polynesian Sclerotheca, with both sister to a capsular-fruited Hawaiian clade. Incomplete lineage sorting in a rapid radiation starting 8.5-11.3 million years ago is sufficient to explain uncertainty and cytonuclear discordance along the backbone; sequence data strongly support reticulation within Clermontia and especially Cyanea. Nuclear data identify 42 inter-island dispersal events, of which 89% accord with the progression rule, involving movement to the next younger, formerly unoccupied islands in the hotspot chain, consistent with ecological theory; plastid data overestimate such dispersals by 17%. Clermontia and Cyanea have undergone parallel adaptive radiations in elevational distribution and flower length on all major islands, but with some inter-island divergence. Within-island adaptive radiation and ecological speciation in these traits within Clermontia/Cyanea, combined with widespread single-island endemism, frequent inter-island dispersal, and occasional hybridization, drove Hawaiian lobeliad diversification, together with early intergeneric divergence in habitat.
This Dryad repository accompanies the paper:
Phylogenomics of the tetraploid Hawaiian lobeliads: Implications for their origin, dispersal history, and adaptive radiation
Published in PNAS by:
Jeffrey P. Rose, Bing Li, Margaret J. Sporck-Koehler,
Elizabeth A. Stacy, Kenneth R. Wood, Emily Moriarty Lemmon,
Alan R. Lemmon, Cécile Ané, Kenneth J. Sytsma, Thomas J. Givnish
Questions about this repository should be directed toward author J.P. Rose (jrose3@wisc.edu).
This repository contains the necessary inputs and outputs of all molecular and trait analyses used in this study.
The structure of archives is broken down to contain the files needed to reproduce individual analyses.
This package contains the following archives:
- HybPiper-lobeliads
- sequences-lobeliads
- BEAST-lobeliads-nuclear
- BEAST-lobeliads-plastome
- BioGeoBEARS-lobeliads
- PhyloNetworks-lobeliads
- HyDe-lobeliads
- TraitData
Detailed descriptions of each archive are listed in sequential order below.
#################################################
1. "HybPiper-lobeliads.zip"
#################################################
This archive contains the consensus sequences used to reassemble loci. It also contains the homolog alignments output from HybPiper and the resulting gene trees, used as input for OrthoSNAP.
The file structure within this archive is as follows:
HybPiper-lobeliads
|__homolog.alignments
|__homolog.genetrees
|__lobeliad_targets_majorityrule.fasta #majority rule consensus sequences for all homologs used as input for HybPiper
- "homolog.alignments": alignments of all homologs.
- "homolog.genetrees": gene trees for all homologs.
- "lobeliad_targets_majorityrule.fasta": majority rule consensus sequences for all homologs used as input for HybPiper.
#################################################
"sequences-lobeliads.zip"
#################################################
The molecular data used in this study. The archive is broken down into:
sequences-lobeliads
|__nuclear
|  |__nuclear.alignments
|  |__nuclear.genetrees.final
|  |__nuclear.IQTREE.tre
|  |__nuclear.ASTRAL.tre
|  |__nuclear.ASTRAL.subs.tre
|__plastid
   |__plastomes.EXONS
   |__full.plastomes.fasta
   |__plastome.EXONS.constrained.tre
   |__plastome.EXONS.unconstrained.tre
   |__full.plastomes.tre
   |__simulated.plastid.trees
   |__plastomes.bipartition.nuclear.tre
   |__plastome.scored.ASTRAL.tre
sequences-lobeliads/nuclear: nuclear alignments and trees.
- "nuclear.alignments.zip": archive containing final nuclear alignments. Nomenclature is the original assembly locus followed by the orthogroup number (0-n). For example, L99.0 is locus 99 orthogroup 1, L99.1 is locus 99 orthogroup 2, etc.
- "nuclear.genetrees.final.zip": archive containing final nuclear gene trees. Nomenclature is the original assembly locus followed by the orthogroup number (0-n). For example, L99.0 is locus 99 orthogroup 1, L99.1 is locus 99 orthogroup 2, etc.
- "nuclear.IQTREE.tre": ML nuclear tree from IQ-TREE.
- "nuclear.ASTRAL.tre": ASTRAL nuclear tree with branches in coalescent units and support as local posterior probabilities.
- "nuclear.ASTRAL.subs.tre": ASTRAL nuclear tree with branches in substitutions per site.
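The locus nomenclature above can be decoded programmatically. The following is a minimal sketch, assuming each name (without any file extension) has exactly the form shown in the example, i.e. "L", the locus number, a period, and a zero-based orthogroup index; the function name `parse_alignment_name` is hypothetical and not part of the archive.

```python
import re

# Assumed pattern: "L" + locus number + "." + zero-based orthogroup index
_NAME = re.compile(r"^L(\d+)\.(\d+)$")


def parse_alignment_name(name):
    """Split a name such as 'L99.0' into (locus, orthogroup).

    The orthogroup is returned counted from 1, matching the README
    convention that L99.0 is orthogroup 1 of locus 99.
    """
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected alignment name: {name!r}")
    locus = int(match.group(1))
    index = int(match.group(2))
    return locus, index + 1


if __name__ == "__main__":
    print(parse_alignment_name("L99.0"))
    print(parse_alignment_name("L99.1"))
```

Names that do not follow the assumed pattern raise a `ValueError` rather than being silently skipped, so any deviation in the actual file names would be noticed immediately.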
sequences-lobeliads/plastid: plastid alignments and trees.
- "plastomes.EXONS.zip": archive containing final alignments of plastome exons.
- "full.plastomes.fasta": final alignment of complete plastomes.
- "plastome.EXONS.constrained.tre" & "plastome.EXONS.unconstrained.tre": ML plastome exon tree from IQ-TREE constrained/unconstrained.
- "full.plastomes.tre": ML full plastome tree from IQ-TREE.
- "simulated.plastid.trees": 5,000 trees simulated under ILS.
- "plastome.bipartition.nuclear.tre": Bipartition support of simulated trees on the IQ-TREE ML tree.
- "plastome.scored.ASTRAL.tre": ASTRAL tree scored using simulated trees as input gene trees.
#######################################################
2. "BEAST-lobeliads-nuclear.zip"
#######################################################
This contains the input/output for the BEAST nuclear analysis. The archive is broken down into:
BEAST-lobeliads-nuclear
|__BEAST.nuclear.loci.concatenated.nex
|__Lobeliads.nuclear.BDCONST.xml
|__Lobeliads.BD.on.ASTRAL.target.tre
|__Lobeliads.BD.on.backbone.tre
- "BEAST.nuclear.loci.concatenated.nex": input concatenated nucleotide alignment.
- "Lobeliads.nuclear.BDCONST.xml": BEAUti file.
- "Lobeliads.BD.on.ASTRAL.target.tre"/"Lobeliads.BD.on.backbone.tre": Resulting MCC trees for the posterior sample summarized on the ASTRAL topology and posterior sample summarized with a constraint on relationships among major clades only, respectively.
#########################################################
3. "BEAST-lobeliads-plastome.zip"
#########################################################
This contains the input/output for the BEAST plastome analysis. The archive is broken down into:
BEAST-lobeliads-plastome
|__Lobelaids.plastome.BD.xml
|__Lobeliads.plastome.BD.tre
- "Lobelaids.plastome.BD.xml": BEAUti file.
- "Lobeliads.plastome.BD.tre": Resulting MCC tree.
#################################################
4. "BioGeoBEARS-lobeliads.zip"
#################################################
Input files for the BioGeoBEARS nuclear/plastome analyses. Directory structure is as follows:
BioGeoBEARS-lobeliads
|__areas_allowed.txt
|__BioGeoBEARS.nuclear.tre
|__BioGeoBEARS.nuclear.txt
|__BioGeoBEARS.plastid.tre
|__BioGeoBEARS.plastid.txt
|__manual_dispersal_multipliers.txt
|__timeperiods.txt
- "areas_allowed.txt": areas allowed to be occupied at each time period for the stratified analysis.
- "BioGeoBEARS.nuclear.tre": nuclear tree.
- "BioGeoBEARS.nuclear.txt": input of ranges for the nuclear tree.
- "BioGeoBEARS.plastid.tre": plastid tree.
- "BioGeoBEARS.plastid.txt": input of ranges for the plastid tree.
- "manual_dispersal_multipliers.txt": dispersal probabilities among areas for each time period of the stratified analysis.
- "timeperiods.txt": file containing the breakpoints of each time period for the stratified analysis.
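For readers who want to inspect the range files outside of R, the sketch below reads a geography file into a dictionary of occupied areas per tip. It assumes the two range files follow the usual PHYLIP-like BioGeoBEARS layout (a header with the number of taxa, the number of areas, and the area labels in parentheses, followed by one line per taxon with a string of 0/1 characters); this has not been checked against the files in the archive. The function name `parse_geography`, the area labels, and the taxon names in the demo are hypothetical.

```python
def parse_geography(text):
    """Parse a PHYLIP-like geography file into (areas, ranges).

    Assumed layout (not verified against the archive):
        <n_taxa> <n_areas> (<label> <label> ...)
        <taxon> <string of 0/1, one character per area>
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    counts, _, labels = lines[0].partition("(")
    n_taxa, n_areas = (int(value) for value in counts.split()[:2])
    areas = labels.rstrip(")").split()
    if len(areas) != n_areas:
        raise ValueError("header area count does not match area labels")

    ranges = {}
    for line in lines[1:]:
        taxon, code = line.split()
        if len(code) != n_areas:
            raise ValueError(f"range for {taxon} has the wrong length")
        # Keep the labels of areas coded as present ("1")
        ranges[taxon] = [area for area, bit in zip(areas, code) if bit == "1"]

    if len(ranges) != n_taxa:
        raise ValueError("header taxon count does not match the data rows")
    return areas, ranges


if __name__ == "__main__":
    # Hypothetical three-taxon, four-area example
    demo = "3\t4 (K O M H)\nsp_a\t1000\nsp_b\t0110\nsp_c\t0001\n"
    areas, ranges = parse_geography(demo)
    print(areas)
    print(ranges)
```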
#################################################
5. "PhyloNetworks-lobeliads.zip"
#################################################
This contains the input/output for PhyloNetworks analysis. The archive is broken down into:
PhyloNetworks-lobeliads
|__trees.set*.tre
|__table.CF.set*.txt
|__net*.set*.log
|__net*.set*.out
|__net*.set*.networks
Nomenclature:
For .tre and .txt files, wildcard numbers are 1-10 and correspond to each of 10 sets of randomly pruned tips.
For .log, .out, and .networks files, the first wildcard number is the maximum number of hybridization events (hmax) and ranges from 0-2, while the second is the set number described above.
*.tre files are input gene trees for each of 10 sets of randomly pruned tips.
*.txt files are input concordance factors for each of 10 sets of randomly pruned tips.
*.log files are the log files from PhyloNetworks for each value of hmax.
*.out files contain the best network among all runs for each value of hmax as well as the best network per run.
*.networks files contain the networks obtained by switching the hybrid node inside each cycle, among all runs for each value of hmax.
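The naming scheme described above can be unpacked with a short helper when sorting the output files by hmax and set. This is a minimal sketch, assuming the wildcards are plain unpadded integers (for example, a hypothetical file "net1.set3.out") and that .networks files also carry a set number, as the nomenclature note states; the function name `parse_run_file` is not part of the archive.

```python
import re

# Assumed form: net<hmax>.set<set number>.<log|out|networks>
_RUN_FILE = re.compile(r"^net(\d+)\.set(\d+)\.(log|out|networks)$")


def parse_run_file(filename):
    """Return hmax, set number, and file kind for a PhyloNetworks output file."""
    match = _RUN_FILE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    hmax = int(match.group(1))
    set_number = int(match.group(2))
    # Ranges stated in the README: hmax 0-2, sets 1-10
    if not 0 <= hmax <= 2:
        raise ValueError(f"hmax out of range in {filename!r}")
    if not 1 <= set_number <= 10:
        raise ValueError(f"set number out of range in {filename!r}")
    return {"hmax": hmax, "set": set_number, "kind": match.group(3)}


if __name__ == "__main__":
    print(parse_run_file("net1.set3.out"))
```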
#################################################
6. "HyDe-lobeliads.zip"
#################################################
Contains the input and output (filtered triplets) from the HyDe analysis.
HyDe-lobeliads
|__Lobeliads-res-out-filtered.txt
|__Lobeliads-map.txt
|__Lobeliads-data.txt
- "Lobeliads-res-out-filtered.txt": output file listing all pairs of significant triplets.
- "Lobeliads-map.txt": mapping file between samples in the alignment and species assignment.
- "Lobeliads-data.txt": input alignment.
#################################################
7. "TraitData.zip"
#################################################
Contains files necessary to reproduce the trait/adaptive radiation analyses.
IMPORTANT. Reproducing the trait analysis also requires downloading the Zenodo software package accompanying this Dryad repository.
TraitData
|__GBIF.species.csv
|__Lobeliad.tree.names.xlsx
|__Lobeliads.master.xlsx
- "GBIF.species.csv": spreadsheet key to standardize names in the occurrence data. Column headers are: "species": the verbatim GBIF name, "status": if the species is to be kept ("keep") or removed ("REMOVE"), and "correct_name": the standardized species name.
- "Lobeliad.tree.names.csv": spreadsheet to standardize names in the phylogeny. Column headers are: "taxon": the taxon name of the individual submitted for sequencing, "species": the species name of the individual submitted for sequencing, "volcano": the volcano of origin of the individual (if known), "island": the island of origin of the individual under a 7 high island model, "island4": the island of origin of the individual under a 4 high island model, "keep.comp": if the tip is retained in comparative analyses (x) or dropped (NA), "ID": the unique sequencing ID for the individual, and "full.tip.name": the concatenation of "taxon" and "ID" fields.
- "Lobeliads.master.csv": spreadsheet containing trait and geographic data for each taxon x island combination. Column headers are: "genus": the clade to which the taxon belongs, "species": the taxon name, "Hawai'i": presence (1) or absence (0) on Hawai'i, "Maui": presence (1) or absence (0) on Maui, "O'ahu": presence (1) or absence (0) on O'ahu, "Kaua'i": presence (1) or absence (0) on Kaua'i, "Moloka'i": presence (1) or absence (0) on Moloka'i, "Lana'i": presence (1) or absence (0) on Lana'i, "Ni'ihau": presence (1) or absence (0) on Ni'ihau, "Maui_Nui": presence (1) or absence (0) on the Maui Nui complex, "Kaua'i2": presence (1) or absence (0) on Kaua'i + Ni'ihau, "7_island_sum": total number of islands occupied under a 7 high island model, "4_island_sum": total number of islands occupied under a 4 high island model, "Island": verbatim island complex under a 4 high island model, "elev.min": minimum elevation (in m) occupied by the taxon, "elev.max": maximum elevation (in m) occupied by the taxon, "perianth_length_mm": maximum perianth length (in mm). "Source": reference for geographical and trait data (see manuscript). "Notes": any taxonomic notes. NA values are missing data for continuous variables: these are either unknown for the taxon or could not be scored (Sclerotheca only).
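As an illustration of how the name key in "GBIF.species.csv" might be applied, the sketch below maps verbatim GBIF names to standardized names and drops records flagged for removal. It relies only on the three column headers described above; the function names `load_name_key` and `standardize`, and the taxon names in the demo, are hypothetical. It is not the code used in the study, which is provided in the accompanying Zenodo software package.

```python
import csv
import io


def load_name_key(handle):
    """Map verbatim GBIF names to standardized names.

    Rows whose "status" is not "keep" (e.g. "REMOVE") are left out of the key.
    """
    key = {}
    for row in csv.DictReader(handle):
        if row["status"].strip().lower() == "keep":
            key[row["species"]] = row["correct_name"]
    return key


def standardize(names, key):
    """Return standardized names, dropping any name absent from the key."""
    return [key[name] for name in names if name in key]


if __name__ == "__main__":
    # Hypothetical rows using the documented column headers
    demo = io.StringIO(
        "species,status,correct_name\n"
        "Genus alpha Auth.,keep,Genus alpha\n"
        "Genus sp.,REMOVE,\n"
    )
    key = load_name_key(demo)
    print(standardize(["Genus alpha Auth.", "Genus sp."], key))
```

Note that this sketch silently drops names missing from the key; for a real analysis it may be safer to report them, since an unmatched name could indicate a record that was never reviewed.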
