Allopolyploid origin and diversification of the Hawaiian endemic mints
Cite this dataset
Lindqvist, Charlotte (2024). Allopolyploid origin and diversification of the Hawaiian endemic mints [Dataset]. Dryad. https://doi.org/10.5061/dryad.ghx3ffbwc
Abstract
Island systems provide important contexts for studying processes underlying lineage migration, species diversification, and organismal extinction. The Hawaiian endemic mints (Lamiaceae family) are the second largest plant radiation on the isolated Hawaiian Islands. We generated a chromosome-scale reference genome for one Hawaiian species, Stenogyne calaminthoides, and resequenced 45 relatives, representing 34 species, to uncover the continental origins of this group and their subsequent diversification. We further resequenced 109 individuals of two Stenogyne species, and their purported hybrids, found high on the Mauna Kea volcano on the island of Hawai’i. The three distinct Hawaiian genera, Haplostachys, Phyllostegia, and Stenogyne, are nested inside a fourth genus, Stachys. We uncovered four independent polyploidy events within Stachys, including one allopolyploidy event underlying the Hawaiian mints and their direct western North American ancestors. While the Hawaiian taxa may have principally diversified by parapatry and drift in small and fragmented populations, localized admixture may have played an important role early in lineage diversification. Our genomic analyses provide a view into how organisms may have radiated on isolated island chains, settings that provided one of the principal natural laboratories for Darwin’s thinking about the evolutionary process.
README: Allopolyploid origin and diversification of the Hawaiian endemic mints
https://doi.org/10.5061/dryad.ghx3ffbwc
Description of the data and file structure
The data files comprise four main items:
1. Plastid DNA data.
Alignments and maximum likelihood phylogenetic trees of reference-mapped and de novo assembled plastid DNA data. The four files below are compressed into a single zip archive.
Description of files:
Plastids_DenovoAssembled_alignment.fa
Alignment of de novo assembled plastid genomes from 45 taxa of Hawaiian mints and Stachys relatives, in FASTA format.
Plastids_DenovoAssembled_RAxML_bipartitions.tre
RAxML bipartitions file from the phylogenetic analysis of de novo assembled plastid genomes from 45 taxa of Hawaiian mints and Stachys relatives, in Newick tree format.
Plastids_RefMapped_alignment.fa
Alignment of plastid genomes from reference-mapped assemblies (mapped to the Stenogyne calaminthoides reference genome) of 45 taxa of Hawaiian mints and Stachys relatives, in FASTA format.
Plastids_RefMapped_RAxML_bipartitions.tre
RAxML bipartitions file from the phylogenetic analysis of reference-mapped assemblies from 45 taxa of Hawaiian mints and Stachys relatives, in Newick tree format.
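As a quick sanity check after unzipping, the two alignment files can be read with a few lines of standard-library Python. The sketch below is only illustrative: the function names `read_fasta` and `alignment_length` are hypothetical, and it assumes plain FASTA in which every sequence has the same aligned length.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {sequence name: sequence}.

    The name is the first whitespace-delimited token after '>'.
    """
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def alignment_length(records: Dict[str, str]) -> int:
    """Return the common aligned length, or raise if sequences differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"expected one alignment length, found {sorted(lengths)}")
    return lengths.pop()
```

For example, opening `Plastids_DenovoAssembled_alignment.fa` and passing the file handle to `read_fasta` should give one entry per sample (45 are expected from the description above), and `alignment_length` then reports the number of alignment columns.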
2. Reference mapped SNP variant data sets.
Ten datasets used in the analyses in this paper, each comprising a different set of samples and a different filtering of single nucleotide polymorphism (SNP) variants, based on resequenced individuals mapped to the Stenogyne calaminthoides reference genome (see text and Supplementary Data 6 in the publication for details). Each file is in either vcftools or bcftools format and is gzip-compressed.
Description of files:
- DatasetDS4: contains all samples of Hawaiian mint taxa and Stachys relatives (N=45)
- DatasetDS4a: contains Western NA Stachys + Hawaiian mints (N=36)
- DatasetDS4b: contains Hawaiian mints (N=30)
- DatasetDS4c: contains all samples except Stachys byzantina (N=43)
- DatasetDS7: contains all samples of Hawaiian mint taxa and Stachys relatives (N=45)
- DatasetDS10: contains all samples of Hawaiian mint taxa and Stachys relatives (N=45)
- DatasetDS10a: contains Western NA Stachys + Hawaiian mints (N=36)
- DatasetDSHM1: contains Stenogyne rugosa/microphylla samples (N=113)
- DatasetDSHM2: contains all Stenogyne samples (N=127)
- DatasetDSHM4: contains all Stenogyne samples (N=127)
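The sample counts listed above can be checked against a downloaded file by reading the sample columns from the header. This is a minimal sketch assuming the file is gzip-compressed text VCF, where sample names follow the nine fixed columns on the `#CHROM` line; it would not work on a binary BCF file. The function names are hypothetical.

```python
import gzip
from typing import Iterable, List


def vcf_sample_names(lines: Iterable[str]) -> List[str]:
    """Return sample names from the '#CHROM' header line of a text VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            # Columns 0-8 are CHROM..FORMAT; samples start at column 9.
            return fields[9:]
        if not line.startswith("#"):
            break  # reached data records without seeing the header line
    raise ValueError("no #CHROM header line found")


def samples_in_gz_vcf(path: str) -> List[str]:
    """Open a gzip-compressed text VCF (assumed format) and list its samples."""
    with gzip.open(path, "rt") as handle:
        return vcf_sample_names(handle)
```

Calling `len(samples_in_gz_vcf(path))` on, say, the DatasetDS4b file should then match the N given in the list (30 for DS4b). The exact file names and extensions in the archive are not stated here, so the path is left to the reader.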
3. MaSuRCA assemblies
De novo MaSuRCA assemblies of the Stenogyne calaminthoides reference and all 45 Illumina-resequenced Hawaiian mint taxa and Stachys relatives.
Description of files:
Each assembly file (46 in total) is in FASTA format, zipped, and named according to the specific sample (project ID number and taxon name). See text and Supplementary Data 8 in the publication for details.
4. BUSCO data
Alignments of BUSCO genes (see text and Supplementary Data 8 in the publication for details).
Description of files:
BUSCO_alignments.tar.gz
The file contains gzip-compressed BUSCO gene alignments in the following scheme: each entry consists of a "#BUSCO id" line followed by an alignment, and entries for different BUSCOs are separated by empty lines.
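The scheme described above can be split into per-gene blocks with a short parser. This sketch assumes that each BUSCO entry starts with a line beginning with `#` followed by the BUSCO id, and it makes no assumption about the format of the alignment lines themselves, which are returned unchanged. The function name `split_busco_blocks` and the ids in the example are hypothetical.

```python
from typing import Dict, Iterable, List


def split_busco_blocks(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group alignment lines under the '#<BUSCO id>' line that precedes them."""
    blocks: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("#"):
            # Assumed header form: '#' followed by the BUSCO id.
            current = line[1:].strip()
            blocks[current] = []
        elif line.strip() and current is not None:
            blocks[current].append(line)
        # Empty separator lines are skipped.
    return blocks


example = [
    "#busco_A\n", ">sample1\n", "ACGT\n", "\n",
    "#busco_B\n", ">sample1\n", "AC-T\n",
]
blocks = split_busco_blocks(example)
```

After extracting `BUSCO_alignments.tar.gz`, passing an open text file handle to `split_busco_blocks` gives a dictionary keyed by BUSCO id, which can then be handed to whichever alignment reader matches the inner format.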
Sharing/Access information
Links to other publicly accessible locations of the data:
- The genome data generated in this study have been deposited in the NCBI database under BioProject accession code PRJNA924716 and BioSample ID SAMN32782865. For the reference genome of Stenogyne calaminthoides, Hi-C reads are under accession SRR23341345, RNA-seq data under accession SRR23341344, Illumina shotgun data under accession SRR23341343, and Oxford Nanopore reads under accession SRR23341342. The Whole Genome Shotgun project has been deposited at GenBank under the accession JBBCBC000000000. Raw reads for resequenced samples can be found under accession numbers SAMN32767766-SAMN32767919.
- The Stenogyne calaminthoides genome assembly and annotation used for analyses in this study are available on CoGe.
Methods
We generated a genome assembly of Stenogyne calaminthoides using Oxford Nanopore Technologies long-read sequencing and Hi-C scaffolding, with gene models predicted from a transcriptome assembly and homology-based gene predictors. Illumina resequencing was performed on 45 taxa of Hawaiian mints and Stachys relatives (~30 Gb per sample), in addition to 110 samples of Stenogyne rugosa and S. microphylla and their purported hybrids (~15 Gb per sample).
The data and our analyses were used to (i) generate a high-quality, chromosome-level genome of Stenogyne calaminthoides to investigate the polyploid history of the Hawaiian mint lineage, (ii) use this reference genome and resequencing of multiple species to establish the origin and phylogeny of the Hawaiian mints and their mainland relatives, as well as their potential admixture history, and (iii) investigate recent introgression among Stenogyne species that co-occur at high elevation on Mauna Kea.
Funding
Nanyang Technological University
National Science Foundation