Supplemental data for: Lineage diversification and rampant hybridization among subspecies explain taxonomic confusion in the endemic Hawaiian fern Polypodium pellucidum
Data files
Jan 31, 2024 version files 1.72 GB
- AppendixS1_Admixture.zip
- AppendixS3_HybridDatasets.zip
- AppendixS4_PrecombinedDatasetD.zip
- AppendixS5_RawMorph.csv
- AppendixS6_EcoGeo.csv
- AppendixS7_MSCDatasetB.zip
- AppendixS8_SporeLength.zip
- Appendixs9_SporeSEMs.zip
- README.md
May 17, 2024 version files 1.72 GB
- AppendixS1_Admixture.zip
- AppendixS2_HybridResults.csv
- AppendixS3_PrecombinedDatasetD.zip
- AppendixS4_RawMorph.csv
- AppendixS5_EcoGeo.csv
- AppendixS6_ReadStatistics.csv
- AppendixS7_MSCDatasetB.zip
- AppendixS8_SporeLength.zip
- Appendixs9_SporeSEMs.zip
- HybridDatasetsSupplemental.zip
- README.md
Abstract
Premise: Polypodium pellucidum, a fern endemic to the Hawaiian Islands, encompasses a broad spectrum of morphological and ecological variation, suggesting a complex history involving divergence and hybridization. In contrast to angiosperm systems, spore dispersal in ferns presents a unique opportunity to study how highly dispersible organisms diversify in the dynamic landscape of the archipelago.
Key Results: We infer P. pellucidum is monophyletic, dispersing to the Hawaiian archipelago 11.53 to 7.77 Mya, with diversification into extant clades 5.66 to 4.73 Mya. We identify four non-hybrid clades with unique morphologies, ecological niches, and distributions. Additionally, we elucidate several intraspecific hybrid combinations and evidence for undiscovered or extinct 'ghost' lineages contributing to extant hybrid populations.
Conclusions: We provide a roadmap for revising the taxonomy of P. pellucidum to account for cryptic lineages and intraspecific hybrids. Geologic succession of the Hawaiian Islands through cycles of volcanism, vegetative succession, and erosion has determined the available habitats and distribution of ecologically specific divergent clades within P. pellucidum, with intraspecific hybrids arising as a result of ecological and/or geological transitions, often persisting after the local extinction of their progenitors. This research contributes to our understanding of the evolution of Hawaii's diverse fern flora and allows for better conservation efforts that are often complicated by the presence of cryptic taxa and hybridization.
README: Lineage diversification and rampant hybridization among subspecies explain taxonomic confusion in the endemic Hawaiian fern Polypodium pellucidum, Supplemental Data
https://doi.org/10.5061/dryad.gf1vhhmvv
General Description of Supplemental Appendices:
Hybrid Datasets Supplemental: Phylogenetic trees and alignments for the single hybrid datasets.
Appendix S1: ADMIXTURE replicates, Clumpak summary results (i.e., q matrix used at K=3), cross-validation scores, and Evanno’s K results tables.
Appendix S2: Table summarizing results for single hybrid phylogenies generated with SORTER stage 3.
Appendix S3: Phased hybrid phylogeny prior to combining hybrid sequences that resolved as monophyletic.
Appendix S4: Table of raw and transformed morphological measurements.
Appendix S5: Table of elevation, growth habit, moisture zone, and habitat type for each sample.
Appendix S6: Table summarizing read statistics and total number of consensus-alleles retrieved by stage 2 of the SORTER pipeline based on GoFlag 408 target-capture probe reference sequences. Read mapping statistics are provided for phylogenetic datasets and ADMIXTURE, respectively.
Appendix S7: Phylogeny resulting from MSC analysis of dataset B (see Table 1).
Appendix S8: Post-hoc Tukey test results table for ANOVA on spore length.
Appendix S9: Spore SEM measurements and images.
Appendix S10: Morphological key to the major non-hybrid lineages in Polypodium pellucidum.
Description of data and file structure
Hybrid Datasets Supplemental: ZIP file (file name: HybridDatasetsSupplemental.zip) containing individual hybrid phylogenies and alignments. Analyses for each sample are stored in an individual folder named by sample ID. Where sequential iterations were required, the 'col' subfolders within a sample folder hold the successive iterations of analyses run after combining pseudo-haploblock sequences that resolved as monophyletic. Analyses for samples that did not require sequential iterations are stored in the sample folder itself. These sample folders and the 'col' folders contain the alignment and partition files used to infer gene trees with IQ-TREE ('.phy' and '.partitions' files, respectively), along with the IQ-TREE gene-tree inference output ('loci.treefile'; see IQ-TREE documentation for details on all output files). The resulting ASTRAL multi-species coalescent trees are stored in files with the '.tre' extension.
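The folder layout described above can be explored with a short script. The following is a minimal sketch, not part of the SORTER pipeline: it assumes the hybrid datasets archive has already been extracted and that the sample folders sit directly under the directory passed in (if the archive unpacks into an extra top-level folder, point the function at that folder instead). The function name `trees_by_sample` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def trees_by_sample(root):
    """Map each sample folder name to the sorted relative paths of its '.tre' files.

    Assumes sample folders are the first directory level under `root`;
    trees inside 'col' subfolders are listed under their parent sample.
    """
    root = Path(root)
    found = defaultdict(list)
    for tree in root.rglob("*.tre"):
        rel = tree.relative_to(root)
        if len(rel.parts) < 2:
            continue  # a tree sitting directly in the root belongs to no sample
        found[rel.parts[0]].append(rel.as_posix())
    return {sample: sorted(paths) for sample, paths in found.items()}
```

A sample with a single entry did not need sequential iterations; a sample with several entries has one tree per 'col' iteration.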
Appendix S1: ZIP file (file name: AppendixS1_Admixture.zip) containing the Q-matrix output by CLUMPAK and visualized in Fig. 1 (file name: K3_Qmatrix_MajorClumpakCluster.txt). ADMIXTURE and CLUMPAK output is stored in the 'ADMIXTURE_CLUMPAK' folder. This folder contains the ADMIXTURE replicates (results.zip, see ADMIXTURE documentation for details) used as input for the CLUMPAK analysis. Subfolders beginning with 'K=' contain CLUMPAK summary runs for each K, with the 'CLUMPP.files' folder containing raw Q-matrices output by CLUMPAK for each K run. The 'MCL.files' folder contains CLUMPAK clustering summaries for each K. The 'MajorCluster' and 'MinorCluster' subfolders hold raw Q-matrix files associated with the largest cluster and any minor clusters (if present) identified by CLUMPAK, with 'distructOutput' files showing a visual representation of the respective Q-matrix (see CLUMPAK and DISTRUCT documentation for details on files output by the software). Cross-validation scores and Evanno's K output are stored in the 'Statistics' folder.
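As an illustration of how a Q-matrix file might be read, here is a minimal sketch under stated assumptions. It assumes one sample per line with whitespace-separated ancestry proportions, and that any bookkeeping columns (as in CLUMPP-style output) come before a colon; check the actual file and the CLUMPAK documentation before relying on this. The function names and the 0.5 threshold are hypothetical and are not taken from the study.

```python
def read_q_matrix(path):
    """Read ancestry proportions, one list of floats per non-empty line."""
    rows = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            # Assumption: CLUMPP-style rows put bookkeeping columns before a
            # colon, so only the text after the last colon is parsed.
            values = line.split(":")[-1].split()
            rows.append([float(v) for v in values])
    return rows


def major_cluster(row, threshold=0.5):
    """Return the 0-based index of the largest ancestry component,
    or None if that component is below `threshold` (a hypothetical cutoff)."""
    best = max(range(len(row)), key=row.__getitem__)
    return best if row[best] >= threshold else None
```

Samples for which `major_cluster` returns None have no single dominant ancestry component under the chosen cutoff, which is one simple way to flag candidate admixed individuals for closer inspection.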
Appendix S2: CSV Table (file name: AppendixS2_HybridResults.csv) summarizing individual hybrid phylogeny results, compared to the final Dataset D phylogeny results with all hybrid samples included.
Appendix S3: ZIP file (file name: AppendixS3_PrecombinedDatasetD.zip) containing the preliminary Dataset D multi-species coalescent tree (file name: st3cl80_AllHybrids_50rep_320loci_precollapsed_ASTRAL.tre) prior to combining pseudo-haploblock sequence sets that resolved as monophyletic. Additional files represent gene trees and other output files generated by IQ-TREE (file name: loci.treefile for gene trees, see IQ-TREE documentation for details on all output files).
Appendix S4: CSV table (file name: AppendixS4_RawMorph.csv) with raw morphological measurements of the vouchers studied. Empty cells represent data that were generated from averages, using the 'POP' column to determine group averages, based on ADMIXTURE and phylogenetic results.
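If the empty cells are read as missing measurements that were replaced by group averages during analysis, that step can be sketched as below. This is an illustrative reconstruction, not the authors' script: only the 'POP' column name comes from the description above, and the function names and any measurement column passed in are hypothetical.

```python
import csv
from collections import defaultdict


def load_rows(path):
    """Read a CSV file into a list of dictionaries keyed by column header."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def fill_with_group_means(rows, column, group="POP"):
    """Return copies of `rows` with blank `column` cells replaced by the
    mean of the non-blank values in the same `group`."""
    observed = defaultdict(list)
    for row in rows:
        if row[column].strip():
            observed[row[group]].append(float(row[column]))
    means = {g: sum(v) / len(v) for g, v in observed.items()}
    filled = []
    for row in rows:
        new = dict(row)
        if row[column].strip():
            new[column] = float(row[column])
        else:
            new[column] = means.get(row[group])  # None if the whole group is blank
        filled.append(new)
    return filled
```

Typical use would be `fill_with_group_means(load_rows("AppendixS4_RawMorph.csv"), column)` with `column` set to one of the measurement headers in the file.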
Appendix S5: CSV Table (file name: AppendixS5_EcoGeo.csv) with elevation, growth habit, moisture zone, and habitat type for each sample. Empty cells represent location or collection information that was not available for a subset of herbarium vouchers.
Appendix S6: CSV Table (file name: AppendixS6_ReadStatistics.csv) containing read statistics and total number of consensus-alleles retrieved by stage 2 of the SORTER pipeline based on GoFlag 408 target-capture probe reference sequences. Read mapping statistics are provided for phylogenetic datasets and ADMIXTURE, respectively.
Appendix S7: ZIP file (file name: AppendixS7_MSCDatasetB.zip) containing the alignment and partition files for the Dataset B MSC phylogeny ('.phy' and '.partitions' files, respectively), with IQ-TREE gene-tree inference output. The resulting ASTRAL multi-species coalescent tree in NEWICK format is stored in the file st2cl80_progsetnocont_50rep_413loci_ASTRAL.tre, and a PDF visualization of the same tree with local posterior support at nodes is provided in st2cl80_progsetnocont_50rep_413loci_ASTRAL.pdf.
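For a quick check of which samples appear in a NEWICK tree file without installing a phylogenetics library, a minimal sketch follows. It assumes unquoted tip labels that contain no whitespace, parentheses, commas, colons, or semicolons; for anything beyond listing tips, a dedicated tree library is the safer choice. The function name `tip_labels` is hypothetical.

```python
import re

# A tip label directly follows '(' or ','; internal node labels and support
# values follow ')' and are therefore skipped. Quoted labels are not handled.
_TIP = re.compile(r"[(,]\s*([^(),:;\s]+)")


def tip_labels(newick):
    """Return the tip labels of a NEWICK string in order of appearance."""
    return _TIP.findall(newick)
```

For example, `tip_labels(open("st2cl80_progsetnocont_50rep_413loci_ASTRAL.tre").read())` would list the tips of the Dataset B tree, provided its labels meet the assumptions above.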
Appendix S8: ZIP file (file name: AppendixS8_SporeLength.zip) containing the spore length measurements table (file name: AppendixS8_SporeMeasurements.csv) and the resulting ANOVA and post-hoc Tukey test results (file name: AppendixS8_AnovaTukey.txt).
Appendix S9: ZIP file (file name: AppendixS9_SporeSEMs.zip) containing folders with the spore SEM images used in the study. Folder names designate sample IDs and their associated major clades. The ZIP file also contains an Excel file (file name: AppendixS9_SporeSEM_measurements.xlsx) with summary and individual spore SEM measurements. See the additional sheets within the Excel file for sample-specific measurements. Images ending in '_REFERENCE.tif' can be used to identify specific spores and the measurements taken from the raw image.
Appendix S10: A Word file (file name: AppendixS10_MorphologicalKey.docx) containing a morphological key to the major non-hybrid lineages we identified in Polypodium pellucidum.
Code/Software
All alignments included in this study were generated using the three-stage SORTER bioinformatic pipeline for target-capture data. See the GitHub page for documentation of the scripts: https://github.com/JonasMendez/SORTER/
Methods
We employed a 408-locus target-capture dataset to investigate the evolution of genetic, morphological, and ecological variation in P. pellucidum. With a broad sampling of in-field collections and herbarium vouchers across five Hawaiian Islands, we integrated genetic, morphological, and ecological analyses to unravel the evolutionary history of P. pellucidum. We identify and characterize hybrid as well as non-hybrid lineages, allowing us to explore the relative influence of geography and ecology on their distributions.