Angiosperms353 data for the epidendroid orchid base
Data files
Sep 25, 2025 version files (87.57 MB total)
- Code_Epibase_2025_Dryad.txt (41.37 KB)
- filtered_concat.fasta (25.07 MB)
- README.md (4.45 KB)
- trimal_min1004471.fna.fasta_prank.best.zip (2.61 MB)
- unfiltered_concatenated.fasta (59.84 MB)
Abstract
Parasites present fascinating examples of evolutionary modification that simultaneously pose challenges for systematics. This is exemplified by fully mycoheterotrophic orchids, which are completely dependent on fungi, constituting nearly half of all mycoheterotrophic plant species. A large concentration of mycoheterotrophic lineages is found among the eight tribes comprising the base of the megadiverse orchid subfamily Epidendroideae, here referred to as the early diverging Epidendroideae (EDE). To date, relationships among the EDE have been problematic. Previous analyses have suffered from sparse taxon sampling, weak support from limited loci, or long-branch attraction in plastid-based analyses. We conducted the most comprehensive nuclear phylogenomic analysis of the EDE to date, using Angiosperms353 loci, coalescent analyses, and deep exploration of support, conflict, saturation, and introgression. Our study is the first to include phylogenomic sampling from all eight EDE tribes, with 22 of 26 EDE genera represented. We took a novel approach, selecting best-fit mixture model configurations at the individual locus level, which provided significantly better fit overall and required fewer parameters than all other models, with implications for clades characterized by lineage-specific rate heterogeneity. We recovered strong support for monophyly of all EDE tribes except for Neottieae, which were inferred to be paraphyletic. Information content was generally rich for deep relationships among the EDE tribes, but overall support was weak. We found evidence of saturation and putative introgression, with two inferred reticulation events. We conclude that short internal branches associated with rapid diversification, incomplete lineage sorting, and putative introgression resulted in low concordant signal among EDE tribes, underscoring the continued difficulty in resolving their relationships. 
Nonetheless, we provide the first strongly supported phylogenetic hypothesis for the five genera of Gastrodieae, representing the largest known diversification of fully mycoheterotrophic plants. We discuss our findings considering recent phylogenomic studies, taxonomy, morphology, and biogeographic implications.
Dataset DOI: 10.5061/dryad.q2bvq83xt
Supplementary data files for "Nuclear Phylogenomic Insights Into Relationships, Support, and Conflict Among the Early Diverging Lineages of the Megadiverse Epidendroid Orchids."
Craig F. Barrett1*, John V. Freudenstein2, Samuel V. Skibicki1, Brandon T. Sinn3,4, Hana L. Thixton-Nolan1, William J. Baker5, Vincent S. F. T. Merckx6,7, Oscar Alejandro Pérez-Escobar5, Matthew C. Pace8, Paul M. Peterson9, Kenji Suetsugu10,11, Tomohisa Yukawa12
1Department of Biology, West Virginia University, 5218 Life Sciences Building, 53 Campus Drive, Morgantown, West Virginia, USA 26506
2Department of Evolution, Ecology, and Organismal Biology, Ohio State University, 1315 Kinnear Road, Columbus, Ohio, USA 43212
3Department of Biology and Earth Science, Otterbein University, 1 South Grove Street, Westerville, Ohio, USA
4Faculty of Biology, University of Latvia, Jelgavas iela 1, Riga, LV-1004, Latvia
5Royal Botanic Gardens, Kew TW9 3AB, Surrey, UK
6Naturalis Biodiversity Center, Darwinweg 2, Leiden, 2333 CR, The Netherlands
7Institute of Biology Leiden, Leiden University, Sylviusweg 72, Leiden, 2333 BE, The Netherlands
8New York Botanical Garden, 2900 Southern Boulevard, Bronx, New York, New York, USA 10458
9Department of Botany, National Museum of Natural History, Smithsonian Institution, Washington, DC, 20560-0166, USA
10Department of Biology, Graduate School of Science, Kobe University, Kobe, 1-1 Rokkodai, Nada-ku, Kobe, Hyogo 657-8501, Japan
11The Institute for Advanced Research, Kobe University, Kobe, 1-1 Rokkodai, Nada-ku, Kobe, Hyogo 657-8501, Japan
12Tsukuba Botanical Garden, National Museum of Nature and Science, Amakubo, Tsukuba 305-0005, Japan
Description of the data and file structure
File 1: Concatenated data (filtered_concat.fasta, filtered to remove sites with > 50% missing data) from 318 genes generated by the Angiosperms353 sequence capture kit.
File 2: Concatenated data (unfiltered_concatenated.fasta) from 318 genes generated by the Angiosperms353 sequence capture kit.
File 3: Zip archive of all 318 individual gene alignments generated by the Angiosperms353 sequence capture kit (trimal_min1004471.fna.fasta_prank.best.zip).
Files and variables
File: trimal_min1004471.fna.fasta_prank.best.zip
Description: Zip archive of all 318 individual gene alignments generated by the Angiosperms353 sequence capture kit (trimal_min1004471.fna.fasta_prank.best.zip).
This is a .zip archive of individual Angiosperms353 gene alignments. Each file has a unique name.
Angiosperms353 loci were processed with CAPTUS v1.1.0 (https://edgardomortiz.github.io/captus.docs/), a flexible environment for handling sequence capture and genome skimming data that cleans, assembles, extracts, aligns, and filters the input data, including putative paralogs (Supplementary methods S3). We further filtered the resulting alignments to remove taxa represented in fewer than 50 total alignments, and removed loci that contained fewer than 40 taxa. We also removed taxa that had less than 10% sequence representation in total base pairs, based on the concatenated dataset. We then realigned all loci individually with the phylogenetically aware program PRANK v170427 (Löytynoja, 2014).
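The taxon and locus occupancy filters described above can be sketched as follows. This is a minimal illustration in Python, not the code deposited in Code_Epibase_2025_Dryad.txt: the function names `parse_fasta` and `filter_by_occupancy`, the in-memory dictionary layout, and the order in which the two filters are applied (taxa first, then loci) are assumptions made for the example. Only the thresholds (50 alignments per taxon, 40 taxa per locus) come from the text.

```python
from collections import Counter


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {sequence name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the taxon name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def filter_by_occupancy(alignments, min_alignments_per_taxon=50,
                        min_taxa_per_locus=40):
    """Filter {locus: {taxon: sequence}} by taxon and locus occupancy.

    Assumed order: drop rare taxa first, then drop loci left with too few taxa.
    """
    # Count in how many alignments each taxon occurs.
    counts = Counter(taxon for recs in alignments.values() for taxon in recs)
    kept_taxa = {t for t, n in counts.items() if n >= min_alignments_per_taxon}

    filtered = {}
    for locus, recs in alignments.items():
        kept = {t: s for t, s in recs.items() if t in kept_taxa}
        if len(kept) >= min_taxa_per_locus:
            filtered[locus] = kept
    return filtered


# Toy example with small thresholds so the effect is visible.
toy = {
    "locusA": parse_fasta(">sp1\nACGT\n>sp2\nACGT\n>sp3\nAC-T\n"),
    "locusB": parse_fasta(">sp1\nGGCA\n>sp2\nGGCA\n"),
    "locusC": parse_fasta(">sp1\nTTAA\n"),
}
result = filter_by_occupancy(toy, min_alignments_per_taxon=2,
                             min_taxa_per_locus=2)
print(sorted(result))            # loci that survive both filters
print(sorted(result["locusA"]))  # taxa retained in locusA
```

In the toy data, `sp3` occurs in only one alignment and is dropped, after which `locusC` holds a single taxon and is dropped as well. With the default arguments the function applies the thresholds stated in the text.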
File: filtered_concat.fasta
Description: Concatenated data (filtered_concat.fasta, filtered to remove sites with > 50% missing data) from 318 genes generated by the Angiosperms353 sequence capture kit.
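The site filter applied to this file (removal of sites with > 50% missing data) can be illustrated with a short Python sketch. It is not taken from the deposited code: the function name `drop_sparse_columns` is hypothetical, and treating gaps (`-`), `?`, and `N` as missing data is an assumption, since the description does not define which characters count as missing.

```python
def drop_sparse_columns(records, max_missing=0.5, missing_chars="-?Nn"):
    """Remove alignment columns whose missing-data fraction exceeds max_missing.

    records: {taxon: aligned sequence}, all sequences of equal length.
    missing_chars: characters counted as missing (an assumption of this sketch).
    """
    names = list(records)
    seqs = [records[n] for n in names]
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")

    # Keep a column only if its fraction of missing characters is <= max_missing.
    keep = [
        i for i in range(length)
        if sum(s[i] in missing_chars for s in seqs) / len(seqs) <= max_missing
    ]
    return {n: "".join(s[i] for i in keep) for n, s in zip(names, seqs)}


# Toy alignment: the last column is missing in 3 of 4 taxa and is removed;
# columns with exactly 50% missing data are retained.
aln = {
    "sp1": "AC-TN",
    "sp2": "A--T-",
    "sp3": "ACGT-",
    "sp4": "A-GTA",
}
filtered_aln = drop_sparse_columns(aln)
for taxon, seq in filtered_aln.items():
    print(taxon, seq)
```

The comparison is deliberately `<=` so that only columns with strictly more than 50% missing data are dropped, matching the "> 50%" wording of the description.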
File: unfiltered_concatenated.fasta
Description: Concatenated data (unfiltered_concatenated.fasta) from 318 genes generated by the Angiosperms353 sequence capture kit.
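Because taxa with less than 10% sequence representation in the concatenated dataset were removed, it can be useful to check representation per taxon in a concatenated alignment. The Python sketch below shows one way to do this. It assumes that "sequence representation" means the fraction of alignment positions that are not gaps, `?`, or `N`. That definition, along with the function names `sequence_representation` and `drop_underrepresented`, is an assumption of this example and is not taken from the deposited code.

```python
def sequence_representation(records, missing_chars="-?Nn"):
    """Return {taxon: fraction of positions holding a non-missing character}."""
    fractions = {}
    for name, seq in records.items():
        if not seq:
            fractions[name] = 0.0
            continue
        present = sum(c not in missing_chars for c in seq)
        fractions[name] = present / len(seq)
    return fractions


def drop_underrepresented(records, min_fraction=0.10, missing_chars="-?Nn"):
    """Keep only taxa whose representation is at least min_fraction."""
    fractions = sequence_representation(records, missing_chars)
    return {n: s for n, s in records.items() if fractions[n] >= min_fraction}


# Toy concatenated alignment of 20 positions per taxon.
concat = {
    "sp1": "ACGTACGTACGTACGTACGT",   # fully represented
    "sp2": "ACGT" + "-" * 16,        # 4 of 20 positions present
    "sp3": "A" + "N" * 19,           # 1 of 20 positions present
}
fractions = sequence_representation(concat)
kept = drop_underrepresented(concat)
print(fractions)
print(sorted(kept))
```

In the toy data, `sp3` falls below the 10% threshold and is excluded, while `sp1` and `sp2` are retained.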
File: Code_Epibase_2025_Dryad.txt
Description: All code used in the manuscript (text file format). The code processes .fna files to count taxon occurrences, keeps only taxa present in ≥50 alignments, and removes specific unwanted taxa (listed in taxa_to_remove) from the output FASTA files.
Access information
Other publicly accessible locations of the data:
- Raw data were submitted to the US National Center for Biotechnology Information Sequence Read Archive under BioProject PRJNA1262965, with BioSample accessions SAMN48490488-SAMN48490521 (https://www.ncbi.nlm.nih.gov/).
