Data from: Plastid phylogenomic analysis of Podostemaceae with an emphasis on Neotropical Podostemoideae

Cite this dataset

Ruhfel, Brad et al. (2024). Data from: Plastid phylogenomic analysis of Podostemaceae with an emphasis on Neotropical Podostemoideae [Dataset]. Dryad.


Podostemaceae are a clade of aquatic flowering plants that form important components of tropical river ecosystems. Species in the family exhibit highly derived growth forms and high vegetative phenotypic plasticity, both of which contribute to taxonomic confusion. The backbone phylogeny of the family remains poorly resolved, many species remain to be included in a molecular phylogenetic analysis, and the monophyly of many taxa remains to be tested. To address these issues, we assembled sequence data for 73 protein-coding plastid genes from 132 samples representing 68 species (~23% of described species) that span the breadth of most major taxonomic, morphological, and biogeographic groups of Podostemaceae. With these data, we conducted the first plastid phylogenomic analysis of the family with broad taxon sampling. These analyses resolved most nodes with high support, including relationships not recovered in previous analyses. No evidence of widespread, well-supported conflict among individual plastid genes and the concatenated phylogeny was observed. We present new evidence that four genera (Apinagia, Marathrum, Oserya, and Podostemum), as well as four species, are not monophyletic. In particular, we show that Podostemum flagelliforme should not be included in Podostemum and is better recognized as Devillea flagelliformis, and that Marathrum capillaceum is embedded within Lophogyne s.l. and should be recognized as Lophogyne capillacea. We also place a previously unsampled and undescribed species that likely represents a new genus. In contrast to previous studies, the neotropical genera Diamantina, Ceratolacis, Cipoia, and Podostemum are resolved as successive sister groups to a clade of all paleotropical Podostemoideae taxa sampled, suggesting a single dispersal event from the neotropics to the paleotropics in the history of the subfamily. These results provide a strong basis for improving the classification of Podostemaceae and a framework for future phylogenomic studies of the clade employing data from the nuclear genome.

README: Data from: Plastid Phylogenomic Analysis of Podostemaceae with an Emphasis on Neotropical Podostemoideae

This Dryad submission contains five files: (1) a concatenated nucleotide alignment, (2) a tree file with bootstrap values, (3) HybPiper target sequences, (4) a by-gene partition file for the concatenated alignment, and (5) a model and partition file for the concatenated alignment. These are the files used or output by HybPiper (target file) and by the IQ-TREE analyses (concatenated alignment, partition and model files, tree file). Each file is explained in further detail below.

GenBank numbers for sequences used in these analyses are listed in Appendix 1 of the main publication and are available on GenBank.

Description of the data and file structure

Filename: HybPiper_Targets_nt.fasta

Description: Target file used in the HybPiper analyses to generate assemblies from trimmed reads. Sequence names include the taxon name and the gene region. References containing GenBank numbers for the target sequences are listed in the Methods section of the publication.
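
To check which genes and taxa a target file covers, the headers can be grouped by gene. The sketch below assumes the usual HybPiper header convention `>Taxon-gene`, with the gene name after the last hyphen; the taxon and gene names in the example are placeholders, not taken from HybPiper_Targets_nt.fasta, so verify the header layout in the actual file first.

```python
from collections import defaultdict

def targets_by_gene(fasta_text):
    """Map gene name -> list of taxon names, read from FASTA headers."""
    genes = defaultdict(list)
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # sequence lines are ignored in this sketch
        parts = line[1:].split()
        if not parts:
            continue  # skip an empty header
        # Assumed convention: text after the last '-' is the gene name.
        taxon, _, gene = parts[0].rpartition("-")
        genes[gene].append(taxon)
    return dict(genes)

# Placeholder records for illustration only.
example = """>Taxon_one-rbcL
ATGTCACCACAA
>Taxon_two-rbcL
ATGTCACCACAG
>Taxon_one-matK
ATGGAGGAATTT
"""
print(targets_by_gene(example))
```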

Filename: allGenes.phy

Description: Concatenated nucleotide alignment of the aligned and trimmed chloroplast gene regions from the HybPiper analyses. Codons with missing data in more than 40% of the samples were trimmed using pxclsq in Phyx, as stated in the Methods. This alignment was used together with the allGenesConcat.best_scheme.nex file to infer the phylogeny in allGenes.Treefile_wBS.tre. Taxon names are followed by voucher collection numbers, as noted in the main publication, and match the names used in the figures.
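
The 40% missing-data threshold can be illustrated with a small codon-wise filter. This is a sketch of the idea only, not a reimplementation of pxclsq: it assumes an in-frame alignment held in a dictionary, and it treats a sample as missing for a codon if any of the three sites is a gap, `?`, or `N`. The function name and the toy data are hypothetical.

```python
MISSING = set("-?Nn")

def trim_codons(alignment, max_missing=0.4):
    """Drop codons in which more than `max_missing` of the samples are missing.

    `alignment` maps sample name -> aligned sequence; all sequences are
    assumed to have equal length, in frame, with length a multiple of 3.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    keep = []
    for start in range(0, length, 3):
        # Count samples with at least one missing site in this codon.
        n_missing = sum(
            any(ch in MISSING for ch in alignment[name][start:start + 3])
            for name in names
        )
        if n_missing / len(names) <= max_missing:
            keep.append(start)
    return {
        name: "".join(alignment[name][s:s + 3] for s in keep)
        for name in names
    }

# Toy alignment: the second codon is missing in 3 of 5 samples (60%).
toy = {
    "A": "ATG---AAA",
    "B": "ATG---AAA",
    "C": "ATG---AAA",
    "D": "ATGCCC---",
    "E": "ATGCCCAAA",
}
trimmed = trim_codons(toy)
```

In the toy data the second codon is dropped, while the third codon is kept because only one of five samples lacks it.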

Filename: Partitions.byGene.txt

Description: Partition file by gene for the concatenated alignment (allGenes.phy). This file can be used to separate the concatenated alignment into individual alignments of each gene. It lists each gene name and its position in the concatenated alignment.
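
Splitting the concatenated alignment by gene amounts to slicing each sequence at the listed coordinates. The sketch below assumes each partition is written as `name = start-end` with 1-based, inclusive positions (as in RAxML-style lines such as `DNA, geneA = 1-6`); the exact layout of Partitions.byGene.txt should be checked before use, and the gene and sample names here are hypothetical.

```python
import re

# Matches "name = start-end", with or without a leading "DNA," prefix.
PARTITION_RE = re.compile(r"([\w.]+)\s*=\s*(\d+)\s*-\s*(\d+)")

def read_partitions(text):
    """Return [(gene, start, end)] with 1-based inclusive coordinates."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in PARTITION_RE.finditer(text)
    ]

def split_alignment(alignment, partitions):
    """Slice each sample's concatenated sequence into per-gene alignments."""
    return {
        gene: {name: seq[start - 1:end] for name, seq in alignment.items()}
        for gene, start, end in partitions
    }

# Toy input for illustration only.
partition_text = "DNA, geneA = 1-6\nDNA, geneB = 7-9\n"
concat = {"s1": "ATGAAATTT", "s2": "ATGCCCTTA"}
per_gene = split_alignment(concat, read_partitions(partition_text))
```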

Filename: allGenesConcat.best_scheme.nex

Description: Best-fit models and partition scheme for the concatenated alignment (allGenes.phy), as output by IQ-TREE. For each partition, the file lists the genes included, their positions in the concatenated alignment, and the selected model for that partition (e.g., GTR+F+I+G4).
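
A scheme file of this kind can be read into a table of partitions, ranges, and models. The sketch below assumes the Nexus `sets` layout that IQ-TREE typically writes, with `charset` lines followed by one `charpartition` statement of `model: name` pairs; the partition names, ranges, and models in the example are invented for illustration and are not the contents of allGenesConcat.best_scheme.nex.

```python
import re

def read_best_scheme(nexus_text):
    """Return {partition: (model, [(start, end), ...])} from a Nexus sets block."""
    charsets = {}
    for name, ranges in re.findall(
        r"charset\s+(\S+)\s*=\s*([^;]+);", nexus_text, flags=re.I
    ):
        # A merged partition may list several coordinate ranges.
        charsets[name] = [
            (int(a), int(b)) for a, b in re.findall(r"(\d+)\s*-\s*(\d+)", ranges)
        ]
    models = {}
    block = re.search(r"charpartition\s+\S+\s*=\s*([^;]+);", nexus_text, flags=re.I)
    if block:
        for entry in block.group(1).split(","):
            if ":" not in entry:
                continue  # tolerate stray commas
            model, _, name = entry.partition(":")
            models[name.strip()] = model.strip()
    return {name: (models.get(name), rng) for name, rng in charsets.items()}

# Invented example in the assumed layout.
scheme_text = """#nexus
begin sets;
  charset p1 = 1-300;
  charset p2 = 301-600  901-1200;
  charpartition mymodels =
    GTR+F+I+G4: p1,
    TVM+F+G4: p2;
end;
"""
scheme = read_best_scheme(scheme_text)
```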

Filename: allGenes.Treefile_wBS.tre

Description: Tree file with bootstrap values estimated with IQ-TREE using allGenes.phy and allGenesConcat.best_scheme.nex for the tree shown in the main text (Figs. 1 and 2a,b).
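
For a quick summary of node support without a phylogenetics library, the bootstrap values can be pulled out of the Newick string. This sketch assumes support is stored as a single number directly after each closing parenthesis (for example `)95:0.05`); combined labels such as `98.5/100` would need a different pattern. The toy tree and function names are hypothetical.

```python
import re

# A number immediately after ')' and before ':', ',', ')' or ';'.
SUPPORT_RE = re.compile(r"\)(\d+(?:\.\d+)?)(?=[:,);])")

def support_values(newick):
    """Return internal-node support values in the order they appear."""
    return [float(v) for v in SUPPORT_RE.findall(newick)]

def fraction_supported(newick, threshold=95.0):
    """Fraction of labelled internal nodes with support >= threshold."""
    values = support_values(newick)
    if not values:
        return 0.0
    return sum(v >= threshold for v in values) / len(values)

# Toy tree for illustration only.
toy_tree = "((A:0.1,B:0.2)95:0.05,(C:0.1,D:0.1)100:0.02,(E:0.3,F:0.1)60:0.01);"
values = support_values(toy_tree)
share = fraction_supported(toy_tree, threshold=95.0)
```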


Funding

National Science Foundation, Award: DEB-1754329, DEB

National Science Foundation, Award: DEB-0444589, DEB

National Science Foundation, Award: DEB-1754199, DEB

National Science Foundation, Award: IOS-2109716, IOS

Eastern Kentucky University, University Research Committee