Data from: A new terrestrial species of Peperomia (Piperaceae) from the Cordillera Oriental, Colombia, and its phylogenetic affinity
Data files
Apr 09, 2026 version; files total 3.69 MB
- paper1_manual.fasta (705.15 KB)
- paper1_manual.fasta.contree (12.31 KB)
- paper1_pasta.aln (1.71 MB)
- paper1_pasta.aln.cln0.contree (12.31 KB)
- paper1_pasta.aln.cln05 (619.93 KB)
- paper1_pasta.aln.cln05.contree (12.31 KB)
- paper1_pasta.aln.cln10 (597.12 KB)
- paper1_pasta.aln.cln10.contree (12.32 KB)
- README.md (10.46 KB)
Abstract
Peperomia is among the most species-rich angiosperm genera, yet several regions in the Neotropics, particularly in Andean countries, remain insufficiently explored and continue to harbor undescribed species. We describe and illustrate Peperomia clandestina, a new species from the Cordillera Oriental of the Colombian Andes. Although it is morphologically similar to Peperomia ventricosicarpa, P. clandestina is readily set apart by its much sparser branching (simple to 2–3-branched rather than 4–9-branched), consistently shorter petioles (0.4–1.1 vs. 1.5–2.4 cm), abaxial leaf blade punctation that is yellow rather than black, a higher number of secondary veins per side of the leaf (3–6 vs. 1–2), longer spikes that curve apically (6.1–7.5 vs. 3–5 cm, erect), and floral bracts that are yellow-dotted or inconspicuous rather than distinctly black-dotted. Three plastid Sanger loci (trnK intron, matK, and trnK–psbA spacer) were used to place the new species in the Peperomia phylogeny. Across multiple alignment-trimming strategies, the 5% occupancy alignment yielded the highest nodal support and was selected for interpretation. Phylogenetic analyses placed P. clandestina in the molecularly defined but morphologically heterogeneous Clade E, as sister to the set of taxa previously sampled in that clade. This finding underscores the need for an integrative approach to Peperomia systematics, combining expanded molecular datasets with additional lines of evidence to better understand relationships within Clade E and other poorly resolved lineages.
Associated manuscript:
Jiménez, J. E., T. H. Murphy, B. Villanueva-Tamayo & L. C. Majure.
A New Terrestrial Species of Peperomia (Piperaceae) from the Cordillera Oriental, Colombia, and its Phylogenetic Affinity
Dryad DOI: https://doi.org/10.5061/dryad.p8cz8wb4x
Principal Investigator Contact Information
Name: José Esteban Jiménez
Institution: University of Florida / Florida Museum of Natural History
Email: gaiadendron.jej@gmail.com
Alternate Contact Information
Name: Lucas C. Majure
Institution: Florida Museum of Natural History, University of Florida
Email: lmajure@floridamuseum.ufl.edu
Dataset Overview
This dataset contains the alignment and tree files generated for a Sanger-based plastid phylogeny used to place the newly described species Peperomia clandestina in the context of the taxon-rich Peperomia phylogeny of Frenzke et al. (2015). The plastid region comprises the trnK intron, matK gene, and trnK–psbA intergenic spacer. Sequences were aligned and trimmed through a standard phylogenetic workflow (MAFFT, PASTA, and trimming with Phyx or by hand), and maximum-likelihood phylogenies were inferred in IQ-TREE2. The files correspond to the final trimmed alignments and resulting tree topologies used in the manuscript to evaluate the phylogenetic placement of P. clandestina and to compare it with the 194 accessions from Frenzke et al. (2015).
The dataset includes:
4 multiple-sequence alignments corresponding to different trimming strategies applied to the concatenated plastid region (5% occupancy, 10% occupancy, manual trimming, and gaps/ambiguity-only trimming).
4 maximum-likelihood phylogenetic trees inferred from each of these alignments with IQ-TREE2, including branch support values (nonparametric bootstrap).
The phylogenies were used to evaluate the placement of Peperomia clandestina within Clade E of Peperomia and to compare the resulting topology to that of Frenzke et al. (2015).
Dates of Data Collection
Field collection of P. clandestina material: 2023
DNA extraction and sequencing (genome skimming): 2023–2024
Alignment, trimming, and phylogenetic analyses: 2024–2025
Description of the Data and File Structure
Alignment Files (4 files)
All four alignments are concatenated plastid datasets including trnK intron + matK + trnK–psbA spacer for 195 sequences (190 Peperomia taxa + 4 outgroups + P. clandestina). Sequences from Frenzke et al. (2015) were downloaded from GenBank and combined with newly generated data for P. clandestina.
Alignments were initially produced with MAFFT E-INS-i, refined with PASTA, visually checked in AliView, and then trimmed using different strategies implemented in Phyx (pxclsq) or manual curation.
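The summary figures reported for the alignments below (number of sequences, alignment length, missing data, variable sites) can be checked with a few lines of Python. The following is a minimal sketch, not the authors' script: it assumes the file is in FASTA format, treats `-`, `?`, and `N` as missing, and counts any other IUPAC code as a distinct state, so its numbers may differ slightly from those reported by IQ-TREE2. The function names and the toy data are hypothetical.

```python
# Characters counted as missing data (an assumption for this sketch).
MISSING = set("-?N")


def read_fasta(text):
    """Parse FASTA text into a dict of name -> uppercase sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def alignment_summary(records):
    """Return sequence count, length, missing fraction, and variable sites."""
    lengths = {len(s) for s in records.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; not an alignment")
    length = lengths.pop()
    n = len(records)
    missing = sum(ch in MISSING for s in records.values() for ch in s)
    variable = 0
    for i in range(length):
        # A site is variable if it shows more than one non-missing state.
        states = {s[i] for s in records.values()} - MISSING
        if len(states) > 1:
            variable += 1
    return {
        "n_sequences": n,
        "length": length,
        "missing_fraction": missing / (n * length),
        "variable_sites": variable,
    }


# Toy alignment standing in for a deposited file.
toy = """>taxon_A
ACGT-A
>taxon_B
ACGTNA
>taxon_C
ATGT-C
"""
summary = alignment_summary(read_fasta(toy))
print(summary)
```

To apply it to a deposited file, read the file contents into a string and pass them to `read_fasta`, provided the file is in fact FASTA.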
- paper1_pasta.aln.cln05
Content: Concatenated plastid alignment trimmed with a 5% occupancy threshold using pxclsq (Phyx). Sites with < 5% non-ambiguous, non-gap characters were removed.
Purpose: This alignment was selected as the primary dataset for phylogenetic interpretation in the manuscript because it yielded the highest average node support while preserving a topology congruent with Frenzke et al. (2015).
Format: Multiple-sequence alignment (FASTA/PHYLIP/NEXUS).
Approximate characteristics (as reported in the manuscript):
195 sequences
Alignment length: ~3144 bp
Missing data: ~14.1%
1635 variable sites; 1172 parsimony-informative sites
- paper1_pasta.aln.cln10
Content: Concatenated plastid alignment trimmed with a 10% occupancy threshold using pxclsq. Sites with < 10% non-ambiguous, non-gap characters were removed.
Purpose: Sensitivity analysis to evaluate the effect of a stricter trimming threshold on tree topology and support values. This alignment produced a topology that deviates more noticeably from that of Frenzke et al. (2015).
Format: Multiple-sequence alignment (FASTA/PHYLIP/NEXUS).
- paper1_manual.fasta
Content: Concatenated plastid alignment that was manually trimmed. Problematic or ambiguously aligned regions were removed based on visual inspection in AliView (e.g., regions with alignment uncertainty or long stretches of indels).
Purpose: To check whether manual curation of ambiguous regions, guided by visual inspection, would alter the topology or support relative to automated trimming approaches.
Format: Multiple-sequence alignment (FASTA/PHYLIP/NEXUS).
- paper1_pasta.aln
Content: Concatenated plastid alignment where only sites composed entirely of gaps and/or ambiguous characters were removed. No occupancy threshold was applied beyond this minimal filter.
Purpose: Represents the least aggressively trimmed dataset, preserving almost all informative variation. Used to compare with other alignments and assess the robustness of clade recovery under minimal trimming.
Format: Multiple-sequence alignment (FASTA/PHYLIP/NEXUS).
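The occupancy-based trimming described for the alignments above can be illustrated with a short Python sketch. This is not the Phyx (pxclsq) implementation; it only mirrors the rule as stated here: a column is dropped when the fraction of non-gap, non-ambiguous characters falls below the threshold, and columns with no data at all are always dropped. The set of characters treated as missing (`-`, `?`, `N`), the function names, and the toy data are assumptions made for illustration.

```python
# Characters counted as gaps or ambiguity (an assumption for this sketch).
MISSING = set("-?N")


def column_occupancy(seqs, i):
    """Fraction of sequences with an informative character in column i."""
    present = sum(s[i].upper() not in MISSING for s in seqs)
    return present / len(seqs)


def trim_by_occupancy(records, threshold):
    """Keep columns whose occupancy is > 0 and >= threshold.

    threshold=0.05 corresponds to the 5% strategy, threshold=0.10 to the
    10% strategy, and threshold=0 to the minimal filter that removes only
    columns made up entirely of gaps or ambiguous characters.
    """
    names = list(records)
    seqs = [records[n] for n in names]
    length = len(seqs[0])
    keep = []
    for i in range(length):
        occ = column_occupancy(seqs, i)
        if occ > 0 and occ >= threshold:
            keep.append(i)
    trimmed = {n: "".join(records[n][i] for i in keep) for n in names}
    return trimmed, keep


# Toy alignment: column 2 has 25% occupancy, column 4 has none.
toy = {
    "taxon_A": "ACGT-",
    "taxon_B": "A--T-",
    "taxon_C": "AN-TN",
    "taxon_D": "AC-G-",
}
trimmed_min, keep_min = trim_by_occupancy(toy, 0.0)
trimmed_half, keep_half = trim_by_occupancy(toy, 0.5)
print(keep_min, keep_half)
```

With only four toy sequences, a 50% threshold is used here to make the effect visible; with 195 sequences, a 5% threshold retains a column as soon as about 10 sequences carry data in it.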
Tree Files (4 files)
All four trees were inferred with IQ-TREE2 using maximum likelihood. For each alignment described above, IQ-TREE2:
Used ModelFinder (-m TEST) to select the best-fit substitution model based on Bayesian Information Criterion (BIC).
Estimated branch support with 100 nonparametric bootstrap replicates (-b 100).
Output a Newick tree with branch lengths and bootstrap support values at internal nodes.
All analyses were run on the HiPerGator high-performance computing cluster (University of Florida Research Computing).
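The settings listed above can be assembled into an IQ-TREE2 call per alignment. The sketch below only builds and prints the command lines; it does not run them. The flags `-s`, `-m TEST`, and `-b 100` are those named in this README, while the executable name `iqtree2` and the absence of any further options (threads, seed, output prefix) are assumptions, since the exact commands used on HiPerGator are not given here.

```python
import shlex


def iqtree_command(alignment, bootstrap_replicates=100, executable="iqtree2"):
    """Build an IQ-TREE2 argument list using the settings named in the README."""
    return [
        executable,
        "-s", alignment,                   # input alignment
        "-m", "TEST",                      # model selection, then tree search
        "-b", str(bootstrap_replicates),   # nonparametric bootstrap replicates
    ]


# Deposited alignments whose tree file name is the alignment name + ".contree".
alignments = [
    "paper1_pasta.aln.cln05",
    "paper1_pasta.aln.cln10",
    "paper1_manual.fasta",
]
commands = [iqtree_command(a) for a in alignments]
for cmd in commands:
    print(shlex.join(cmd))
```

When no output prefix is set, IQ-TREE2 names its outputs after the alignment file, which is consistent with tree files such as `paper1_pasta.aln.cln05.contree` in this deposit.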
- paper1_pasta.aln.cln05.contree
Source alignment: paper1_pasta.aln.cln05
Content: Maximum-likelihood tree for the 5% occupancy dataset, including P. clandestina, 190 Peperomia taxa, and four outgroup taxa.
Key features:
Selected substitution model: K3Pu+F+I+G4 (as reported for the main analysis).
Mean bootstrap support: ~79.8%; median: ~89%.
Peperomia clandestina recovered in Clade E, as a strongly supported (BS ≈ 92%) lineage sister to previously sampled Clade E taxa.
Use in manuscript: This is the primary phylogeny discussed and illustrated (e.g., Fig. 1 in the manuscript).
- paper1_pasta.aln.cln10.contree
Source alignment: paper1_pasta.aln.cln10
Content: Maximum-likelihood tree inferred using the 10% occupancy alignment.
Key features:
Used to evaluate the effect of stricter trimming on topology.
Shows more pronounced topological differences relative to Frenzke et al. (2015), particularly in deeper relationships among major clades.
Purpose: Comparative tree to assess robustness and sensitivity of the phylogenetic results to trimming thresholds.
- paper1_manual.fasta.contree
Source alignment: paper1_manual.fasta
Content: Maximum-likelihood tree inferred from the manually trimmed plastid alignment.
Key features:
Topology largely congruent with the 5% occupancy tree and with Frenzke et al. (2015).
Recovers the six major clades (A–F) with similar relationships and places P. clandestina in Clade E.
Purpose: To test whether manual removal of dubious regions yields results comparable to automated trimming, supporting the stability of P. clandestina’s placement.
- paper1_pasta.aln.cln0.contree
Source alignment: paper1_pasta.aln (the minimally trimmed alignment described above)
Content: Maximum-likelihood tree inferred from the minimally trimmed alignment (only all-gap/ambiguous sites removed).
Key features:
Generally similar topology to the 5% occupancy and manually trimmed trees.
Used to verify that more inclusive character sampling does not substantially alter the recovery of major clades or the placement of P. clandestina.
Purpose: Provides an additional robustness check, representing a “maximally inclusive” dataset with minimal trimming.
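Summary statistics such as the mean and median bootstrap support reported for the trees above can be recomputed from a Newick file. The following is a rough sketch under stated assumptions: support values are written as numeric labels directly after the closing parenthesis of each internal node, and taxon names contain no parentheses. The regular expression, function name, and toy tree are hypothetical and are not taken from the deposited files.

```python
import re
from statistics import mean, median

# A number that immediately follows ")" is read as a node support value.
SUPPORT = re.compile(r"\)(\d+(?:\.\d+)?)")


def support_values(newick):
    """Extract internal-node support values from a Newick string."""
    return [float(v) for v in SUPPORT.findall(newick)]


# Toy tree with three supported internal nodes.
toy = "((A:0.1,B:0.2)100:0.05,(C:0.1,(D:0.1,E:0.1)60:0.02)92:0.03,F:0.3);"
values = support_values(toy)
mean_support = mean(values)
median_support = median(values)
print(values, mean_support, median_support)
```

For a real tree, read the `.contree` file into a string first. A dedicated tree library would be more robust than a regular expression when taxon labels are quoted or contain unusual characters.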
Methodology (Summary)
Taxon sampling and sequence origin
190 Peperomia taxa + 4 outgroup taxa from Frenzke et al. (2015), downloaded from GenBank (plastid trnK intron, matK, trnK–psbA spacer).
Newly generated plastid sequence for Peperomia clandestina based on a deep genome-skimming dataset (Illumina NovaSeq X) from silica-gel-dried tissue.
Genome skimming and locus extraction
DNA extracted using a modified CTAB protocol (Doyle & Doyle 1987).
Raw Illumina reads cleaned and filtered with the Captus v1.5.9 pipeline (clean function based on BBDuk in BBTools).
De novo assembly with MEGAHIT v1.2.9; read mapping and depth estimation with Salmon.
Plastid loci (trnK intron, matK, trnK–psbA) extracted using BLAT, with Peperomia hadrostachya (KR003042) as a reference.
Alignment and trimming
Initial alignment with MAFFT E-INS-i.
Further refinement using PASTA to handle difficult-to-align regions.
Visual inspection and minor adjustments in AliView.
Trimming strategies:
5% and 10% occupancy thresholds using Phyx (pxclsq).
Manual trimming by visual inspection.
Minimal trimming, removing only all-gap/ambiguous sites.
Phylogenetic inference
Maximum-likelihood analyses performed in IQ-TREE2:
Model selection with ModelFinder (-m TEST, BIC).
Node support assessed with 100 nonparametric bootstrap replicates (-b 100).
All analyses were executed on the HiPerGator computing cluster at the University of Florida.
Trees visualized and edited in FigTree.
Sharing / Access Information
License: This dataset is released under CC0 1.0 Universal (CC0 1.0) Public Domain Dedication, unless otherwise stated in the Dryad record.
Recommended Citation
Jiménez, J. E., T. H. Murphy, B. Villanueva-Tamayo & L. C. Majure. A New Terrestrial Species of Peperomia (Piperaceae) from the Cordillera Oriental, Colombia, and its Phylogenetic Affinity (in review).
Jiménez, J. E. 2025. Phylogenetic data for Peperomia clandestina (Piperaceae) from the Cordillera Oriental, Colombia. Dryad Digital Repository. https://doi.org/10.5061/dryad.p8cz8wb4x
