Data from: A multi-locus plastid phylogeny of the Aulonemia clade (Poaceae: Bambusoideae: Bambuseae: Arthrostylidiinae) reveals three new genera of bamboo
Data files
May 15, 2024 version files 796.90 KB
Abstract
Arthrostylidiinae (Poaceae: Bambusoideae), a subtribe of Neotropical woody bamboos with diverse morphology, comprises 200 species classified in 16 genera. Previous studies supported monophyly of the subtribe and recovered four major internal clades; however, some genera were found to be polyphyletic while others, like Aulonemia and Colanthelia, were either undersampled or not included. Aulonemia and Colanthelia are complex both in their taxonomy and morphology, and exhibit overlapping morphological characters. Prior morphological and molecular analyses suggested they share a close relationship, with Colanthelia emerging as monophyletic and either nested within Aulonemia or sister to it, but these studies sampled relatively few species of each genus. The aims of this study were to increase taxon sampling to test the monophyly of Aulonemia and Colanthelia, to investigate the relationships within the Aulonemia + Colanthelia clade, and to revise their classification as appropriate towards a natural classification of the Arthrostylidiinae. We present a multi-locus plastid phylogeny of the Arthrostylidiinae with emphasis on Aulonemia and Colanthelia. We used sequences of seven plastid markers (one coding: ndhF; six non-coding: trnC-rpoB, rps16-trnQ, trnT-trnL, rps16, trnD-trnT, and rpl16) from 67 taxa of Bambusoideae including all genera of Arthrostylidiinae. Phylogenetic trees were inferred using both Bayesian and maximum likelihood methods. Aulonemia was confirmed as polyphyletic and Colanthelia was not supported as monophyletic. The phylogenetic position of Myriocladus within Arthrostylidiinae is resolved for the first time. All species of Colanthelia were recovered within the clade containing most species of Aulonemia. Four species of Aulonemia (A. radiata, A. effusa, A. setosa, and A. setigera) grouped in other clades within the subtribe and these placements combined with morphological evidence support the establishment of three new genera: Quixiume, Stelanemia and Vianaea, to accommodate the four remarkable Aulonemia species. An updated key for the genera of the Arthrostylidiinae is provided, as well as taxonomic treatments for the three new genera, including the description of a new species in Stelanemia.
README: Alignment files containing DNA sequence data of Aulonemia bamboos and allies
https://doi.org/10.5061/dryad.stqjq2c9t
DNA sequence alignment files and scripts used for inference of a plastid phylogeny of the neotropical woody bamboos.
Description of the data and file structure
Nexus file containing DNA sequence alignments. The file includes character sets for the plastid loci rpL16-trnQ, trnC-rpoB, trnD-trnT, rpL16, ndhF, rpS16, and trnT-trnL.
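For readers who want to pull a single locus out of the concatenated alignment, the following is a minimal sketch in Python. It assumes the character sets are written in the usual NEXUS form (`charset name = start-end;`) with 1-based, inclusive coordinates; the coordinates in the example text are invented for illustration and are not taken from the deposited file. The function names are hypothetical, and range strides (e.g. `\3`) are not handled.

```python
import re

# Matches statements such as "charset ndhF = 1-2100;" (case-insensitive).
CHARSET_RE = re.compile(r"charset\s+([^\s=]+)\s*=\s*([^;]+);", re.IGNORECASE)


def parse_charsets(nexus_text):
    """Return {locus name: [(start, end), ...]} using 1-based inclusive ranges."""
    charsets = {}
    for name, spec in CHARSET_RE.findall(nexus_text):
        ranges = []
        for token in spec.split():
            start, _, end = token.partition("-")
            # A single position such as "42" becomes the range (42, 42).
            ranges.append((int(start), int(end or start)))
        charsets[name] = ranges
    return charsets


def slice_locus(sequence, ranges):
    """Extract the columns of one locus from an aligned sequence string."""
    return "".join(sequence[start - 1:end] for start, end in ranges)


# Invented coordinates, for illustration only.
example_block = """
begin sets;
  charset ndhF = 1-4;
  charset rpL16 = 5-10;
end;
"""

loci = parse_charsets(example_block)
rpl16_columns = slice_locus("AAAACCCCCCGG", loci["rpL16"])
```

A dedicated NEXUS parser is a safer choice for anything beyond a quick look, since this regular expression ignores comments and other block types.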
Code/Software
SangerContigPipeline.py
a Python v.3 script to automate alignment of raw Sanger-sequenced contigs extracted from ab1 electropherogram files. Contigs whose accession names match except for the last character (‘F’ for forward, ‘R’ for reverse) are aligned at successive relative positions, and the total number of base-pair differences is calculated for each position. The position that minimizes this difference is taken to be the optimal alignment. Two arguments control the search: ‘percent’ sets the relative position for the initial alignment, and ‘tolerance’ sets a lower limit for the difference calculation, which prevents trivial differences from being returned (e.g., when the contigs overlap only in their last ‘tolerance’ bases).
Input:
a folder containing ab1-format files each named for their accession but differing in the last character of their filename (before file extension).
Output:
a FASTA-formatted text file with the contigs aligned relative to each other at their optimally determined positions.
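The positioning search described for SangerContigPipeline.py can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the deposited script: it starts from plain sequence strings rather than ab1 files, it assumes the second contig has already been oriented to read in the same direction as the first, and it interprets ‘percent’ as the fraction of the first contig's length at which sliding begins. The function names are hypothetical.

```python
def count_differences(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def best_offset(first, second, percent=0.0, tolerance=10):
    """Slide `second` rightwards along `first` and return (offset, differences)
    for the placement with the fewest differences, or None if no placement
    overlaps by more than `tolerance` bases."""
    start = int(len(first) * percent)  # assumed meaning of 'percent'
    best = None
    for offset in range(start, len(first)):
        overlap = min(len(first) - offset, len(second))
        if overlap <= tolerance:
            break  # overlaps only get shorter from here on
        diff = count_differences(first[offset:offset + overlap], second[:overlap])
        if best is None or diff < best[1]:
            best = (offset, diff)
    return best


def align_pair(first, second, offset):
    """Pad both contigs with gaps so they can be written as an aligned pair."""
    width = max(len(first), offset + len(second))
    return first.ljust(width, "-"), ("-" * offset + second).ljust(width, "-")


# Toy contigs: the start of `example_second` matches the end of `example_first`.
example_first = "ACGTACGTTAGCCGATTGCA"
example_second = "CGATTGCAGGTTAACC"
placement = best_offset(example_first, example_second, tolerance=4)
aligned = align_pair(example_first, example_second, placement[0])
```

Because the raw difference count shrinks as the overlap shrinks, a very short overlap can look deceptively good; skipping overlaps of `tolerance` bases or fewer is what keeps such trivial placements from winning.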
SequenceMerge.py
a Python v.3 script to automate assembly of consensus sequences from aligned contigs.
Input:
a FASTA-formatted alignment file with exactly two contigs per accession, ordered sequentially with the forward end of the sequence (reverse contig) first. IUPAC ambiguity codes are introduced where the contigs' bases differ at a position; where one contig has ‘N’ or a gap (‘-’) but the other has a non-degenerate base code (‘A’, ‘C’, ‘G’, or ‘T’), the non-degenerate code is used in the consensus sequence.
Output:
an unaligned FASTA-formatted text file containing one consensus sequence for each accession.
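The consensus rules described for SequenceMerge.py can be sketched as below. This is a minimal illustration rather than the deposited script. The two rules stated above (ambiguity codes for conflicting bases; a plain base wins over ‘N’ or a gap) are implemented as described. The handling of the remaining cases is an assumption made here: columns where both contigs have a gap are dropped, a gap paired with any other code yields that code, and two degenerate codes are merged by taking the union of the bases they represent. The function names are hypothetical.

```python
# Standard IUPAC nucleotide codes keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}
EXPAND = {code: bases for bases, code in IUPAC.items()}
BASES = {"A", "C", "G", "T"}
MISSING = {"N", "-"}


def merge_column(x, y):
    """Consensus character for one alignment column ('' drops the column)."""
    x, y = x.upper(), y.upper()
    if x == y:
        return "" if x == "-" else x  # assumption: shared gaps are removed
    if x in MISSING and y in BASES:
        return y  # stated rule: the non-degenerate code wins
    if y in MISSING and x in BASES:
        return x
    if x == "-":
        return y  # assumption: a gap never overrides another code
    if y == "-":
        return x
    # Conflicting codes: report the ambiguity code covering both.
    everything = frozenset("ACGT")
    return IUPAC[EXPAND.get(x, everything) | EXPAND.get(y, everything)]


def consensus(contig_a, contig_b):
    """Merge two aligned contigs of equal length into one unaligned sequence."""
    if len(contig_a) != len(contig_b):
        raise ValueError("aligned contigs must have the same length")
    return "".join(merge_column(x, y) for x, y in zip(contig_a, contig_b))


merged = consensus("ACGTN-A-", "ACATGTR-")
```

In this toy pair, the G/A conflict in the third column becomes ‘R’, while the ‘N’ and the gap in the first contig are filled from the second contig.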