Data from: A spectacular new species of Habenaria (Orchidaceae) from southern Brazilian Amazon

Cite this dataset

Aguiar Nogueira Batista, Joao; Engels, Mathias; Salgado, Thaynara (2020). Data from: A spectacular new species of Habenaria (Orchidaceae) from southern Brazilian Amazon [Dataset]. Dryad.


A new Habenaria species from the state of Mato Grosso is described and illustrated. Habenaria gracilisegmenta was discovered in campinarana sub-forest in northern Mato Grosso state, on the southern edge of the Brazilian Amazon. The species is distinguished by its slender habit, few, delicate flowers, and very long, thin lateral segments of the petals and lip; its morphological affinity with other Neotropical species is unclear. A molecular phylogenetic analysis revealed that the new species does not belong to any of the Neotropical subclades of the genus and instead constitutes an additional lineage. The species is one of two in the genus endemic to the Brazilian Amazon, and it is considered threatened because of the small number of known populations and its restricted distribution.


Taxon Sampling for Phylogenetic Analyses—The datasets for the phylogenetic analyses were essentially the same as those used by Batista et al. (2013) to infer the phylogenetic relationships of the New World Habenaria, although a few other species addressed in later works were included (Batista et al. 2014, Pedron et al. 2014), and most of the Old World taxa and duplicate terminals were excluded. The dataset consisted of the combined ITS and partial matK DNA sequences of 167 terminals of approximately 159 Neotropical Habenaria taxa, corresponding to 53% of the total number of species known from the Neotropical region (Batista et al. 2011a, 2011b), and four African Habenaria species. Gennaria diphylla Parlatore was used as the outgroup. Voucher information, geographic origins, and GenBank accession numbers can be found in Batista et al. (2013, 2014) and Pedron et al. (2014). GenBank accession numbers for the newly sequenced accession of H. gracilisegmenta are provided in Appendix 1.

Molecular Markers—Nucleotide sequences from one nuclear (ITS) and one plastid (matK) genome region were analyzed. The ITS regions consisted of the 3’ and 5’ ends of the 18S and 26S ribosomal RNA genes, respectively, internal transcribed spacers (ITS1 and ITS2), and the intervening 5.8S gene of the nuclear ribosomal multigene family. For the matK gene, we used an internal fragment of approximately 630 bp. DNA extraction, amplification, and sequencing were carried out following standard protocols, as described by Batista et al. (2013). Bidirectional sequence reads were obtained for all of the DNA regions, and the resulting sequences were edited and assembled using Staden Package software (Bonfield et al. 1995). The edited sequences were aligned using MUSCLE (Edgar 2004), and the resulting alignments were manually adjusted using MEGA6 software (Tamura et al. 2013).

Phylogenetic Analyses—The data were analyzed using maximum parsimony and Bayesian inference. Searches were performed only with the combined matrix, as no strongly supported incongruence was detected in previous analyses with the same datasets (Batista et al. 2013). Maximum parsimony (MP) analyses were performed using PAUP* version 4 (Swofford 2002) with Fitch parsimony (equal weights, unordered characters; Fitch 1971) as the optimality criterion. Each search consisted of 1000 replicates of random taxon addition, with branch swapping using the tree bisection and reconnection (TBR) algorithm, saving ≤ 10 trees per replicate to avoid extensive swapping on suboptimal islands. Internal support was evaluated by character bootstrapping (Felsenstein 1985) using 1000 replicates, simple addition, and TBR branch swapping, saving ≤ 10 trees per replicate. For bootstrap support levels, we considered bootstrap percentages (BS) of 50%–70% as weak, 71%–85% as moderate, and >85% as strong (Kress et al. 2002).
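
As a rough illustration of the optimality criterion named above, the following sketch scores a fixed tree under Fitch parsimony (equal weights, unordered states) and applies the bootstrap support categories. It is not the PAUP* implementation and performs no tree search; the toy tree, the four-taxon alignment, the function names, and the "unsupported" label for values below 50% are assumptions made for the example.

```python
def fitch_site(tree, states):
    """Return (candidate state set, change count) for one character.

    `tree` is a rooted binary tree written as nested 2-tuples of taxon
    names; `states` maps each taxon name to its state at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over all sites (equal weights)."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


def classify_bootstrap(bs):
    """Map a bootstrap percentage to the categories used in the text.

    Values below 50% get the label "unsupported", which is an assumption
    of this sketch; the text does not name that category.
    """
    if bs > 85:
        return "strong"
    if bs > 70:
        return "moderate"
    if bs >= 50:
        return "weak"
    return "unsupported"


# Hypothetical toy data, not taken from the dataset.
toy_tree = (("A", "B"), ("C", "D"))
toy_alignment = {"A": "ACGT", "B": "ACGA", "C": "ATGA", "D": "ATGT"}
toy_length = parsimony_length(toy_tree, toy_alignment)
```

A heuristic search such as the one described above would compute this length for many candidate trees (generated by random taxon addition and TBR rearrangements) and keep the shortest ones.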

Bayesian analysis was performed using MrBayes v. 3.1.2 (Ronquist et al. 2005) as implemented in the Cyberinfrastructure for Phylogenetic Research (CIPRES) Portal 2.0 (Miller et al. 2010), treating each DNA region as a separate partition. An evolutionary model for each DNA region was selected using the Akaike information criterion (AIC) in MrModeltest 2 (Nylander 2004), and the GTR + I + G model was selected for both datasets. The unlink command was used to unlink parameters across partitions. Each analysis consisted of two independent runs, each with four chains, for 10,000,000 generations, sampling one tree every 1000 generations. To improve chain swapping, the temperature parameter for heating the chains was lowered to 0.1 in the combined analysis. Convergence between the runs was evaluated using the average standard deviation of split frequencies (< 0.01) and was achieved after 2,125,000 generations. After discarding the first 2,500 trees (25%) as the burn-in, the remaining trees were used to assess topology and posterior probabilities (PPs) in a majority-rule consensus. PPs in Bayesian analysis are not directly comparable to bootstrap percentages, being generally much higher (Erixon et al. 2003). We therefore used criteria similar to a standard statistical test, considering groups with PPs > 0.95 as strongly supported, groups with PPs from 0.90 to 0.95 as moderately supported, and groups with PPs < 0.90 as weakly supported.
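
The sampling and burn-in figures above can be checked with a little arithmetic. The sketch below assumes the 2,500 discarded trees are counted per run and ignores the starting tree at generation 0; the function names are illustrative and are not MrBayes commands.

```python
def sampled_trees(generations, sample_every):
    """Number of trees sampled per run (starting tree not counted)."""
    return generations // sample_every


def burn_in_trees(n_trees, fraction):
    """Number of initial trees discarded as burn-in."""
    return int(n_trees * fraction)


def classify_pp(pp):
    """Map a posterior probability to the support categories in the text."""
    if pp > 0.95:
        return "strong"
    if pp >= 0.90:
        return "moderate"
    return "weak"


GENERATIONS = 10_000_000
SAMPLE_EVERY = 1_000
CONVERGED_AT = 2_125_000  # generation at which ASDSF fell below 0.01

n_trees = sampled_trees(GENERATIONS, SAMPLE_EVERY)   # trees per run
n_burn = burn_in_trees(n_trees, 0.25)                # trees discarded per run
burn_in_generations = n_burn * SAMPLE_EVERY          # generations covered

# The burn-in should extend past the point where the runs converged.
burn_in_covers_convergence = burn_in_generations >= CONVERGED_AT
```

Under these assumptions each run yields 10,000 sampled trees, the 25% burn-in removes 2,500 of them (2,500,000 generations), and the burn-in therefore ends after the reported convergence point.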