Dryad

Aligned DNA sequences of Vanilla

Cite this dataset

Ellestad, Paige (2022). Aligned DNA sequences of Vanilla [Dataset]. Dryad. https://doi.org/10.5061/dryad.m905qfv3q

Abstract

Premise

Although vanilla is one of the best-known spices, there is a limited understanding of its biology and genetics within Mexico, where its cultivation originated and where phenotypic variability is high. This study aims to augment our understanding of vanilla’s genetic resources by assessing species delimitation and genetic, geographic, and climatic variability within Mexican cultivated vanilla.

Methods

Nuclear and plastid DNA sequence data from 58 Mexican samples collected from three regions and 133 ex situ accessions were used to assess species monophyly using phylogenetic analyses and genetic distances. Intraspecific genetic variation was summarized by identifying haplotypes. Within the primarily cultivated species, V. planifolia, haplotype relationships were further verified using plastome and rRNA gene sequences. Climatic niche and haplotype composition were assessed across the landscape.

Key Results

Three species (Vanilla planifolia, V. pompona, and V. insignis) and 13 haplotypes were identified among Mexican vanilla. Within V. planifolia haplotypes, hard phylogenetic incongruences between plastid and nuclear sequences suggest past hybridization events. Eight haplotypes consisted exclusively of Mexican samples. The dominant V. planifolia haplotype occurred in all three regions as well as outside its country of origin. Haplotype richness was highest in the regions around Papantla and La Chinantla.

Conclusions

Long histories of regional cultivation support the consideration of endemic haplotypes as landraces shaped by adaptation to local conditions and/or hybridization. Results may aid further genomic investigations of vanilla’s genetic resources and ultimately support the preservation of genetic diversity within the economically important crop.

Methods

This dataset contains the DNA sequence alignments used for the Bayesian and maximum likelihood phylogenetic analyses and the Kimura 2-parameter (K2P) genetic distance analyses. All sequences have been aligned and trimmed.
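
For readers who want to see what a K2P distance is, the following is a minimal sketch of the standard Kimura 2-parameter formula, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions between two aligned sequences. It is not the software used in the study. The function name `k2p_distance` and the choice to skip any site that contains a gap or ambiguity code are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
VALID_BASES = {"A", "C", "G", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Assumption (not from the dataset): sites where either sequence has a
    gap or an ambiguity code are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in VALID_BASES or b not in VALID_BASES:
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition;
        # a change between the two classes is a transversion.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitions (P)
    q = transversions / compared  # proportion of transversions (Q)
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

Called on two rows of one of the alignments, for example `k2p_distance(row_1, row_2)` with hypothetical variables `row_1` and `row_2` holding the aligned strings, the function returns the estimated number of substitutions per site. Other tools may treat gaps differently, so values can differ slightly from published ones.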

Funding

Lush (United Kingdom)