Data from: Orchidinae-205: A new genome-wide custom bait set for studying the evolution, systematics, and trade of terrestrial orchids
Data files
Aug 12, 2024 version files 336.14 MB
- orchidinae-205.zip
- README.md
Abstract
Terrestrial orchids are a group of genetically understudied, yet culturally and economically important plants. The Orchidinae tribe contains many species that produce edible tubers that are used for the production of traditional delicacies collectively called ‘salep’. Overexploitation of wild orchids in the Eastern Mediterranean and Western Asia threatens to drive many of these species to extinction, but cost-effective tools for monitoring their trade are currently lacking. Here we present a custom bait kit for target enrichment and sequencing of 205 novel genetic markers that are tailored to phylogenomic applications in Orchidinae s.l. A subset of 31 markers captures genes putatively involved in the production of glucomannan, a water-soluble polysaccharide that gives salep its distinctive properties. We tested the kit on 73 taxa native to the area, demonstrating universally high locus recovery, irrespective of species identity, and a total sequence length that exceeds that obtained with alternative kits currently available. Phylogenetic inference with concatenation and coalescent approaches was robust and showed high levels of support for most clades, including some that were previously unresolved. Resolution for hybridizing and recently radiated lineages remains difficult, but could be further improved by analysing multiple haplotypes and the non-exonic sequences captured by our kit, with the promise to shed new light on the evolution of enigmatic taxa with a complex speciation history. Offering a step-up from traditional barcoding and universal markers, the genome-wide custom loci targeted by Orchidinae-205 are a valuable new resource to study the evolution, systematics and trade of terrestrial orchids.
README: Data from: Orchidinae-205: A new genome-wide custom bait set for studying the evolution, systematics, and trade of terrestrial orchids
https://doi.org/10.5061/dryad.sj3tx96bn
This dataset contains the files used to generate the Orchidinae-205 custom bait set for targeted capture of 205 nuclear genomic loci tailored to the Orchidinae subtribe of the Orchidaceae family. It also contains the exon and intron sequences that were recovered for 82 reference samples in the subtribe using the Orchidinae-205 probes.
Description of the data and file structure
The dataset consists of the following directories:
TRANCRIPTOMES: Filtered transcriptome assemblies for 23 Orchidoideae species, generated with Trinity. Peptide sequences (.pep files) correspond to the coding sequences (.cds files) of the putative open reading frames identified by TransDecoder. Fourteen of these transcriptomes were selected for orthologue inference and probe design.
TARGETS: Transcripts of selected orthogroups in the 14 Orchidinae species used for probe design, identified with OrthoFinder. These transcripts were used as the reference file for read assembly to recover target sequences after targeted capture and sequencing.
CAPTURE: Exon and intron sequences assembled for 82 Orchidinae samples following targeted capture and sequencing, generated with HybPiper. The exons were used to generate alignments and phylogenetic trees.
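The relationship between the .cds and .pep files described above can be sanity-checked after unpacking the archive. The following is a minimal sketch, not part of the dataset: it assumes the files are plain FASTA, that matching records share the same identifier (the first token of the header line), and that the standard nuclear genetic code applies. The function names `parse_fasta`, `translate`, and `mismatched_ids` are illustrative only, and the demo records are made up.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear genes).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {record id: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the id.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def translate(cds):
    """Translate a coding sequence; unknown or ambiguous codons become 'X'."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def mismatched_ids(cds_records, pep_records):
    """Return ids whose CDS translation differs from (or lacks) a peptide record."""
    bad = []
    for rec_id, cds in cds_records.items():
        pep = pep_records.get(rec_id)
        # Trailing stop symbols are stripped so complete and partial ORFs compare equal.
        if pep is None or translate(cds).rstrip("*") != pep.upper().rstrip("*"):
            bad.append(rec_id)
    return bad


if __name__ == "__main__":
    # Made-up records for illustration; real input would be read from the unpacked files.
    cds = parse_fasta(">t1 demo\nATGGCTTAA\n>t2 demo\nATGAAATGG\n")
    pep = parse_fasta(">t1 demo\nMA*\n>t2 demo\nMKC\n")
    print(mismatched_ids(cds, pep))  # prints ['t2']
```

To run this on the dataset, read a .cds file and its matching .pep file with `open(path).read()` and pass the parsed records to `mismatched_ids`; an empty list would indicate that every peptide matches the translation of its coding sequence under the assumptions above.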
Sharing/Access information
The raw reads used to assemble the transcriptomes were retrieved from the Sequence Read Archive under the respective project numbers listed in their file names.
The raw reads used to assemble the exon sequences are deposited in the Sequence Read Archive under project number PRJNA1053613.
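As a starting point for locating these reads programmatically, the sketch below extracts a BioProject accession from a file name and builds an NCBI E-utilities ESearch URL for the SRA database. It is an illustration under stated assumptions: the file name in the usage example is hypothetical, the regular expression assumes accessions of the form PRJNA, PRJEB, or PRJDB followed by digits, and the `[BioProject]` search field is assumed to be the appropriate qualifier for this query. No network request is made.

```python
import re
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Assumed accession pattern: PRJNA..., PRJEB..., PRJDB... followed by digits.
BIOPROJECT_RE = re.compile(r"PRJ[EDN][A-Z]\d+")


def bioproject_from_filename(filename):
    """Return the first BioProject accession found in a file name, or None."""
    match = BIOPROJECT_RE.search(filename)
    return match.group(0) if match else None


def sra_search_url(bioproject, retmax=500):
    """Build an ESearch URL listing SRA records linked to a BioProject."""
    params = {
        "db": "sra",
        "term": f"{bioproject}[BioProject]",  # assumed search field qualifier
        "retmax": retmax,
    }
    return f"{ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical transcriptome file name, used only to show the pattern match.
    print(bioproject_from_filename("example_species_PRJNA000001.cds"))
    # Project number stated above for the targeted-capture reads.
    print(sra_search_url("PRJNA1053613"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the returned identifiers would then need to be resolved to run accessions before downloading reads.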