Data from: Ultraconserved yet informative for species delimitation: UCEs resolve long-standing systematic enigma in Central European bees
Data files
Aug 19, 2020 version files 205.24 MB
- A_barbareae_cineraria-UCE-seqs.fasta (14.35 MB)
- A_bicolor_group-UCE-seqs.fasta (135.71 MB)
- A_carantonica_trimmerana_rosae-UCE-seqs.fasta (15.89 MB)
- A_dorsata_propinqua-UCE-seqs.fasta (10.40 MB)
- L_alpigenum_bavaricum_cupromicans-UCE-seqs.fasta (22.02 MB)
- N_goodeniana_succincta-UCE-seqs.fasta (6.88 MB)
Abstract
UCE library preparation
Whole-body DNA extractions were performed overnight in a proteinase K buffer at 56°C and purified using a Qiagen BioSprint 96 extraction robot following the manufacturer’s protocol. Extracts were quantified using a Qubit v3 (Thermo Fisher Scientific), and 50 ng of DNA per specimen was sonicated to a fragment length of 500 bp using a Bioruptor ultrasonicator (Diagenode). Two independent dual-indexed libraries, each containing 96 specimens, were constructed with a Kapa Hyper prep kit (Roche) using one fourth of the manufacturer’s recommended volumes (as described in Branstetter, Longino, Ward, & Faircloth, 2017). PCR amplifications were performed in the recommended volumes. PCR products were quantified using a Qubit v3, and the specimens in each row of a 96-well PCR plate were pooled equimolarly (i.e. for a total of 8 pools). Libraries were enriched for UCEs using the Hymenopteran v2 hybridization kit (UCE Hymenoptera 2.5Kv2 Principal/Full, myBaits, Arbor Biosciences). Each enrichment was performed on a single pool of 12 specimens using 500 ng of DNA. The enrichment protocol followed the manufacturer’s recommendations, with a hybridization step of 24 h at 65°C followed by a PCR amplification of 14 cycles. Pools were sequenced on a MiSeq using Illumina v3 kits (2 x 300 bp; Illumina).
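The equimolar pooling step can be illustrated with a short calculation. The sketch below is not the authors' procedure: it assumes the common approximation of 660 g/mol per base pair of double-stranded DNA, a single mean library fragment length for all specimens, and hypothetical function names, specimen labels, and concentrations.

```python
# Approximate molar mass of one base pair of double-stranded DNA (assumption).
AVG_BP_MASS = 660.0  # g/mol per bp


def library_molarity_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a Qubit-style mass concentration (ng/uL) to molarity (nM)."""
    if conc_ng_per_ul <= 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration and fragment length must be positive")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def equimolar_volumes_ul(concs_ng_per_ul: dict, mean_fragment_bp: float,
                         fmol_per_library: float) -> dict:
    """Volume (uL) of each library that contributes the same number of fmol.

    1 nM equals 1 fmol/uL, so volume = target fmol / molarity in nM.
    """
    return {
        name: fmol_per_library / library_molarity_nM(conc, mean_fragment_bp)
        for name, conc in concs_ng_per_ul.items()
    }


if __name__ == "__main__":
    # Hypothetical concentrations for three specimens in one plate row.
    row = {"specimen_A": 33.0, "specimen_B": 16.5, "specimen_C": 8.25}
    for name, vol in equimolar_volumes_ul(row, 500, 200).items():
        print(f"{name}: {vol:.2f} uL")
```

If all libraries in a pool share the same fragment length distribution, pooling equal moles is equivalent to pooling equal masses, so the fragment length only matters when it differs between libraries.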
Bioinformatic processing of UCE data
FASTQ reads were demultiplexed on the MiSeq, and data from all runs were merged and processed mainly using PHYLUCE tools (Faircloth, 2016). Raw data were cleaned with illumiprocessor (Faircloth, 2016), a wrapper around Trimmomatic (Bolger, Lohse, & Usadel, 2014). Clean reads were assembled with SPAdes v3.12.0 (Nurk et al., 2013) using the single-cell flag (“--sc”), the careful option (“--careful”) and a coverage cutoff of five (“--cov-cutoff”). The resulting contigs were mapped against the corresponding UCE reference file using LASTZ (Harris, 2007), and matching sequences were extracted and aligned by species complex using MAFFT (Katoh & Standley, 2013). Alignments were edge-trimmed using the PHYLUCE “seq-cap” program, a strategy recommended for closely related species (< 30-50 MYA) (Faircloth, 2016). Loci shared by fewer than 75% of the maximum number of specimens were filtered out. The remaining alignments were concatenated and saved in FASTA format. An additional filtering step removed specimens with more than 90% missing data.
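The two filtering thresholds described above can be sketched in a few lines of Python. This is an illustration only, not the PHYLUCE implementation. It assumes that "maximum number of specimens" means the number of specimens in the species complex, that gaps ("-"), "?" and "N" all count as missing data, and that each locus alignment is held as a mapping from specimen name to aligned sequence. All function names are hypothetical.

```python
from typing import Dict

# Characters treated as missing data (assumption, not stated in the source).
MISSING = set("-?Nn")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {record name: sequence}."""
    records: Dict[str, list] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def filter_loci_by_occupancy(loci: Dict[str, Dict[str, str]], n_specimens: int,
                             min_fraction: float = 0.75) -> Dict[str, Dict[str, str]]:
    """Keep loci present in at least min_fraction of n_specimens."""
    threshold = min_fraction * n_specimens
    return {locus: aln for locus, aln in loci.items() if len(aln) >= threshold}


def missing_fraction(seq: str) -> float:
    """Fraction of alignment columns that are missing for one specimen."""
    if not seq:
        return 1.0
    return sum(ch in MISSING for ch in seq) / len(seq)


def drop_sparse_specimens(alignment: Dict[str, str],
                          max_missing: float = 0.90) -> Dict[str, str]:
    """Remove specimens whose missing fraction exceeds max_missing."""
    return {name: seq for name, seq in alignment.items()
            if missing_fraction(seq) <= max_missing}


if __name__ == "__main__":
    # Toy concatenated alignment with hypothetical specimen names.
    toy = parse_fasta(">s1\nACGTACGTAC\n>s2\n--------AC\n>s3\n---------N\n")
    print(sorted(drop_sparse_specimens(toy)))
```

The same `parse_fasta` helper could be pointed at one of the concatenated files listed above, for example to report the missing fraction per specimen before deciding which ones to exclude.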