Data from: Gene flow across large distances in the cavity-nesting wasp Deuteragenia subintermedia in a central European forest
Data files
May 02, 2025 version files 80.43 GB
Barcodes.zip (57.64 KB)
D171_1.fastq.gz (1.33 GB)
D30_1.fastq.gz (1.35 GB)
D34_1.fastq.gz (1.35 GB)
D39_1.fastq.gz (1.59 GB)
D46_1.fastq.gz (1.77 GB)
D49_1.fastq.gz (1.38 GB)
D58_1.fastq.gz (1.21 GB)
D9_1.fastq.gz (1.46 GB)
D93_1.fastq.gz (1.64 GB)
DipPlate1.fastq.gz (33.97 GB)
DipPlate2.fastq.gz (33.38 GB)
Gene_flow_across_large_distances_in_the_cavity-nesting_wasp_Deuteragenia_subintermedia_in_a_central_European_forest.xlsx (43.80 KB)
README.md (9.36 KB)
Abstract
Habitat connectivity and the maintenance of gene flow between populations are central for long-term population persistence and are essential elements in conservation planning. However, data on dispersal ability and genetic population structure are lacking for almost all insect species. We here investigate whether forest localities in the temperate, central European Black Forest are connected by gene flow. For this, we used partial genome sequencing on specimens of the solitary cavity-nesting wasp Deuteragenia subintermedia (Hymenoptera, Pompilidae), a forest specialist that nests primarily in deadwood. We assumed that spatially uneven availability of standing deadwood has led to genetic substructuring. Contrary to our expectations, we did not find signs of population structure at either a regional or an individual level. Hence, for this solitary wasp species, dispersal seems not to be restricted across the Black Forest study sites (approximately 90 km distance), and none of the investigated environmental variables impacted genetic connectivity. This study was part of the ‘Conservation of forest biodiversity in multiple-use landscapes of Central Europe’ (ConFoBi) framework (Storch et al., 2020). The ConFoBi project investigates how structural retention forestry approaches in the southern Black Forest (Baden-Württemberg, Germany) influence several aspects of biodiversity.
Dataset DOI: 10.5061/dryad.ns1rn8q4f
Description of the data and file structure
This dataset is associated with Ruppert et al. 2025, Gene flow across large distances in the cavity-nesting wasp Deuteragenia subintermedia in a central European forest. Evolution and Ecology.
In this study, DNA was extracted from specimens of D. subintermedia, which were collected using trap nests on 134 study plots in the Black Forest. After trap retrieval in autumn, all collected nests were placed in a cooling chamber at 4°C until February. After this simulated winter diapause, the specimens hatched and were preserved. The first live *Deuteragenia* sp. specimen of every *Deuteragenia* nest in every trap tube was collected directly into 100% ethanol and stored at -20°C. All specimens were catalogued for abundance data. Prior to DNA extraction, the specimens were removed from the freezer only once, to identify them morphologically and to ensure that only males were included.
DNA of all specimens was extracted with the Qiagen DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) using a standardized protocol following the manufacturer's recommendations. To ensure that only specimens of D. subintermedia were analysed, we ran a PCR on the COI barcode using the degenerate Meyer primers dgLCO1490 and dgHCO2198 (Meyer, 2003, https://doi.org/10.1046/j.1095-8312.2003.00197.x). The PCR product was sent for Sanger sequencing to confirm the morphological species identifications. The DNA samples confirmed by barcoding were sent for RAD sequencing.
This study was part of the ‘Conservation of forest biodiversity in multiple-use landscapes of Central Europe’ (ConFoBi) framework (Storch et al., 2020, https://doi.org/10.1002/ece3.6003). The ConFoBi project investigates how structural retention forestry approaches in the southern Black Forest (Baden-Württemberg, Germany) influence several aspects of biodiversity. The study area is a continuous cover forest ranging over 5000 km², with 75% of the landscape covered by forest. The forest is primarily coniferous, with planted Norway spruce (Picea abies L.), which would not naturally occur in high density, and native silver fir (Abies alba Mill.).
The genetic data were investigated with regard to environmental distance (based on a selection of environmental variables characterising the forest habitat) and to genetic and geographic distance, and with a hierarchical analysis of molecular variance (AMOVA), Moran's I, Discriminant Analysis of Principal Components, and STRUCTURE analysis.
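As an illustration of the geographic-distance input, the following is a minimal sketch of how pairwise distances between plots could be computed from latitude and longitude with the haversine formula. It is not the procedure used in the study, which is not described here. The function names, the Earth radius constant, and the example coordinates are assumptions introduced for illustration only.

```python
import math
from itertools import combinations

# Mean Earth radius in km (an assumption of this sketch; a spherical Earth model).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pairwise_distances(plots):
    """Map each unordered pair of plot IDs to its distance in km.

    `plots` is a dict of plot ID -> (latitude, longitude).
    """
    return {
        (a, b): haversine_km(*plots[a], *plots[b])
        for a, b in combinations(sorted(plots), 2)
    }


# Hypothetical coordinates, NOT taken from the dataset.
example_plots = {1: (47.90, 8.10), 2: (48.00, 8.10), 3: (47.90, 8.30)}
distances = pairwise_distances(example_plots)
```

With real data, the dictionary would be filled from the Plot ID, Latitude and Longitude columns of the Specimen Collection sheet.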
The relationship between abundance of D. subintermedia per plot and the same environmental variables as for the environmental distance was analysed using a generalized linear model (GLM) with a negative binomial distribution.
No signs of population structure either on a regional or an individual level were found.
This dataset contains all raw data necessary to reproduce the analyses described in this study.
Files and variables
1) The Excel file (Gene flow across large distances in the cavity-nesting wasp Deuteragenia subintermedia in a central European forest.xlsx) contains 4 spreadsheets: Specimen Collection, Environmental + Abundance Plots, Barcode ID, IndexList for RADdata.
Specimen Collection includes the unique identifiers of specimens and research plots, which recur in the other data, the sampling locations in latitude and longitude, and the location of the trap on the research plot.
Environmental + Abundance Plots includes all environmental data used and the abundance data of Deuteragenia subintermedia for the sampling locations, identified by the unique PlotID. All abundance and environmental analyses were performed with this dataset. The environmental data include the average elevation of the study plots (m a.s.l., derived from a digital terrain model provided by the State Agency of Spatial Information and Rural Development of Baden-Württemberg, lgl-bw.de, 2005), the mean diameter at breast height (DBH) of trees with DBH > 7 cm, and the sum of basal area of trees, calculated from data of a full forest inventory in 2017. They also include the volume of lying and standing deadwood; deadwood sampling was conducted along V-transects on each plot, measuring all deadwood structures > 7 cm DBH and with a minimum height of 1.3 m at the upper slope. The percentage of coniferous trees on a plot, calculated from the same forest inventory data, is also included. Derived from remote sensing data are the canopy closure, calculated as the percentage of plot surface lower than 3 m above ground, and the effective number of layers, derived from Terrestrial Laser Scanner LiDAR data, which gives the number of 1-meter-thick layers of foliage weighted by how much they are filled (Ehbrecht et al., 2016, https://doi.org/10.1016/j.foreco.2016.09.003). Finally, the forest cover in the surrounding 10 km² is included (1000 ha/10 km² moving window on Landsat land cover data by Landesanstalt für Umwelt Baden-Württemberg, LUBW, 2010).
Barcode ID: A lookup table for the ID used during COI barcoding to identify specimens. Necessary for the provided barcode data files.
IndexList for RADdata: Index list of the RADseq data, necessary for demultiplexing. Note that some samples that were in excess of the plates are not included, because they are already available as demultiplexed files.
Variable names and description:
Plot ID: Plot number (between 1 and 188)
Abundance: Total number of individuals of Deuteragenia subintermedia collected from trap nests at the SE and NW corners of the plots; for details of collection see Rappa et al. 2023 (https://doi.org/10.1016/j.foreco.2022.120709)
average_elevation: average plot altitude, in meters a.s.l.
DBHMean: plot-scale mean of tree Diameter at Breast Height (in cm)
basal_area: basal area per square meter (square meters occupied by stems divided by total square meters of the plot)
lying_deadwood_volume: volume of all forms of lying deadwood per plot, obtained by using a V-shaped line transect through the plot (Storch et al. 2020, https://doi.org/10.1002/ece3.6003)
volume_standing_deadwood: volume of all standing deadwood, obtained by using a V-shaped line transect through the plot (Storch et al. 2020, https://doi.org/10.1002/ece3.6003)
fores_cover_1000: forest cover percentages at moving window 1000ha/10sqkm
%conifer: share of conifers at the plot level (calculated from aggregated forest inventory), calculated as percentage of the total basal area
canopy_closure: percentage of light not reaching the 3-metre level from above (derived from UAV data)
effective_numberof_layers: Effective Number of Layers (ENL, Ehbrecht et al. 2016, https://doi.org/10.1016/j.foreco.2016.09.003, obtained from Terrestrial Laser Scanner LiDAR data); it gives the number of 1-meter-thick layers of foliage weighted by how much they are filled; it increases with stand height and with how homogeneously and densely the layers are filled.
Specimen ID: Unique identifier for collected specimen
Trap Location: Identifier for trap nests for location south-east (SE) or north-west (NW)
Latitude: North–south angular location of collection plot
Longitude: East–west position of collection plot
BarcodeID: Identifier for COI barcoding data
Index: Index for Radseq data
2) Metadata_for_Gene_flow_across_large_distances_in_the_cavity-nesting_wasp_Deuteragenia_subintermedia_in_a_central_European_forest.txt: a shortened overview of the variables used in the xlsx file.
3) Barcodes.zip: archive of the COI barcodes as FASTA files.
4) D30_1.fastq.gz, D34_1.fastq.gz, D39_1.fastq.gz, D46_1.fastq.gz, D49_1.fastq.gz, D58_1.fastq.gz, D93_1.fastq.gz, D171_1.fastq.gz, D9_1.fastq.gz, DipPlate2.fastq.gz, DipPlate1.fastq.gz
These files contain the raw RAD data, which are necessary to replicate all genetic analyses. The individual files (e.g. D30_1.fastq.gz) are already demultiplexed. The two plates DipPlate1 and DipPlate2 have to be demultiplexed before use; the identifiers are provided in the xlsx file.
The files D30_1.fastq.gz, D34_1.fastq.gz, D39_1.fastq.gz, D46_1.fastq.gz, D49_1.fastq.gz, D58_1.fastq.gz, D93_1.fastq.gz, D171_1.fastq.gz, D9_1.fastq.gz contain the raw data for one sample each and are named by the Specimen ID of the contained sample, which is the unique identifier for a collected specimen throughout the whole analysis. The same Specimen ID can be found in the Excel file.
The files DipPlate2.fastq.gz and DipPlate1.fastq.gz contain the raw data from two sample plates, each of which contains several samples. In the file names, "Dip" refers to the genus Dipogon, and "Plate1" and "Plate2" refer to the sample plate. The Excel file contains the index list needed to demultiplex these raw data, that is, to split the reads and label them by their SpecimenID; the index list applicable to each plate is labelled with the name of that plate.
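To make the demultiplexing step concrete, the following is a minimal sketch of splitting reads by index. It assumes that each index is an inline sequence at the start of the read and that matching is exact; neither assumption is stated in this description, so check the index list and the sequencing setup before relying on it. The function names, index sequences and specimen labels are hypothetical placeholders.

```python
import gzip
import io


def open_fastq(path):
    """Open a gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt")


def read_fastq(handle):
    """Yield (header, sequence, separator, quality) records from a FASTQ handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def demultiplex(handle, index_to_specimen):
    """Group reads by an exact match of an inline index at the read start.

    Returns a dict of specimen ID -> list of records with the index trimmed
    from sequence and quality. Reads without a match go to "unassigned".
    """
    lengths = sorted({len(i) for i in index_to_specimen}, reverse=True)
    out = {}
    for header, seq, plus, qual in read_fastq(handle):
        specimen = None
        for n in lengths:
            specimen = index_to_specimen.get(seq[:n])
            if specimen is not None:
                break
        if specimen is None:
            out.setdefault("unassigned", []).append((header, seq, plus, qual))
            continue
        out.setdefault(specimen, []).append((header, seq[n:], plus, qual[n:]))
    return out


# Tiny in-memory example with made-up reads and made-up indices.
demo_fastq = io.StringIO(
    "@read1\nACGTTTGCA\n+\nIIIIIIIII\n"
    "@read2\nTGCAGGGTT\n+\nIIIIIIIII\n"
    "@read3\nNNNNAAAAA\n+\nIIIIIIIII\n"
)
demo_index = {"ACGT": "D_example_1", "TGCA": "D_example_2"}
demo_result = demultiplex(demo_fastq, demo_index)
```

This sketch keeps all reads in memory, which is only suitable for small test inputs. The plate files are over 33 GB each, so in practice reads would be streamed to one output file per specimen, or a dedicated tool such as `process_radtags` from Stacks would be used; which tool the study used is not stated here.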
