The significance of genetic distance and nest occupation on the worker-worker similarity of gut bacterial microbiome and cuticular hydrocarbon profile in a sweat bee
Data files
Jun 19, 2025 version files (9.92 KB total)
- Halictus_scabiosae_genotypes_dataset.csv (8.31 KB)
- README.md (1.61 KB)
Abstract
The cuticular hydrocarbon (CHC) profile and the gut microbiome (GM) are crucial traits that have a significant impact on the life of bees. In honey bees, the CHC profile and the GM interact finely through trophallaxis, such that the characteristics of the GM are partially defined by the chemical recognition among sisters. However, most of the known primitively eusocial bees show simpler social traits, including moderate genetic relatedness among colony members, often due to workers' nest drifting or dispersal, and lack of trophallaxis. Hence, primitively eusocial bees offer a great opportunity to evaluate the respective roles of worker-worker genetic relatedness and of the environment in which the adult lives (residency nest) in the interaction between CHC profile and GM. Here, we investigated such relationships in the primitively eusocial digger bee Halictus scabiosae (Halictidae). We found a high rate of nest-drifting by workers, which results in highly variable intra-colonial genetic relatedness. Genetically closely related workers, even when occupying distant nests, possessed both a more similar microbiome profile and a more similar CHC profile. Additionally, sharing the same nest seemed to account for the similarity of both CHC profile and GM among workers. Interestingly, differences in microbiome profile and in CHC profile were highly and positively correlated across workers, even after controlling for genetic relatedness. The results of our study point towards an impact of genetic relatedness on the GM and the CHC profile, but also suggest that the microbiome and the CHC profile are partially acquired through the adult nest environment, and that the microbiome possibly has a role in shaping the cuticular chemistry.
https://doi.org/10.5061/dryad.7wm37pw0w
Description of the data and file structure
This is a dataset of the microsatellite-derived genotypes of sweat bee workers of the species Halictus scabiosae. H. scabiosae workers were collected from two nest aggregations near the small town of Alberese, within the Maremma Regional Park (Tuscany, Italy: 42°40′5″N, 11°6′23″E). The two nest aggregations (hereafter: aggregation 1 and aggregation 2), composed of around 100 (aggregation 1, Casa Gialla Barbicato) and 50 (aggregation 2, Sagrado Agriturismo) nests, were located 1.2 km apart. Five H. scabiosae workers per nest were collected by netting, from a total of 18 nests (7 nests from aggregation 1, labelled A to G, and 11 nests from aggregation 2, labelled H to U).
Microsatellite alleles are named according to their detected size in base pairs. A value of 0 (zero) means that the allele could not be detected, probably due to poor PCR amplification. Column A contains the samples (1 to 5), coded according to nest and aggregation. Column B contains the locations where the aggregations were found. Columns C to R contain the alleles of the microsatellite loci, with two columns per locus because the samples are diploid female bees. For example, column C contains one allele of the LHMS10 locus, coded by its size in base pairs, and column D contains the other allele of the same locus.
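The column layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the original dataset documentation: it assumes the file is comma-separated with a single header row, and it treats a locus as missing when either of its two allele columns is 0. The function name `read_genotypes` and the demo row are hypothetical.

```python
import csv
import io

N_LOCI = 8  # eight microsatellite loci, two allele columns each (columns C to R)


def read_genotypes(handle, has_header=True):
    """Return {sample_id: (location, [genotype, ...])} from an open CSV handle.

    Each genotype is a tuple (allele1, allele2) of sizes in base pairs,
    or None when an allele is coded 0 (not detected).
    Assumption: comma-separated values and one header row.
    """
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row
    genotypes = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        sample, location = row[0].strip(), row[1].strip()
        alleles = [int(value) for value in row[2:2 + 2 * N_LOCI]]
        loci = []
        for i in range(0, 2 * N_LOCI, 2):
            a1, a2 = alleles[i], alleles[i + 1]
            # Assumption: a 0 in either column marks the whole locus as missing.
            loci.append(None if 0 in (a1, a2) else (a1, a2))
        genotypes[sample] = (location, loci)
    return genotypes


# Hypothetical demo row; the real file would be opened with
# open("Halictus_scabiosae_genotypes_dataset.csv", newline="").
demo_text = (
    "sample,location," + ",".join(f"col{i}" for i in range(16)) + "\n"
    + "A1,demo_site," + ",".join(["150", "152", "0", "0"] + ["200", "200"] * 6) + "\n"
)
demo = read_genotypes(io.StringIO(demo_text))
```

Loci are kept in column order rather than by name, because only the position of LHMS10 (columns C and D) is stated explicitly above.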
DNA extraction and genotype determination
We studied genetic relatedness between workers using microsatellite markers. DNA was extracted from one leg of each individual using Chelex 100 (Walsh et al. 1991). The Chelex extraction method has become a staple in honey bee research because it is fast and effective (Evans et al. 2013), and it has been shown to yield DNA of sufficient quality for microsatellite analysis. Consequently, it was selected as the primary extraction method for the samples in this study, and its effectiveness was subsequently validated. Eight microsatellite markers were used in this study, with primers reported in Table S1. Two multiplex reactions were designed (Multiplex 1: LHMS10, rub73, rub02, rub72; Multiplex 2: rub37b, rub30, rub77, LM27) using fluorescently labeled forward primers and amplified by polymerase chain reaction (PCR) under the following conditions: initial denaturation at 94°C for 3 min; 35 cycles of 30 s at 94°C, 30 s at 55°C, and 45 s at 64°C; and a final elongation for 10 min at 64°C. To ascertain the feasibility of the multiplex reactions, a test PCR was performed in which five samples of H. scabiosae were amplified for each marker individually and in combination in the different multiplexes. The annealing temperature was selected based on the articles in which the primers were designed (Kukuk et al. 2002; Paxton et al. 2003; Soro and Paxton 2009). Once the efficacy of the multiplex reactions had been verified, the remaining samples were amplified.
Amplified products were sent for fragment detection to the company Secugen S.L. (Madrid, Spain). Alleles were identified using GENEMAPPER 3.7 software (Applied Biosystems), from which the data matrix was exported for subsequent analysis.
Statistical analyses
The number of total and private alleles of each nest aggregation, as well as the unbiased expected heterozygosity (uHe), were obtained using GenAlEx 6.5 (Peakall and Smouse, 2006). The presence of null alleles was evaluated with Micro-Checker version 2.2.3 (Van Oosterhout et al., 2004), and the conformity of the populations with Hardy-Weinberg equilibrium was checked using GENEPOP on the web version 4.1 (Rousset, 2008). Analysis of molecular variance (AMOVA) was carried out based on 999 replicates using GenAlEx 6.5 (Peakall and Smouse, 2006) to assess how genetic diversity varied within and between nest aggregations. Genetic distances between individuals (as a proxy of genetic relatedness) were calculated, and their differences between populations were evaluated by a Principal Coordinates Analysis (PCoA). Finally, to analyze the genetic clustering of the H. scabiosae aggregations, the program STRUCTURE v2.3.4 (Pritchard et al., 2000) was used. All possible numbers of populations (K = 1 to 10) were simulated with ten iterations each (burn-in: 1 × 10^4; MCMC: 10^5), and the most probable K was selected according to the ∆K value (Evanno et al., 2005) obtained with STRUCTURE HARVESTER web v0.6.94 (Earl and vonHoldt, 2012).
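Two of the quantities named above can be illustrated with a short Python sketch. It is not the code used in the study, which relied on GenAlEx and STRUCTURE HARVESTER. The first function assumes the usual unbiased estimator for one locus, uHe = (2N / (2N - 1)) × (1 - Σ p_i²), where N is the number of genotyped individuals and p_i are the allele frequencies. The second assumes ∆K is computed from the mean and standard deviation of ln P(X|K) across replicate runs, as |mean L(K+1) - 2 × mean L(K) + mean L(K-1)| / sd L(K). The function names and input layouts are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def unbiased_expected_heterozygosity(genotypes):
    """uHe for one locus.

    genotypes: list of (allele1, allele2) tuples; None marks a missing genotype.
    """
    typed = [g for g in genotypes if g is not None]
    if not typed:
        raise ValueError("no genotyped individuals at this locus")
    n_gene_copies = 2 * len(typed)  # 2N for diploid individuals
    counts = Counter(allele for genotype in typed for allele in genotype)
    he = 1.0 - sum((c / n_gene_copies) ** 2 for c in counts.values())
    return n_gene_copies / (n_gene_copies - 1) * he


def evanno_delta_k(lnp_by_k):
    """Delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k: {K: [ln P(X|K) of each replicate run]}.
    K values whose replicate runs have zero variance are skipped.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta_k = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            sd = stdev(lnp_by_k[k])
            if sd > 0:
                second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
                delta_k[k] = second_difference / sd
    return delta_k


# Made-up numbers, for illustration only (not results from this dataset).
example_uhe = unbiased_expected_heterozygosity([(100, 102), (100, 102), None])
example_dk = evanno_delta_k(
    {1: [-100.0, -102.0], 2: [-50.0, -52.0], 3: [-48.0, -50.0], 4: [-47.0, -49.0]}
)
best_k = max(example_dk, key=example_dk.get)
```

Because ∆K needs both neighbouring values of K, it is undefined for the smallest and largest K tested (here K = 1 and K = 10), so it cannot by itself support K = 1.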
