Data and code from: Chemical cues in a bee kleptoparasite trigger an evasive response in a facultative eusocial orchid bee
Data files
Mar 12, 2026 version files 65.84 KB
- AlkenesPeakAreas_v2.csv (3.71 KB)
- BroodCellsIntruders.csv (396 B)
- CHC_analysis_complete.R (38.16 KB)
- DurationResponse.csv (3.64 KB)
- GlobalCHCPeakAreas_newMbeecheeii_individuals.csv (6.74 KB)
- LatencyAttemptToFlee_BioassaysType.csv (662 B)
- LatencyResponse.csv (867 B)
- README.md (11.67 KB)
Abstract
Brood parasites often evade host detection through chemical mimicry or chemical insignificance, whereas alternative strategies such as chemical deterrence are rarely documented. We used the host-parasite model of the orchid bee Euglossa viridissima and its specialised kleptoparasite, the megachilid Hoplostelis bivittata, to study this relationship in primitively eusocial bees. We hypothesised that H. bivittata employs a chemical strategy to enter the nest and avoid host aggression, and that the host responds more rapidly as chemical distance increases. To test this, we first compared the cuticular hydrocarbon profiles of host and parasite, then ran bioassays analysing the response of host bees to conspecifics, the kleptoparasite, and the stingless bee Melipona beecheii as a control. Bioassays involved live and frozen specimens, as well as dummies coated with cuticular extracts of the different species. Notably, host E. viridissima showed aggression toward conspecifics and M. beecheii, but consistently fled from live and frozen H. bivittata, and from dummies covered in the parasite's cuticular extract. We conclude that cuticular hydrocarbons in H. bivittata mediate an evasive response in its host E. viridissima, allowing nest parasitism. This is the first evidence of deterrence in a Neotropical brood-parasitic bee, expanding our understanding of chemical mediation in host-parasite arms-race interactions in primitively eusocial orchid bees.
This dataset contains the data and R code used to analyse the presence of chemical deterrents in a brood parasite and the behavioural response of a facultatively eusocial orchid bee (Euglossa viridissima).
📁 Contents
Data Files
| File Name | Description |
|---|---|
| GlobalCHCPeakAreas_newMbeecheeii_individuals.csv | Raw peak area data for all 34 cuticular hydrocarbon (CHC) compounds across all individuals. |
| AlkenesPeakAreas_v2.csv | Peak areas of monoene (alkene) compounds in CHC profiles (subset of global CHC data). |
| LatencyResponse.csv | Latency to respond (in seconds) when residents confronted different intruder types. |
| DurationResponse.csv | Duration of response behaviours (in seconds) in bioassays with different intruder types. |
| LatencyAttemptToFlee_BioassaysType.csv | Latency to attempt to flee, comparing bioassays with live individuals, frozen individuals, and cuticular extracts. |
| BroodCellsIntruders.csv | Number of brood cells in nests of resident Euglossa viridissima and presence/absence of different intruder types. |
Code Files
| File Name | Description |
|---|---|
| CHC_analysis_complete.R | Complete R script for all statistical analyses and figure generation. |
📊 Main Analyses
1. Cuticular Hydrocarbon (CHC) Analysis
Data files: GlobalCHCPeakAreas_newMbeecheeii_individuals.csv, AlkenesPeakAreas_v2.csv
1.1 CHC Class-Level Analysis (Figure 1)
- Raw CHC peak areas normalised to relative abundances
- Individual compounds grouped into three chemical classes:
  - n-Alkanes (16 compounds: C21–C36)
  - Monoenes (13 compounds with one double bond)
  - Alkadienes (5 compounds with two double bonds)
- Compositional data analysis using:
  - Zero imputation with compositional zero-multiplicative (CZM) method
  - Centred log-ratio (CLR) transformation
- One-way ANOVA for each CHC class
- Levene's test for homogeneity of variances
- Tukey HSD post-hoc tests with compact letter display (CLD)
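The class-level steps above can be sketched in base R. This is a minimal illustration on simulated peak areas, not the code in CHC_analysis_complete.R: the values and the object names (`peaks`, `rel`, `clr_mat`) are hypothetical, and the toy data contain no zeros, so the CZM imputation step (handled in the script with zCompositions) is skipped here.

```r
# Toy peak areas (hypothetical values): 9 individuals x 3 CHC classes
set.seed(1)
species <- factor(rep(c("E.viridissima", "M.beecheii", "H.bivittata"), each = 3))
peaks <- matrix(runif(27, min = 10, max = 1000), nrow = 9,
                dimnames = list(NULL, c("alkanes", "monoenes", "alkadienes")))

# 1. Normalise raw peak areas to relative abundances (each row sums to 1)
rel <- peaks / rowSums(peaks)

# 2. Centred log-ratio: log of each part minus the row mean of the logs.
#    Zeros would have to be imputed before this step; the toy data have none.
clr_mat <- log(rel) - rowMeans(log(rel))

# 3. One-way ANOVA and Tukey HSD for one CHC class
dat <- data.frame(species = species, clr_mat)
fit <- aov(monoenes ~ species, data = dat)
print(summary(fit))
tukey <- TukeyHSD(fit)
print(tukey)
```

Each row of `clr_mat` sums to zero, which is a quick sanity check that the transformation was applied per individual. Levene's test and the compact letter display need the car and multcompView packages and are left out of this sketch.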
1.2 Individual CHC Compound Analysis (Table S4)
- Non-parametric analysis of 16 selected CHC compounds
- Kruskal-Wallis tests comparing relative abundances among species
- Dunn's post-hoc tests with Benjamini-Hochberg correction
- Mann-Whitney U tests for specific pairwise comparisons
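A minimal sketch of the non-parametric workflow for a single compound, using simulated relative abundances. The data, the column name `rel_abund`, and the example p-values are hypothetical; the Dunn's test call is guarded because it needs the FSA package.

```r
set.seed(2)
dat <- data.frame(
  species = factor(rep(c("E.viridissima", "M.beecheii", "H.bivittata"), each = 6)),
  rel_abund = c(runif(6, 0.01, 0.05), runif(6, 0.04, 0.10), runif(6, 0.15, 0.30))
)

# Kruskal-Wallis test across the three species
kw <- kruskal.test(rel_abund ~ species, data = dat)
print(kw)

# Mann-Whitney U (Wilcoxon rank-sum) test for one specific pair
pair <- droplevels(subset(dat, species %in% c("E.viridissima", "H.bivittata")))
mw <- wilcox.test(rel_abund ~ species, data = pair)
print(mw)

# Benjamini-Hochberg adjustment of a set of hypothetical raw p-values
p_raw <- c(0.001, 0.012, 0.030, 0.200)
p_bh <- p.adjust(p_raw, method = "BH")
print(p_bh)

# Dunn's post-hoc test with BH correction, only if FSA is installed
if (requireNamespace("FSA", quietly = TRUE)) {
  print(FSA::dunnTest(rel_abund ~ species, data = dat, method = "bh"))
}
```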
1.3 Multivariate Analysis (Figure 2A & 2B)
- ANOSIM (Analysis of Similarities) using Bray-Curtis dissimilarity
  - Global ANOSIM with all 34 CHCs
  - Global ANOSIM with monoenes only
  - Pairwise ANOSIM for all group comparisons
- SIMPER (Similarity Percentages) to identify compounds contributing most to group differences
- NMDS (Non-metric Multidimensional Scaling) ordination
  - Figure 2A: NMDS using all 34 CHCs
  - Figure 2B: NMDS using monoenes only
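The multivariate steps can be sketched with vegan on simulated data. The matrix below is hypothetical (15 individuals, 6 made-up compounds), and `trymax = 100` is an assumption about how the "100 iterations" of the NMDS map onto vegan's arguments; the script may configure this differently.

```r
library(vegan)

set.seed(3)
groups <- factor(rep(c("E.viridissima", "M.beecheii", "H.bivittata"), each = 5))

# Toy peak areas for 6 hypothetical compounds; one group is shifted
raw <- matrix(runif(15 * 6, 1, 100), nrow = 15,
              dimnames = list(NULL, paste0("cmp", 1:6)))
is_parasite <- groups == "H.bivittata"
raw[is_parasite, 1:2] <- raw[is_parasite, 1:2] * 10
rel <- raw / rowSums(raw)

# Bray-Curtis dissimilarity and global ANOSIM
bc <- vegdist(rel, method = "bray")
ano <- anosim(bc, grouping = groups, permutations = 9999)
print(ano)

# SIMPER: contribution of each compound to between-group dissimilarity
sim <- simper(rel, group = groups, permutations = 999)
print(summary(sim))

# NMDS ordination in two dimensions
ord <- metaMDS(rel, distance = "bray", k = 2, trymax = 100,
               autotransform = FALSE, trace = FALSE)
site_scores <- scores(ord, display = "sites")
```

Pairwise ANOSIM can be obtained by subsetting `rel` and `groups` to two species at a time and repeating the `anosim()` call.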
2. Latency Analysis (Figure 3)
Data file: LatencyResponse.csv
- Compares latency to respond (time to first response) among three intruder types:
  - Conspecific (E. viridissima)
  - Heterospecific stingless bee (M. beecheii)
  - Brood parasite (H. bivittata)
- Statistical approach:
  - Generalised Linear Mixed Model (GLMM) with negative binomial distribution (nbinom1)
  - Random effects: Nest ID and Bioassay day
  - Model diagnostics using DHARMa (dispersion and zero-inflation tests)
  - Post-hoc pairwise comparisons with Tukey correction
  - Compact letter display for significance groupings
- Visualisation:
  - Figure 3A: Violin plots with individual data points and significance letters
  - Figure 3B: Correlation between chemical distance and latency (Spearman ρ = 0.10, p = 0.67)
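A minimal sketch of the latency model, assuming the column names listed in the data dictionary (Bioassay, Latency, Nest, Bioassay_day). The data are simulated and the exact model formula in CHC_analysis_complete.R may differ; the means and effect sizes below are hypothetical.

```r
library(glmmTMB)
library(DHARMa)
library(emmeans)

set.seed(4)
# Toy design (hypothetical): 3 intruder types x 8 nests x 3 bioassay days
dat <- expand.grid(
  Bioassay = c("E.viridissima", "M.beecheii", "H.bivittata"),
  Nest = paste0("N", 1:8),
  Bioassay_day = paste0("D", 1:3)
)
base_mu <- c(E.viridissima = 20, M.beecheii = 15, H.bivittata = 5)
nest_eff <- rnorm(8, sd = 0.4)
day_eff <- rnorm(3, sd = 0.3)
log_mu <- log(base_mu[as.character(dat$Bioassay)]) +
  nest_eff[as.integer(dat$Nest)] + day_eff[as.integer(dat$Bioassay_day)]
dat$Latency <- rnbinom(nrow(dat), mu = exp(log_mu), size = 2)

# Negative binomial GLMM (nbinom1, log link) with two random intercepts
m <- glmmTMB(Latency ~ Bioassay + (1 | Nest) + (1 | Bioassay_day),
             family = nbinom1(link = "log"), data = dat)

# DHARMa residual diagnostics: dispersion and zero inflation
sim <- simulateResiduals(m)
print(testDispersion(sim, plot = FALSE))
print(testZeroInflation(sim, plot = FALSE))

# Tukey-adjusted pairwise comparisons among intruder types
emm <- emmeans(m, ~ Bioassay)
pw <- pairs(emm, adjust = "tukey")
print(pw)
```

The duration model has the same structure with `Duration` as the response and `family = nbinom2`. With only a few levels per random effect, a toy fit like this one may emit convergence warnings.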
3. Duration Analysis (Figure 4)
Data file: DurationResponse.csv
- Compares duration of response (total time responding) among three intruder types
- Statistical approach:
  - Generalised Linear Mixed Model (GLMM) with negative binomial distribution (nbinom2)
  - Random effects: Nest ID and Bioassay day
  - Model diagnostics using DHARMa
  - Post-hoc pairwise comparisons with Tukey correction
  - Compact letter display for significance groupings
- Visualisation:
  - Figure 4A: Violin plots with individual data points and significance letters
  - Figure 4B: Correlation between chemical distance and duration (Spearman ρ = 0.21, p = 0.046)
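The rank correlations reported for Figures 3B and 4B can be computed with base R's `cor.test()`. The sketch below uses simulated values, so the resulting ρ and p-value are not those of the study; `chem_dist` and `duration` are hypothetical stand-ins for the Chemical_Distance and Duration columns.

```r
set.seed(5)
# Hypothetical chemical distances and response durations (seconds)
chem_dist <- runif(30, 0.2, 1.8)
duration <- rpois(30, lambda = exp(2 + 0.5 * chem_dist))

# Spearman rank correlation; exact = FALSE avoids the ties warning
res <- cor.test(chem_dist, duration, method = "spearman", exact = FALSE)
print(res$estimate)  # Spearman's rho
print(res$p.value)
```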
4. Additional Bioassays
Data file: LatencyAttemptToFlee_BioassaysType.csv
- Compares latency to attempt to flee among three bioassay types:
  - Live individuals
  - Frozen individuals
  - Cuticular extracts on dummy
- GLMM with negative binomial distribution
- Random effect: Nest ID
- Post-hoc Tukey tests
Data file: BroodCellsIntruders.csv
- Logistic regression models examining relationship between number of brood cells and intruder presence
- Separate models for:
  - Global response (all intruders combined)
  - Conspecific intruders (E. viridissima)
  - Heterospecific intruders (M. beecheii)
  - Brood parasite (H. bivittata)
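A minimal sketch of one of these logistic regressions, assuming the column names Cells_number and Response from the data dictionary. The data are simulated with hypothetical coefficients; the per-intruder models would swap `Response` for the corresponding 0/1 column.

```r
set.seed(6)
# Toy data (hypothetical): nests with 1 to 10 brood cells, 3 nests each
dat <- data.frame(Cells_number = rep(1:10, each = 3))
dat$Response <- rbinom(nrow(dat), size = 1,
                       prob = plogis(-2 + 0.4 * dat$Cells_number))

# Binomial GLM with logit link
m <- glm(Response ~ Cells_number, family = binomial, data = dat)
print(summary(m))

# Odds ratio per additional brood cell
print(exp(coef(m)["Cells_number"]))

# Predicted probability for a nest with 5 brood cells
p5 <- predict(m, newdata = data.frame(Cells_number = 5), type = "response")
print(p5)
```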
🛠 Software Requirements
R Version
- R ≥ 4.4.2 (tested with "Pile of Leaves")
Required R Packages
# Data manipulation and tidying
dplyr
tidyr
forcats
stringr
# Compositional data analysis
compositions
zCompositions
# Statistical tests
car # Levene's test
multcompView # Compact letter display
glmmTMB # Generalised linear mixed models
DHARMa # Model diagnostics
emmeans # Estimated marginal means
dunn.test # Dunn's test
FSA # Fisheries stock assessment
rcompanion # CLD for non-parametric tests
# Multivariate analysis
ecodist # Distance matrices
vegan # ANOSIM, SIMPER, NMDS
# Visualisation
ggplot2
ggbeeswarm
ggpubr
scales
Installation
install.packages(c(
"dplyr", "tidyr", "forcats", "stringr",
"compositions", "zCompositions",
"car", "multcompView", "glmmTMB", "DHARMa", "emmeans",
"dunn.test", "FSA", "rcompanion",
"ecodist", "vegan",
"ggplot2", "ggbeeswarm", "ggpubr", "scales"
))
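Before running the script, it can help to check which of the packages listed above are still missing. This is a small convenience sketch using base R only; the object names are illustrative.

```r
required <- c(
  "dplyr", "tidyr", "forcats", "stringr",
  "compositions", "zCompositions",
  "car", "multcompView", "glmmTMB", "DHARMa", "emmeans",
  "dunn.test", "FSA", "rcompanion",
  "ecodist", "vegan",
  "ggplot2", "ggbeeswarm", "ggpubr", "scales"
)

# Packages from the list that are not yet installed
not_installed <- setdiff(required, rownames(installed.packages()))
if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
} else {
  message("All required packages are installed.")
}

# TRUE if the running R version meets the stated minimum
r_ok <- getRversion() >= "4.4.2"
print(r_ok)
```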
📝 Data Dictionary
GlobalCHCPeakAreas_newMbeecheeii_individuals.csv
| Column | Description | Data Type |
|---|---|---|
| Column 1 | Individual ID | Character |
| Column 2 | Species classification | Factor |
| Column 3 | Intruder type | Factor |
| Columns 4-37 | Peak areas for 34 CHC compounds | Numeric |
CHC Compounds:
- n-Alkanes: C21, C22, C23, C24, C25, C26, C27, C28, C29, C30, C31, C32, C33, C34, C35, C36
- Monoenes: C25_1_1, C25_1_2, C26_1_1, C26_1_2, C27_1_1, C27_1_2, C29_1_1, C29_1_2, C30_1, C31_1, C32_1, C33_1, C35_1
- Alkadienes: C29_2, C31_2_1, C31_2_2, C33_2_1, C33_2_2
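The compound names above appear to encode the chemical class: a bare `Cnn` is an n-alkane, and the number after the first underscore gives the count of double bonds. Under that assumption (inferred from the list, not confirmed against the script), the class of each column can be derived from its name and used to sum relative abundances per class. The peak areas below are simulated.

```r
compounds <- c(
  paste0("C", 21:36),
  "C25_1_1", "C25_1_2", "C26_1_1", "C26_1_2", "C27_1_1", "C27_1_2",
  "C29_1_1", "C29_1_2", "C30_1", "C31_1", "C32_1", "C33_1", "C35_1",
  "C29_2", "C31_2_1", "C31_2_2", "C33_2_1", "C33_2_2"
)

# Extract the field after the first underscore ("" for n-alkanes)
n_double <- sub("^C[0-9]+_?([0-9]*).*$", "\\1", compounds)
chc_class <- ifelse(n_double == "", "n-Alkane",
                    ifelse(n_double == "1", "Monoene", "Alkadiene"))
class_counts <- table(chc_class)
print(class_counts)

# Toy peak areas for 2 hypothetical individuals, summed per class
set.seed(7)
peaks <- matrix(runif(2 * length(compounds), 1, 100), nrow = 2,
                dimnames = list(c("ind1", "ind2"), compounds))
rel <- peaks / rowSums(peaks)
class_totals <- t(rowsum(t(rel), group = chc_class))
print(class_totals)
```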
LatencyResponse.csv
| Column | Description | Data Type |
|---|---|---|
| Bioassay | Intruder type presented | Factor (E.viridissima, M.beecheii, H.bivittata) |
| Latency | Time to first response (seconds) | Numeric |
| Nest | Nest identifier | Factor |
| Bioassay_day | Day of bioassay | Factor |
| Chemical_Distance | Chemical distance (Manhattan, i.e. city-block, distance) | Numeric |
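For reference, a Manhattan (city-block) distance is the sum of absolute differences between two profiles. The sketch below uses two made-up three-compound profiles; which profiles were compared to obtain Chemical_Distance, and whether relative or transformed abundances were used, is documented in the script rather than here.

```r
# Two hypothetical relative-abundance profiles (each sums to 1)
a <- c(0.50, 0.30, 0.20)
b <- c(0.20, 0.30, 0.50)

# Manhattan distance computed by hand
manhattan <- sum(abs(a - b))
print(manhattan)

# The same distance via base R's dist()
d <- dist(rbind(a, b), method = "manhattan")
print(d)
```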
DurationResponse.csv
| Column | Description | Data Type |
|---|---|---|
| Bioassay | Intruder type presented | Factor (E.viridissima, M.beecheii, H.bivittata) |
| Duration | Total duration of response (seconds) | Numeric |
| Nest | Nest identifier | Factor |
| Bioassay_day | Day of bioassay | Factor |
| Chemical_Distance | Chemical distance (Manhattan, i.e. city-block, distance) | Numeric |
LatencyAttemptToFlee_BioassaysType.csv
| Column | Description | Data Type |
|---|---|---|
| Bioassay_type | Type of bioassay | Factor (live_individuals, frozen_individuals, Dummy_extract) |
| Attempt_escape_nest | Time to attempt escape (seconds) | Numeric |
| Nest | Nest identifier | Factor |
BroodCellsIntruders.csv
| Column | Description | Data Type |
|---|---|---|
| Cells_number | Number of brood cells in nest | Numeric |
| Response | Overall presence/absence of any intruder | Binary (0/1) |
| E.viridissima | Presence/absence of conspecific intruder | Binary (0/1) |
| M.beecheii | Presence/absence of heterospecific intruder | Binary (0/1) |
| H.bivittata | Presence/absence of brood parasite | Binary (0/1) |
📚 Statistical Methods Summary
Compositional Data Analysis
- Transformation: Centred log-ratio (CLR)
- Zero imputation: Compositional zero-multiplicative (CZM) method
- Reference: Aitchison, J. (1986). The Statistical Analysis of Compositional Data
ANOVA and Post-hoc Tests
- Test: One-way ANOVA (for CLR-transformed data)
- Variance homogeneity: Levene's test
- Post-hoc: Tukey HSD with compact letter display
- Package: multcompView::multcompLetters4()
Non-parametric Tests
- Test: Kruskal-Wallis rank sum test
- Post-hoc: Dunn's test with Benjamini-Hochberg correction
- Pairwise: Mann-Whitney U test (Wilcoxon rank-sum)
Generalised Linear Mixed Models (GLMM)
- Distribution: Negative binomial (nbinom1 or nbinom2)
- Link function: log
- Random effects: Nest ID, Bioassay day
- Package: glmmTMB
- Diagnostics: DHARMa residual diagnostics
Multivariate Analysis
- Distance: Bray-Curtis dissimilarity
- Tests: ANOSIM (9,999 permutations)
- Ordination: NMDS (100 iterations)
- Contribution analysis: SIMPER (999 permutations)
- Raw data: All raw peak area data are provided without modification
- Normalisation: All normalisations and transformations are documented in the script
