Skip to main content

Ant cuticular hydrocarbons are heritable and associated with variation in colony productivity

Cite this dataset

Walsh, Justin; Pontieri, Luigi; d'Ettorre, Patrizia; Linksvayer, Tim (2020). Ant cuticular hydrocarbons are heritable and associated with variation in colony productivity [Dataset]. Dryad.


In social insects, cuticular hydrocarbons function in nestmate recognition and also provide a waxy barrier against desiccation, but basic evolutionary features, including the heritability of hydrocarbon profiles and how they are shaped by natural selection are largely unknown. We used a new pharaoh ant (Monomorium pharaonis) laboratory mapping population to estimate the heritability of individual cuticular hydrocarbons, genetic correlations between hydrocarbons, and fitness consequences of phenotypic variation in the hydrocarbons. Individual hydrocarbons had low to moderate estimated heritability, indicating that some compounds provide more information about genetic relatedness and can also better respond to natural selection. Strong genetic correlations between compounds are likely to constrain independent evolutionary trajectories, which is expected given that many hydrocarbons share biosynthetic pathways. Variation in cuticular hydrocarbons was associated with variation in colony productivity, with some hydrocarbons experiencing strong directional selection. Altogether, this study builds on our knowledge of the genetic architecture of the social insect hydrocarbon profile and indicates that hydrocarbon variation is shaped by natural selection. 


Experimental design and colony maintenance 

We used 48 lab reared M. pharaonis colonies (hereafter “colony genotypes”) of known pedigree from our heterogeneous stock mapping population, which was derived from eight initial lab stock colonies that were systematically interbred for nine generations (Supplementary figure 1, see Walsh et al. 2019 and Pontieri et al. 2017 for details). We split each colony genotype into three equally sized replicates (hereafter “colony replicates”) that initially consisted of 4 queens, 400 ± 40 workers, 60 ± 6 eggs, 50 ± 5 first instar larvae, 20 ± 2 second instar larvae, 70 ± 7 third instar larvae, 20 ± 2 prepupae, and 60 ± 6 worker pupae (Supplementary figure 2). These colony demographics represent a typical distribution found in a relatively small M. pharaonis colony (Buczkowski & Bennett 2009; Warner et al. 2018). 

We maintained all colony replicates on a 12:12 hour light:dark cycle and at 27 ± 1 °C and 50% relative humidity. We fed each colony replicate twice per week with an agar-based synthetic diet (Dussutour & Simpson 2008) and mealworms. Water was provided ad libitum via a glass tube plugged with cotton (Walsh et al. 2018). Colony replicates nested between two glass slides (5 cm x 10 cm) housed in a plastic container (18.5 cm x 10.5 cm x 10.5 cm) lined with FluonⓇ.

Behavioral and colony productivity data collection

We surveyed each colony replicate for five collective behaviors and two measures of colony productivity (Supplementary figure 2). First, we established five assays to collect measurements of the following collective behaviors (see Walsh et al. 2019 for assays design details): 1) foraging (defined as the number of workers that visited a food source in an hour); 2) aggression (the number of aggressive acts observed in an hour towards workers of a second species, Monomorium dichroum); 3) exploratory rate (a measure of how quickly workers disperse from the nest during the exploration of a novel arena, Börger & Fryxell 2012); 4) group exploration (the percent of a novel arena explored by a group of five foragers in 15 minutes) and 5) colony exploration (the percent of a novel arena explored by the entire colony in 15 minutes). 

After we completed the behavioral assays, the queens from each colony replicate were removed to trigger the production of sexuals (Edwards 1991; Warner et al. 2018). We conducted weekly surveys of the number of worker, gyne (i.e. virgin queens) and male pupae produced, until all brood matured. We used the 1) total number of sexual pupae (i.e. gyne + males) and 2) the total number of worker pupae as final measures of colony productivity.  Behavioral and colony productivity data for each colony replicate are reported in Supplementary table 1.

Collection, extraction and analysis of cuticular hydrocarbon samples

Upon completion of behavioral and colony productivity surveys, from each colony replicate we collected three samples, each consisting of 15 workers, for the extraction and analysis of cuticular hydrocarbons (Supplementary figure 2). To extract the cuticular hydrocarbon profile, we swiftly transferred each group of workers into a clean 2 mL glass vial and rinsed in 200 μl of HPLC grade (99%) pentane for 10 min. We injected 3 μl of each extract into an Agilent 6890N gas chromatograph (GC) coupled with a 5375 Agilent mass spectrometer. We identified 34 chemical compounds (Figure 1a) by their retention times, fragmentation patterns and comparison with published results (Schmidt et al. 2010; van Zweden 2014). We integrated the area under each peak using MSD Chemstation. As they had similar retention times, some compounds co-eluted into the same peak (Peak 2: y-C25:1 with Peak 3: n-C25; Peak 27: x-C31:1 with Peak 28: y-C31:1; Figure 1a). We combined the areas of each co-eluting pair (Peak 2+3 and Peak 27+28, respectively), leaving 32 peaks available for statistical analysis. For full details see Supplemental file 1. 

We discarded 175 out of 432 cuticular hydrocarbon samples due to contamination and other technical failures, leaving a total of 257 samples from 111 colony replicates that could be used for statistical analysis. Number of used cuticular hydrocarbon samples per colony replicate is available in Supplementary table 1. 

Statistical analyses

Prior to running any statistical analysis, we normalized the raw hydrocarbon peak areas of each sample using the log-ratio transformation proposed by Aitchison (1982), as has commonly been used in the analysis of hydrocarbon data: 

yij = ln xijg(Xj)

where yij is the transformed peak area of the ith peak of the jth sample, xij is the untransformed area of the ith peak of the jth sample, and g(Xj) is the geometric mean area of all peaks of the jth sample. As in our dataset some samples had zero area values for certain peaks (Supplementary table 1), we added a small constant value to each peak prior applying the transformation.

For each colony replicate, we had a single measure for each of the five collective behaviors and for colony productivity but as many as three measures for hydrocarbon peak values. Therefore, we used the mean hydrocarbon replicate value when estimating phenotypic correlations and selection gradients (see below). 

We performed all statistical analyses in R version 3.4.1 (R Core Team 2014). A detailed R markdown file including the R scripts, as well as a detailed explanation of each analysis, is included as Supplementary file 1. 

Heritability and genetic correlations estimates of cuticular hydrocarbon compounds 

To assess narrow sense heritability (h2) and genetic correlations (rG) of cuticular hydrocarbon compounds, we analyzed our data using the “animal model” approach (Wilson et al. 2010) with the R package “MCMCglmm” (Hadfield 2010). This mixed-effect model uses a Bayesian Markov chain Monte Carlo (MCMC) approach to decompose phenotypic variance into its genetic and environmental components, allowing the estimation of quantitative genetic parameters. In an animal model, the pedigree of many individuals (i.e. “animals”) is used to make inferences about expected patterns of genetic relatedness among the individuals, and together with observed patterns of phenotypic resemblance among individuals, heritability for measured traits, and genetic correlations between traits are estimated. We treated our replicate colonies (i.e. groups of workers sampled from each replicate colony) as “individuals” in an animal model, and the pedigree of each colony traced the queen and male parents of the worker offspring that made up each colony in the mapping population. Using such an approach, we asked to what degree variation in the expected genetic makeup of workers (i.e. based on the pedigree) predicted the observed phenotypic variation in hydrocarbon profile among groups of workers. While the hydrocarbon profile of each individual worker is expected to depend both directly on its own genotype (direct genetic effects) and indirectly on the genotypes of its nestmates (through indirect genetic effects) (Linksvayer 2006, van Zweden et al. 2010), our approach does not enable us to estimate the separate contributions of these direct and indirect genetic effects to the observed composite hydrocarbon profile of our replicate worker groups. That is, we cannot quantify the degree to which the hydrocarbon profile of each individual depends on compounds synthesized by that individual, as opposed to compounds synthesized by social partners, but we can quantify the degree to which phenotypic variation in the hydrocarbon profile of groups of workers is predicted by the genotypic makeup of those workers. Similarly, previous animal breeding studies have shown that the total contribution of direct and indirect genetic effects to total genetic variance and total heritability can be estimated by quantifying phenotypic variation among groups of individuals (Peeters et al. 2013; Brinker et al. 2017), although it is not possible to empirically tease apart the separate contribution of direct and indirect genetic effects. 

We ran Bayesian univariate models to estimate the heritability of each hydrocarbon compound, and bivariate models (one for each pairwise combination of hydrocarbon variables) to calculate genetic correlations between compounds. Univariate and bivariate models had the same random and fixed effect structure, and differed only in terms of priors specification (Supplementary file 1). We included in the models individual ID, environmental variance, and block (samples were collected at different time points) as random factors. Details of model specification are described in Supplementary file 1. 

Linear and quadratic selection gradients estimates

We used the two measures of colony productivity (production of sexual and worker pupae) as definition of fitness, and estimated strength and type of selection (e.g., directional, stabilizing, or disruptive) acting on individual hydrocarbons using the regression approach developed by Morrissey and Sakredjda (2013). Briefly, we first estimated the fitness function relating colony productivity to the abundance of a specific hydrocarbon with a generalized additive model (GAM), using the R package “mgcv” (Wood 2017). Then, linear (β) and quadratic (ℽ) selection gradients are obtained from the fitted GAM model using the function gam.gradients() in the package “gsg” (Morrissey & Sakrejda 2014), which can handle non-normal distributions of fitness. We ran 64 univariate models (32 phenotypic traits × 2 fitness measures). Prior to running the model, hydrocarbon variables were mean-centered and variance standardized. Linear and quadratic selection estimates, case-bootstrapped standard errors and p-values for both definition of fitness are reported in Supplementary table 2. Details of model specification are described in Supplementary file 1. 

Correlation between hydrocarbon compounds and collective behaviors

We ran Spearman’s rank-order correlations to evaluate the strength and direction of association between cuticular hydrocarbons and collective behaviors. We ran a model between each compound and each of the five collective behavior (160 models in total). P-values were adjusted using the false discovery rate (FDR) method.

Random forest classification analysis

We used a random forest classification analysis (“RF”, Breiman 2001) to determine which cuticular hydrocarbon peaks can better discriminate across the 48 colony genotypes. Although this method does not take into account pedigree relationships, it can provide hints about which hydrocarbons are more variable among colony genotypes, thus highlighting compounds that might be involved in inter- and intra specific recognition. We ran a stratified sampling RF classification model with replacement using the R package “randomForest” (Liaw and Wiener 2001), and we considered hydrocarbon samples from colony replicates belonging to the same colony genotype as part of one of the 48 colony genotypes classes. We used the mean decrease in model accuracy (MDA), as suggested by Cutler (2007), to interpret hydrocarbons importance in classifying the colony genotypes. Model details and specifications can be consulted in Supplementary file 1. 


National Science Foundation, Award: IOS-1452520