Skip to main content
Dryad logo

Contrasting multi-level relationships between behavior and body mass in blue tit nestlings


Class, Barbara; Brommer, Jon (2020), Contrasting multi-level relationships between behavior and body mass in blue tit nestlings, Dryad, Dataset,


Repeatable behaviors (i.e. animal personality) are pervasive in the animal kingdom and various mechanisms have been proposed to explain their existence. Genetic and non-genetic mechanisms, which can be equally important, predict correlations between behavior and body mass on different levels (e.g. genetic, environmental) of variation. We investigated multi-level relationships between body mass measured on weeks 1, 2, and 3 and three behavioral responses to handling, measured on week 3, and which form a behavioral syndrome in wild blue tit nestlings. Using 7 years of data and quantitative genetic models, we find that all behaviors and body mass on week 3 are heritable (h2= 0.18-0.23) and genetically correlated, whereas earlier body masses are not heritable. We also find evidence for environmental correlations between body masses and behaviors. Interestingly, these environmental correlations have different signs for early and late body masses. Altogether, these findings indicate genetic integration between body mass and behavior, and illustrate the impacts of early environmental factors and environmentally-mediated growth trajectory on behaviors expressed later in life. This study therefore suggest that the relationship between personality and body mass in developing individuals is due to various underlying mechanisms which can have opposing effects. Future research on the link between behavior and body mass would benefit from considering these multiple mechanisms simultaneously.


Data collection

Data used for these analyses was collected between 2012 and 2018 in a wild population of blue tits breeding in nest boxes in south-west Finland (Tammisaari, 60°01′N, 23°31′E). This population has been monitored yearly since 2003 during the breeding season (first broods from end of April to end of June), following a standard protocol for nest box-breeding passerines (Brommer and Kluen 2012). Nest boxes were visited weekly in May to assess laying dates, clutch sizes and estimate expected hatching dates. Nests from first broods were visited daily starting from their expected hatching date until at least one hatchling was observed (D0). Two days after the hatching day (D2), nestlings were weighed (using a scale with 0.1g precision) and their nails were clipped following unique combinations to allow their identification at later stages of development. Parents were caught and identified when nestlings were at least 5 days old. One week later (D9), nestlings were weighed and banded by putting a metal ring with a unique alphanumeric code on their left leg after their nail code was read. A few days before fledging (D16), nestlings were transferred all together in a large paper bag and various measurements of each nestling were taken following a fixed sequence (cf. Brommer and Kluen 2012). Firstly, each individual was held still on its back in the observer’s palm. Stopwatch was started and the number of struggles during 10 seconds was counted. Docility was expressed as -1 time this number/second. Immediately after this 10 seconds assay, the time each bird took to take 30 breaths was measured twice using a stopwatch. Breath rate was calculated as 30 divided by the average of these two measures and expressed in number of breath/second. A higher breath rate reflects a higher stress response to handling (Carere et al. 2004). The bird’s right tarsus and head-bill length were then measured using a digital sliding caliper (0.1mm accuracy) before measuring its wing and tail length using a ruler (1mm accuracy). During these morphometric measurements, the bird’s aggressive behavior (struggling, flapping wings) was observed and a handling aggression score (1-5) was given to the bird. This score, which is 1 for a completely passive bird and 5 for a bird struggling continuously, reflects the time it takes for each bird to calm down during these measurements. Each nestling was then weighed using a Pesola spring balance (0.1g accuracy) and then placed in a second large paper bag where measured nestlings remained until the entire brood was processed and put back to its nest.

Pedigreed population

Phenotypic data was available for 5404 individuals, which were connected through a social pedigree based on social parenthood. The pruned pedigree, which retains only informative individuals holds record for 6205 individuals, 5464 maternities, 5107 paternities, 25107 full sibs, 43411 maternal sibs, 38543 paternal sibs, 18284 maternal half-sibs, 13416 paternal half-sibs, a mean family size of 10.8, a mean pairwise relatedness of 2.54e-3 and a maximum pedigree depth of 11. In this population, 11% to 22% of offspring produced annually were sired by extra-pair males (unpublished data). Based on simulations (Charmantier & Réale 2005), such level of error in paternity assignment is unlikely to cause substantial biases in quantitative genetic parameters when using the social pedigree.

Quantitative genetic analyses

Quantitative genetic analyses were carried out using animal models, which are mixed effects models that use the relatedness matrix derived from a population pedigree to estimate additive genetic (co) variance (Wilson et al. 2010). Univariate animal models assuming Gaussian distribution were run for each trait separately to estimate their variance components and their ratios to phenotypic variance. Then, a multivariate animal model was run for all six traits to estimate their correlation on various levels. In all models, brood identity, maternal identity, and additive genetic effects were fitted as random effects to estimate (co)variance due to common environment, maternal, and additive genetic effects while fixed effects included time of measurement in minutes and year as continuous and categorical covariates, respectively. For behavioral responses, fixed effects also included observer identity and handling order (continuous). In univariate models, box was fitted as an additional random effect to account for consistent differences between territories. Animal models were solved using Restricted Maximum Likelihood (REML), and implemented in ASReml-R version 3 (Butler et al. 2009; VSN International, Hemel Hempstead, U.K.). Statistical significance of fixed and random effects was tested using conditional Wald F tests and likelihood ratio tests (LRT) with one degree of freedom, respectively. Heritability (h2) of each trait was calculated as the ratio VA /VP where VP, the phenotypic variance, is defined as the sum of the REML estimates of additive genetic effects, maternal, common environment effects and residuals (VA , VPE, and VR respectively) and is conditional on the fixed-effect structure of the model. Correlations between pairs of traits on each level were calculated based on the corresponding covariance matrix estimated by the multivariate animal model. In this multivariate model, each response was corrected by the same fixed effects as in its corresponding univariate model. Random effects box and mother identities were not fitted in this model due to not being estimable for the former, and due to model convergence issues for the latter. Three 4-trait animal models including maternal effects were however fitted to verify that the relationships between each separate mass and behavior on other levels were consistent with the relationships found using the 6 trait animal model excluding maternal effects. Standard errors (SE) of variance ratios and correlations were approximated using the delta method (Fischer et al. 2004). Coefficients of variation (CV=sd*100/mean) were calculated for the different variance components in. All statistical analyses were performed in R (R development core team 2019). Residuals of all animal models were approximately normally distributed (Shapiro-Wilk test values >0.92, Figure S1).


Structural equation models

SEMs were used to investigate, on each level, different hypotheses for the relationships between behavior and body masses at different ages (Figure 1). SEMs have been previously used in behavioral studies to explore the structure of behavioral syndromes using predicted individual values derived from mixed models (Dochtermann & Jenkins 2007, Dingemanse et al. 2010). Here, each covariance matrix estimated by the multivariate model was converted into a correlation matrix, which was used as input data (cf. Class, Kluen and Brommer 2019, Moirón et al. 2019). In all SEMs variance of latent factors was fixed to 1. Because a correlation matrix was used as input data, the residual variance of each indicator (the variance unexplained by the latent factor) was fixed to 1 minus its squared factor loading. Each SEM was fitted in R using the package “lavaan” (Rosseel 2012). Sample size in these models was nominally set at 642 for the common environment level (number of broods) and 5404 (number of individuals) for the residual level and the all SEMs were compared using AIC. The sample size in a SEM will not affect the inferred loadings or correlations between latent variables but can impact their uncertainty and the model AIC. We verified that the model rankings were similar if sample size was assumed to be lower.

Parametric bootstrap simulations were conducted to estimate median and 95% confidence intervals (CI) the model’s loadings and assess model selection uncertainty. Because our findings indicated that the genetic covariance matrix was much reduced (see results), we focused on the common-environment and residuals covariances. Multivariate data for the 6 traits was simulated 1000 times using the inferred common environment and residual covariance matrices to generate a simulated dataset of the same dimension as the observations. Each simulated data was analyzed using a multivariate mixed model. At each iteration, and on each level, SEMs were run based on the estimated correlation matrices and ranked by AIC. Model selection uncertainty was assessed by calculating bootstrap selection rates (Lubke et al. 2017), which indicate whether model ranking is consistent to sampling variability. The selection rates have no a priori cut-off value for “significance”, but instead provide an indication of model selection uncertainty (Lubke et al. 2017). For example, a selection rate of a SEM of 50% would indicate it is the top model in half the simulations, and 100% would suggest consistent support for this one SEM hypothesis over the others in all simulations. R code for performing SEMs and simulations are provided in Text S1.


Academy of Finland, Award: 289456