Skip to main content
Dryad logo

Phylogenetic patterns of trait and trait plasticity evolution: Insights from tadpoles


Relyea, Rick; Hammond, John; Stephens, Patrick (2021), Phylogenetic patterns of trait and trait plasticity evolution: Insights from tadpoles, Dryad, Dataset,


Environmental heterogeneity has led to widespread evolution of phenotypic plasticity in all taxonomic groups. Although phenotypic plasticity has been examined from multiple perspectives, few studies have examined evolutionary patterns of plasticity within a phylogeny. We conducted common-garden experiments on 20 species of tadpoles, spanning three families, exposed for 4 weeks to a control, predator cues, or reduced food (i.e., increased intraspecific competition). We quantified tadpole activity, growth, and relative morphology and found widespread differences in species responses to predator cues and reduced food. We detected pervasive phylogenetic signals in traits within each environment, but phylogenetic signal was much less common in the trait plasticities. Among different models of continuous character evolution, Brownian Motion and Ornstein Uhlenbeck models provided better fits to the data than the Early Burst model. Tadpole activity level in predator environments had much higher evolutionary rates than in the control and reduced-food environments; we did not see this pattern in the other traits. In comparing traits versus trait plasticities, activity evolved much faster than the plasticity of activity whereas morphological traits evolved much slower than morphological plasticities . Collectively, these results suggest that traits and trait plasticities can exhibit dramatically different evolutionary patterns.


See paper for more details and citations 


As noted in Relyea et al. (2018), there are two major challenges when conducting phylogenetic approaches that involve quantifying traits from experiments. The first challenge is deciding how to select a population to represent a species, particularly when the species has a very wide geographic distribution. Most phylogenetic analyses make the assumption that while population-level variation commonly exists, it is relatively small compared to interspecific variation in traits and trait plasticity. Since it is not feasible to run experiments on multiple populations for every species, we took a compromise approach in which we selected out populations from near the middle of each species’ geographic range in an attempt to avoid extreme phenotypes at the edges of a species’ range. We also examined the importance of population-level variation by examining a second population for two cosmopolitan species (Rana sylvatica and Hyla versicolor).

The second challenge in examining traits across a large number of species is that each species in nature lives under different conditions. For example, tadpoles live under a wide range of temperature, pH, conductivity, dissolved oxygen, and food resources. If we were to raise each species in its most common set of environmental conditions, we would be confounding genetic differences with environmental conditions. The common approach in experimental phylogenetic studies is to raise all species under common-garden conditions and manipulating the trait-inducing environmental factors (e.g., Pigliucci et al. 1999, 2003, Colbourne 1997, Thaler and Karban 1997, Richardson 2001a, b, 2002a, b, Relyea et al. 2018). A concern with this approach is whether a species’ inducible trait expression is conditional on the other environmental factors. However, it is not feasible to run plasticity experiments on dozens of species across a wide range of different environmental conditions. To obtain some sense of how important other environmental conditions might be to our conclusions, we reared three of our species under two different temperatures.

In addition to these two challenges, there also needs to be a consideration of the taxa being used in a phylogeny for estimating evolutionary rates. In this study and all other studies examining evolutionary rates, the taxa examined are always a subset of all possible taxa that could be used and this reality has the potential to bias any estimates of evolutionary rates. This is an important issue if one wanted to compare evolutionary rates across different studies. It is likely less of an issue when one wants to compare evolutionary rates among different traits within a given phylogeny or if one wants to compare evolutionary rates of traits versus trait plasticities within a given phylogeny. In these latter situations, one might assume that whatever bias might exist in using the subset of taxa is a bias that applies to all of the traits and trait plasticities. In our study, we examined 20 species of common North American anurans from three families:  toads (Bufonidae), tree frogs (Hylidae), and true frogs (Ranidae). As an important caveat, our sample represents a clustered subset of the 54 families and 7,299 species of anurans in the world (

Tadpole experiments

All animals were collected as newly oviposited egg masses from around the United States (Table A1). Once collected, eggs were transported back to the local laboratory and sent to the Pymatuning Laboratory of Ecology (PLE; University of Pittsburgh, Linesville PA U.S.A). Upon arrival at PLE, eggs were divided and allowed to hatch in covered outdoor 120-L mesocosms and fed rabbit pellets ad libitum until individuals reached a safe handling size and were used in the experiment. In all experiments, we did not control for potential transgenerational effects on plastic responses, which would require rearing the 20 species in captivity for multiple generations with generation times of 1 to 3 years.

Each tadpole experiment was conducted using a completely randomized design with three treatments and six replicates of each treatment for a total of 18 experimental units, which were split across three spatial blocks (i.e. the second, third, and fourth shelves on a five-shelf rack) that were all quite similar in temperature. The experimental units were plastic bins (26 × 38 × 14 cm) containing 7 L of UV filtered well water and an inverted plastic cup (207 mL) with a mesh covering that served as a predator cage. To determine whether the individuals exhibited plastic responses to the three environments, we waited until the tadpoles reached a safe handling size (mass = 30-50 mg, Gosner stage 26-27) and then separated 200 individuals from each collection of clutches. In all experiments, we randomly assigned 10 individuals to each of the 18 replicates, massed the two additional sets of 10 individuals to determine food ration (see below), and then preserved a sample of 10 of those individuals in 10% formalin to confirm the developmental stage of the tadpoles at the start of each experiment (Gosner 1960).

Our three treatments were 1) high food and no predator cues, 2) low food and no predator cues, and high food and non-lethal predator cues. The food level for each treatment was generated by feeding each replicate a per-capita mass of ground Tetramin fish flakes (Tetra, Blacksburg, Virginia USA) every 2 d. At the beginning of the experiment, an initial mass was determined by averaging the mass of the two groups of 10 individuals. The high resource treatments were given a per capita mass of 4% every 2 d, while the low resource treatment was given 2% per capita every 2 d. The resources were updated on days 7, 14, and 21 of the experiment by generating a weighted average of the mass of one replicate from each treatment. To ensure that the growth response across all the treatments was equally weighted, we used a weighted average because two of the treatments received high food while the other received low food. Previous research has demonstrated that reducing food rations in half has the same impact on tadpole growth as doubling the intraspecific density of tadpoles on a fixed rood ration; some morphological traits also responded similarly, but activity and other morphological traits responded differently to reduced food versus increased intraspecific density (Relyea 2002a). Since reduced food is a major component of intraspecific competition in tadpoles, we hereafter refer to our low-food treatment as a manipulation of intraspecific competition.

To generate non-lethal predator cues, a single, locally collected Anax junius larva was placed in the inverted mesh cups in tubs assigned the predator-cue treatment. Predator cues were generated by feeding 300 ± 5 mg of conspecific individuals to each A. junius every time tadpoles were fed. This concentration of predator cue has been shown to saturate the plastic response of tadpoles (Relyea 2004, Schoeppner and Relyea 2008). Predators always consumed the tadpoles within the 2 d, most often within minutes (personal observation JIH). For the toad tadpoles, we rotated predators every 2 d to allow more time for them to overcome any ill effects of toad toxins.

We chose A. junius as the predator because it has a large range in North America, it is correctly perceived as a high-risk predator, and it can induce significant plastic changes in tadpoles (Relyea 2001a, b, 2003). The implicit assumption is that the anti-predator responses induced by chemical cues of conspecifics being consumed by A. junius would be similar to those introduced by other predatory dragonflies that coexist with many species of tadpoles across North America. The reality is that there is probably no single predator species that coexists with every tadpole species or population in North America.

During the experiments, individuals were examined daily for health and survival. Each experimental unit was fed every 2 d and a complete water change and tub cleaning happened every 4 d. Behavioral assays were conducted on days 7, 14, 15, and 28 by using scan sampling every 5 min over a 50-min timespan. Each tub was observed 10 times and each observation quantified the number of active (i.e. moving) tadpoles. Using the proportion of active tadpoles, we used the mean proportion of active tadpoles as our behavioral response variable for each tub.

The experiments ended either on day 28 (to allow substantial growth and development) or when individuals of two species (A. americanus and P. feriarum) had reached a point of survival concern or approached metamorphosis by day 23. Final survival in the experiments ranged from 46% to 96% (median survival = 81%). All surviving individuals were euthanized using an overdose of tricaine methanesulfonate (MS-222) and then preserved in 10% buffer formalin. We then weighed each preserved individual took a lateral, ventral, and close up image of its mouth with a digital imaging system. The mouth image was taken with the individual resting on its dorsal surface and a glass slide was pressed against the mouth to see the full width of the mouth.

Additional experiments manipulating temperature

To assess the impact of different rearing temperatures on our conclusions, we examined the effects of different temperatures on the traits and trait plasticities of the tadpoles. To do this, we tested three of the species (one species from each family: A. americanus, P. regilla, and R. clamitans) in identical experiments as described above, with the following changes. Instead of blocking the experimental units by shelf height, we ran two complete sets of the experiment on the first and fifth shelf on a five-shelf rack). The highest shelf was approximately 21 °C and the lowest shelf was 18 °C.

Statistical methods for assessing how each species responded to the environment

To determine how each species respond to the predator and competitor environments, we analyzed tadpole growth, activity, and relative morphology. To quantify growth, we used daily growth rate rather than final mass because two species (A. americanus and P. feriarum) were terminated a few days early to avoid having them metamorphose. Growth rate was analyzed with a one-way analysis of variance (ANOVA). Tadpole activity was analyzed using a repeated measured ANOVA to determine the effects of treatment, time, and their interaction. We included block and block-by-treatment interactions in the initial analyses, but then removed the block-by-treatment interactions since none were significant.

To quantify the relative (i.e. mass-adjusted) morphology of the tadpoles, we began by running MANCOVAs on ln-transformed morphological data that included ln-transformed tadpole mass as a covariate. We did not observe any mass-by-treatment interactions, suggesting that the relationships between mass and morphology were parallel among the treatments. We saved all residual values for each individuals and added them to the estimated marginal means to obtain a mass-independent estimate of relative morphology for each individual. We then calculated the mean morphology of the tadpoles from each experimental unit and conducted one MANOVA on the tail dimensions (i.e. tail depth and length; tail muscle depth and width) and another MANOVA on the body dimensions (i.e. body depth, length, and depth; mouth width). We included block and block-by-treatments interactions in the initial analyses; for only two species were the block-by-treatment interactions significant at the multivariate level, but the interaction was never significant at the univariate level. One species (P. regilla) did not meet the assumption of the MANOVA, so we rank transformed the data.

Statistical methods for quantifying trait plasticity

Following the protocol of Relyea et al. (2018), we derived multiple measures of plasticity to determine whether our results were robust to using different measures. We began by quantifying plasticity as an absolute measure (e.g., trait mean for the control treatment - trait mean for the predator or competitor treatment). We also quantified plasticity to be proportional to the grand mean (e.g., [trait mean for the control treatment - trait mean for the predator or competitor treatment ] ÷ grand mean). Next, we quantified plasticity to be proportional the pooled standard deviation, using the meta-analytic Hedge's G (e.g., [trait mean for the control treatment - trait mean for the predator or competitor treatment] ÷ pooled standard deviation; Hedges 1981). In these cases, SD1 is the pooled standard deviation for the control and predator environments and SD2 is the pooled standard deviation for the control and competitor environments. These plasticity measures include both direction and magnitude.

Given that some of the above plasticity measures used the grand mean or a pooled standard deviation as a denominator, we decided it was important to determine whether these denominators contained any phylogenetic signal. To do this, we assessed phylogenetic signal in the grand mean and each of the pooled standard deviations. In doing so, we could assess the contribution of signal from both the numerator and denominator.

Phylogenetic comparative analyses

To examine phylogenetic patterns in plastic traits, we used the tree of Pyron and Wiens (2013), a time-calibrated phylogeny of 2,871 species based on nine nuclear and three mitochondrial gene loci (Pyron and Wiens 2011).  We then pruned this tree down to the 20 species included in our study. Phylogenetic signal exists when trait disparity among species is correlated with phylogenetic distance (i.e. more closely related species have more similar traits). Following the protocol of Relyea et al. (2018), we tested for phylogenetic signal using Blomberg’s K (Blomberg et al. 2003), which is the ratio of observed phylogenetic dependence to the phylogenetic dependence expected under a Brownian motion model of character evolution (Blomberg et al. 2003). The parameter K varies between zero and ∞. Values of zero indicate phylogenetic independence, values 0 < 1 indicate weaker phylogenetic dependence than expected under Brownian motion, and values > 1 indicate greater phylogenetic dependence than expected under Brownian motion. We tested for statistical significance of phylogenetic signal using the randomization test of Blomberg and Garland (2002), implemented in the R library picante (Kembel et al. 2010). Tests were performed using 10,000 randomizations. In all cases, we consider p-values significant at values of ≤ 0.05 and nearly significant at values at 0.05 > p < 0.10. A lack of statistically significant signal signifies a lack of phylogenetic constraints in the sense that the trait values of close phylogenetic relatives have no tendency to be more similar than those of distant relatives. However the presence of phylogenetic signal alone provides no evidence for constraints in the more restricted sense of a lack of response to selective pressure or lower than expected evolutionary rates.

We also quantified evolutionary rates and the fit of three models of continuous trait evolution. In addition to Brownian motion (BM), we also evaluated the fit the Ornstein-Uhlenbeck (OU) model of character evolution (Hansen 1997, Butler and King 2004), which models character evolution as a random walk with a central tendency (i.e., a stabilizing force), and the Early Burst (EB) model where evolutionary rates decrease exponentially over time (Harmon et al. 2010). All three models were fit fitted using the R package Geiger (Harmon et al. 2008), and evolutionary rates were quantified based on σ2. For the OU model, we also estimated a, which can be used to measure the evolutionary half-life of a trait (i.e. a measure of how fast species would be expected to approach the evolutionary optimum; Cooper et al. 2016).

As discussed in Relyea et al. (2018), using the measures of traits and trait plasticities has an inherent problem with species that have larger traits and trait plasticities being correlated with higher rates of evolution (see Figure SA2 in Relyea et al. 2018). To adjust for this bias, one could log-transform the values (Ackerly 2009); however, our measures of plasticity often contain negative values for some species, so they cannot be log transformed. Alternatively, one could consider transforming the absolute values of trait plasticity, but that would discard information on plasticity direction, and greatly inflates rate estimates for trait plasticity that had positive or negative values close to zero. To resolves these challenges, we divided all trait and trait plasticities by the minimum value that any species exhibited among the three treatments. This caused the trait values and trait plasticities to be weighted by the size of the species traits. We used these ratio-transformed traits and trait plasticities to estimate evolutionary rate. This yielded trait ranges that were wide when the maximum species trait values were multiples of minimum trait values, but that also preserved information on directionality. As mentioned in Relyea et al. (2018), evolutionary rates for raw traits and plasticities are highly correlated with the order of magnitude over which traits vary.

For one trait (tadpole activity), we had one species (R. palustris) with a trait value of 0 ± 0% in one environment (i.e. the predator environment). In this case, we tried replacing the value with the lowest value of activity shown by the other species to allow it be included in the analysis (0.01 ± 0.004). We also performed phylogenetic analyses for activity with this species excluded. Both methods yielded the same qualitative outcome with respect to hypotheses tested. Here, we report results using all species.


National Science Foundation, Award: DEB 07-16149

National Institute of General Medical Sciences, Award: K12GM088021