# The improbability of detecting trade-offs and some practical solutions

## Cite this dataset

Johnson, Marc (2024). The improbability of detecting trade-offs and some practical solutions [Dataset]. Dryad. https://doi.org/10.5061/dryad.xpnvx0kq5

## Abstract

Trade-offs are a fundamental concept in evolutionary biology because they are thought to explain much of nature’s biological diversity, from variation in life-histories to differences in metabolism. Despite the predicted importance of trade-offs, they are notoriously difficult to detect. Here we contribute to the existing rich theoretical literature on trade-offs by examining how the shape of the distribution of resources or metabolites acquired in an allocation pathway influences the strength of trade-offs between traits. We further explore how variation in resource distribution interacts with two aspects of pathway complexity (i.e., the number of branches and hierarchical structure) to affect trade-offs. We simulate variation in the shape of the distribution of a resource by sampling 10^{6} individuals from a beta distribution with varying parameters to alter the resource shape. In a simple “Y-model” allocation of resources to two traits, any variation in a resource leads to slopes less than -1, with left skewed and symmetrical distributions leading to negative relationships between traits, and highly right skewed distributions associated with positive relationships between traits. Adding more branches further weakens negative and positive relationships between traits, and the hierarchical structure of pathways typically weakens relationships between traits, although in some contexts hierarchical complexity can strengthen positive relationships between traits. Our results further illuminate how variation in the acquisition and allocation of resources, and particularly the shape of a resource distribution and how it interacts with pathway complexity, makes it challenging to detect trade-offs. We offer several practical suggestions on how to detect trade-offs given these challenges.

## README: The improbability of detecting trade-offs and some practical solutions

https://doi.org/10.5061/dryad.xpnvx0kq5

We performed simulations of resource trade-off pathways with various initial resource allocation distributions and pathway shapes. Using Python, we simulated initial resource distributions according to either a uniform distribution or a beta distribution with various values for parameters alpha and beta. We then randomly sampled resource allocation for each individual according to the various models, as described in the main manuscript.

### Description of the data and file structure

The source code used to generate the data is organized into directories by model type:

- "y_model": Simple Y-model where resources are allocated to one of two branches.
- "multiple_branchpoints_model": Modified Y-model where additional branches are added.
- "hierarchical_model": Complex model where resources from the second branch of a Y-model are further distributed into a second Y-model.

Within each directory are Python source code files to simulate the model (e.g., "generate_y_model.py") which will output raw .csv files that are used as inputs for plotting. The provided R source code files are used to either generate the plots presented in the paper (e.g., "plot_y_model.R") or to calculate various statistics, such as slope values, about the model (e.g., "calculate_stats_y_model.R"). The calculated statistics are presented in the supplementary tables attached with the manuscript.

The "y_model" directory contains an additional file, "plot_slope_y_model.R", which generates the contour plot of regression slopes across many Y-models whose initial resource distributions are drawn from beta distributions with different parameters.

## Methods

*Overview of Flux Simulations*

To study the strength and direction of trade-offs within a population, we developed a simulation of flux in a simple metabolic pathway, where a precursor metabolite emerging from node A may either be converted to metabolic products B_{1} or B_{2} (Fig. 1). This conception of a pathway is similar to De Jong and Van Noordwijk’s Y-model (Van Noordwijk & De Jong, 1986; De Jong & Van Noordwijk, 1992), but we used simulation instead of analytical statistical models to allow us to consider greater complexity in the distribution of variables and pathways. For a simple pathway (Fig. 1), the total flux J_{total} (i.e., the flux at node A, denoted as J_{A}) for each individual (N = 10^{6}) was first sampled from a predetermined beta distribution as described below. The flux at node B_{1} (J_{B1}) was then randomly sampled from this distribution with max = J_{total} = J_{A} and min = 0. The flux at the remaining node, B_{2}, was then simply the remaining flux (J_{B2} = J_{A} - J_{B1}). Simulations of more complex pathways followed the same basic approach as described above, with increased numbers of branches and hierarchical levels added to the pathway as described below under Question 2. The metabolic pathways were simulated using Python (v. 3.8.2) (Van Rossum & Drake Jr., 2009) where we could control the underlying distribution of metabolite allocation. The output flux at nodes B_{1} and B_{2} was plotted using R (v. 4.2.1) (R Core Team, 2022) with the resulting trade-off visualized as a linear regression using the ggplot2 R package (v. 3.4.2) (Wickham, 2016). While we have conceptualized the pathway as the flux of metabolites, it could be thought of as any resource being allocated to different traits.
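The sampling scheme described above can be sketched in a few lines of Python. This is a minimal illustration and not the archived code ("generate_y_model.py"): it uses only the standard library, a much smaller population than 10^{6}, and it assumes that J_{B1} is drawn uniformly between 0 and J_{total}. The function names `simulate_y_model` and `ols_slope` are hypothetical.

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate_y_model(n_individuals, alpha=None, beta=None, seed=0):
    """Return per-individual fluxes (J_B1, J_B2) for a two-branch Y-model.

    If alpha and beta are None, every individual has J_total = 1.
    """
    rng = random.Random(seed)
    j_b1, j_b2 = [], []
    for _ in range(n_individuals):
        # Total flux at node A: fixed at 1, or drawn from Beta(alpha, beta).
        j_total = 1.0 if alpha is None else rng.betavariate(alpha, beta)
        # ASSUMPTION: allocation to B1 is uniform on [0, J_total].
        b1 = rng.uniform(0.0, j_total)
        j_b1.append(b1)
        j_b2.append(j_total - b1)  # B2 receives the remaining flux
    return j_b1, j_b2


if __name__ == "__main__":
    scenarios = [("fixed J_total = 1", (None, None)),
                 ("left skewed", (5, 0.5)),
                 ("symmetrical", (5, 5)),
                 ("right skewed", (0.5, 5))]
    for label, params in scenarios:
        b1, b2 = simulate_y_model(20_000, *params)
        print(f"{label}: slope = {ols_slope(b1, b2):.3f}")
```

When J_{total} is fixed at 1, J_{B2} = 1 - J_{B1} for every individual, so the fitted slope is -1 up to floating-point error; the other scenarios show how variation in J_{total} moves the slope away from that value.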

*Question 1: How does variation in resource distribution within a population affect the strength and direction of trade-offs?*

We first simulated the simplest scenario where all individuals had the same total flux J_{total} = 1, in which case the phenotypic trade-off is expected to be most easily detected. We then modified this initial scenario to explore how variation in the distribution of resource acquisition (J_{total}) affected the strength and direction of trade-offs. Specifically, the resource distribution was systematically varied by sampling n = 10^{3} total flux levels from a beta distribution, which has two parameters alpha and beta that control the size and shape of the distribution (Miller & Miller, 1999). When alpha is large and beta is small, the distribution is left skewed, whereas for small alpha and large beta, the distribution is right skewed. Likewise, for alpha = beta, the curve is symmetrical and approximately normal when the parameters are sufficiently large (>2). We can thus systematically vary the underlying resource distribution of a population by iterating through values of alpha and beta from 0.5 to 5 (in increments of 0.5), which was done using the NumPy Python package (v. 1.19.1) (Harris *et al.*, 2020). The resulting slope of each linear regression of the flux at B_{1} and B_{2} (i.e., the two branching nodes) was then calculated using the *lm* function in R and plotted as a contour map using the latticeExtra R package (v. 0.6-30) (Sarkar, 2008).
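The parameter sweep described above can be illustrated with the following sketch. It is not the archived workflow: the slope is computed in Python with a hand-written least squares formula in place of R's *lm*, the standard library replaces NumPy, and the allocation to B_{1} is assumed to be uniform on [0, J_{total}]. The names `y_model_slope` and `slope_grid` are hypothetical.

```python
import random


def y_model_slope(alpha, beta, n_individuals=1000, seed=0):
    """OLS slope of J_B2 on J_B1 for one Beta(alpha, beta) resource distribution."""
    rng = random.Random(seed)
    totals = [rng.betavariate(alpha, beta) for _ in range(n_individuals)]
    # ASSUMPTION: uniform allocation to B1 on [0, J_total]; B2 gets the remainder.
    b1 = [rng.uniform(0.0, j) for j in totals]
    b2 = [j - x for j, x in zip(totals, b1)]
    mean_b1 = sum(b1) / n_individuals
    mean_b2 = sum(b2) / n_individuals
    sxy = sum((x - mean_b1) * (y - mean_b2) for x, y in zip(b1, b2))
    sxx = sum((x - mean_b1) ** 2 for x in b1)
    return sxy / sxx


def slope_grid(step=0.5, upper=5.0, n_individuals=1000, seed=0):
    """Map every (alpha, beta) pair on the grid to its regression slope."""
    values = [step * k for k in range(1, int(round(upper / step)) + 1)]
    return {(a, b): y_model_slope(a, b, n_individuals, seed)
            for a in values for b in values}


if __name__ == "__main__":
    grid = slope_grid()
    for a, b in [(5.0, 0.5), (5.0, 5.0), (0.5, 5.0)]:
        print(f"alpha = {a}, beta = {b}: slope = {grid[(a, b)]:.3f}")
```

The resulting dictionary holds one slope per (alpha, beta) pair, which is the kind of table a contour plot of slope against the two parameters would be drawn from.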

*Question 2: How does the complexity of the pathway used to produce traits affect the strength and direction of trade-offs?*

Metabolic pathways are typically more complex than what is described above. Most pathways consist of multiple branch points and multiple hierarchical levels. To understand how complexity affects the ability to detect trade-offs when combined with variation in the distribution of total flux, we systematically manipulated the number of branch points and hierarchical levels within pathways (Fig. 1). We first explored the effect of adding branches to the pathway from the same node, such that instead of only branching off to nodes B_{1} and B_{2}, the pathway branched to nodes B_{1} through to B_{n} (Fig. 1B), where n is the total number of branches (maximum n = 10 branches). Flux at a node was calculated as previously described, and the remaining flux was evenly distributed amongst the remaining nodes (i.e., nodes B_{2} through to B_{n} would each receive J_{2-n} = (J_{total} - J_{B1})/(n - 1) flux). For each pathway, we simulated flux using a beta distribution of J_{total} with alpha = 5, beta = 0.5 to simulate a left skewed distribution, alpha = beta = 5 to simulate a normal distribution, and with alpha = 0.5, beta = 5 to simulate a right skewed distribution, as well as the simplest case where all individuals have total flux J_{total} = 1.
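A minimal sketch of the multiple-branch allocation rule follows. It is an illustration under stated assumptions and not the archived code: J_{B1} is assumed to be drawn uniformly on [0, J_{total}], only the standard library is used, and the names `simulate_branches` and `ols_slope` are hypothetical.

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate_branches(n_individuals, n_branches, alpha=None, beta=None, seed=0):
    """Return one flux list [J_B1, J_B2, ..., J_Bn] per individual."""
    if n_branches < 2:
        raise ValueError("a branch point needs at least two branches")
    rng = random.Random(seed)
    population = []
    for _ in range(n_individuals):
        # Total flux: fixed at 1, or drawn from Beta(alpha, beta).
        j_total = 1.0 if alpha is None else rng.betavariate(alpha, beta)
        # ASSUMPTION: allocation to B1 is uniform on [0, J_total].
        j_b1 = rng.uniform(0.0, j_total)
        # The remaining flux is split evenly among B2 ... Bn.
        share = (j_total - j_b1) / (n_branches - 1)
        population.append([j_b1] + [share] * (n_branches - 1))
    return population


if __name__ == "__main__":
    for n in (2, 5, 10):
        pop = simulate_branches(20_000, n, alpha=5, beta=5)
        b1 = [row[0] for row in pop]
        b2 = [row[1] for row in pop]
        print(f"n = {n} branches: slope of B2 on B1 = {ols_slope(b1, b2):.3f}")
```

With J_{total} fixed at 1, each of B_{2} to B_{n} equals (1 - J_{B1})/(n - 1), so the slope of B_{2} on B_{1} is exactly -1/(n - 1); this is one way to see why adding branches flattens the relationship between any pair of traits.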

We next considered how adding hierarchical levels to a metabolic pathway affected trade-offs. We modified our initial pathway with node A branching to nodes B_{1} and B_{2}, and then node B_{2} further branched to nodes C_{1} and C_{2} (Fig. 1C). To compute the flux at the two new nodes C_{1} and C_{2}, we simply repeated the same calculation as before, but using the flux at node B_{2}, J_{B2}, as the total flux. That is, the flux at node C_{1} was obtained by randomly sampling from the distribution at B_{2} with max = J_{B2} and min = 0, and the flux at node C_{2} was the remaining flux (J_{C2} = J_{B2} - J_{C1}). Much like in the previous scenario with multiple branch points, we used three beta distributions (with the same parameters as before) to represent left, normal, and right skewed resource distributions, as well as the simplest case where J_{total} = 1 for all individuals.
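The two-level calculation can be sketched as below. This is a hedged illustration and not the archived code: both allocation steps are assumed to be uniform draws (on [0, J_{total}] and then on [0, J_{B2}]), only the standard library is used, and the names `simulate_hierarchical` and `ols_slope` are hypothetical.

```python
import random
from itertools import combinations


def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate_hierarchical(n_individuals, alpha=None, beta=None, seed=0):
    """Return a dict of flux lists for nodes B1, B2, C1 and C2."""
    rng = random.Random(seed)
    nodes = ("B1", "B2", "C1", "C2")
    flux = {node: [] for node in nodes}
    for _ in range(n_individuals):
        j_total = 1.0 if alpha is None else rng.betavariate(alpha, beta)
        # ASSUMPTION: both allocation steps are uniform draws.
        j_b1 = rng.uniform(0.0, j_total)   # first level: A -> B1
        j_b2 = j_total - j_b1              # remainder goes to B2
        j_c1 = rng.uniform(0.0, j_b2)      # second level: B2 -> C1
        j_c2 = j_b2 - j_c1                 # remainder goes to C2
        for node, value in zip(nodes, (j_b1, j_b2, j_c1, j_c2)):
            flux[node].append(value)
    return flux


if __name__ == "__main__":
    flux = simulate_hierarchical(20_000, alpha=5, beta=5)
    # Slope for every pairwise comparison between the four nodes.
    for x_node, y_node in combinations(flux, 2):
        slope = ols_slope(flux[x_node], flux[y_node])
        print(f"{y_node} on {x_node}: slope = {slope:.3f}")
```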

*Quantile Regressions*

We performed quantile regression to understand whether this approach could help to detect trade-offs. Quantile regression is a form of statistical analysis that fits a curve through upper or lower quantiles of the data to assess whether an independent variable potentially sets a lower or upper limit to a response variable (Cade *et al.*, 1999). This type of analysis is particularly useful when it is thought that an independent variable places a constraint on a response variable, yet variation in the response variable is influenced by many additional factors that add “noise” to the data, making a simple bivariate relationship difficult to detect (Thomson *et al.*, 1996). Quantile regression extends the idea of ordinary least squares regression, which fits a line through the conditional mean of the data, to any chosen quantile; a fit through the 50^{th} percentile corresponds to median regression. In addition to performing ordinary least squares regression for each pairwise comparison between the four nodes (B_{1}, B_{2}, C_{1}, C_{2}), we performed a series of quantile regressions using the ggplot2 R package (v. 3.4.2), where only the q^{th} quantile was used for the regression (q = 0.99 and 0.95 to 0.5 in increments of 0.05, see Fig. S1) (Cade *et al.*, 1999).
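To make the idea of a regression quantile concrete, the sketch below fits a straight line that minimizes the asymmetric ("pinball") loss for a chosen q. It is a brute-force stand-in for the ggplot2 fits used in the analysis, suitable only for small samples: it relies on the fact that at least one optimal line passes through two data points, so it tries every pair. The names `pinball_loss` and `quantile_line` are hypothetical, and the simulated data assume a uniform allocation to B_{1}.

```python
import random


def pinball_loss(residual, q):
    """Asymmetric absolute loss that is minimized by the q-th quantile."""
    return residual * q if residual >= 0 else residual * (q - 1.0)


def quantile_line(x, y, q):
    """Return (intercept, slope) of the q-th linear regression quantile.

    Brute force: at least one optimal line passes through two data points,
    so every pair of points is tried as a candidate line.
    """
    best = None
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue  # a vertical line is not a valid regression fit
            slope = (y[j] - y[i]) / (x[j] - x[i])
            intercept = y[i] - slope * x[i]
            loss = sum(pinball_loss(yk - (intercept + slope * xk), q)
                       for xk, yk in zip(x, y))
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]


if __name__ == "__main__":
    rng = random.Random(0)
    # Small Y-model sample; ASSUMPTION: uniform allocation to B1 on [0, J_total].
    totals = [rng.betavariate(5, 5) for _ in range(61)]
    b1 = [rng.uniform(0.0, j) for j in totals]
    b2 = [j - x for j, x in zip(totals, b1)]
    for q in (0.5, 0.9):
        intercept, slope = quantile_line(b1, b2, q)
        print(f"q = {q}: intercept = {intercept:.3f}, slope = {slope:.3f}")
```

Because the upper quantiles track the boundary set by the individuals with the most resources, comparing slopes across values of q is one way to check whether a constraint is visible at the edge of the data even when the central trend is weak.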

## Funding

Natural Sciences and Engineering Research Council

Canada Research Chairs

Centre for Urban Environments