
Simulated dataset from 'Quantifying the causal pathways contributing to natural selection'


Henshaw, Jonathan (2020), Simulated dataset from 'Quantifying the causal pathways contributing to natural selection', Dryad, Dataset,


This dataset (Antechinus.csv) relates to the worked example in the appendix of the paper:

Henshaw JM, Morrissey MB, Jones AG (2020). Quantifying the causal pathways contributing to natural selection. Evolution (doi:10.1111/evo.14091)

The worked example concerns a hypothetical study of female antechinus. These are small carnivorous marsupials that use torpor to reduce energy consumption from late summer to early winter. They reproduce once per year in late winter or early spring, after which most individuals die. We suppose that researchers tracked female antechinus from midsummer to the end of the breeding season. They recorded the animals' body size, their date of last torpor, whether they survived to breed, their number of mates, and their fecundity. I simulated the dataset resulting from this hypothetical study in Wolfram Mathematica (see Methods below). In the above paper, we analyse the causal structure of natural selection in this dataset (the R code for the causal analysis is included here as 'AntechinusAnalysis.R').

The variables in the dataset are: body size in unspecified units (BodySize); the date of last torpor (TorporDate), standardized as the number of days before/after an unspecified reference date; whether the individual survived to breed, given as a binary variable (Survival); an individual's number of mates (Mates); and her fecundity (Fecundity).
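For readers who want to check the file before analysing it, the following is a minimal Python sketch that reads the five columns described above and flags rows that break the stated constraints (a binary Survival value, non-negative counts, and no mates for individuals that did not survive to breed). It assumes that Antechinus.csv has a header row with exactly these column names; the function names and the three sample rows are hypothetical and serve only as illustration.

```python
import csv
import io

# Column names as listed in the description; the header row is an assumption.
EXPECTED_COLUMNS = ["BodySize", "TorporDate", "Survival", "Mates", "Fecundity"]


def read_antechinus(handle):
    """Parse an open CSV handle into a list of dicts with numeric values."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    rows = []
    for record in reader:
        rows.append({
            "BodySize": float(record["BodySize"]),
            # Integer-valued columns may be written as "3" or "3.", so go via float.
            "TorporDate": int(float(record["TorporDate"])),
            "Survival": int(float(record["Survival"])),
            "Mates": int(float(record["Mates"])),
            "Fecundity": int(float(record["Fecundity"])),
        })
    return rows


def find_inconsistent_rows(rows):
    """Return indices of rows that break the constraints described in the text."""
    bad = []
    for index, row in enumerate(rows):
        survival_ok = row["Survival"] in (0, 1)
        counts_ok = row["Mates"] >= 0 and row["Fecundity"] >= 0
        # Individuals that did not survive to breed cannot have mates.
        mates_ok = row["Survival"] == 1 or row["Mates"] == 0
        if not (survival_ok and counts_ok and mates_ok):
            bad.append(index)
    return bad


# Hypothetical sample rows, made up for illustration (not taken from the dataset).
SAMPLE = """BodySize,TorporDate,Survival,Mates,Fecundity
101.3,2,1,3,4
95.8,-5,0,0,0
104.0,1,0,2,0
"""

sample_rows = read_antechinus(io.StringIO(SAMPLE))
# The third sample row has mates despite Survival = 0, so its index is reported.
print(find_inconsistent_rows(sample_rows))
```

With the real file, the same functions could be applied by passing a handle opened with open("Antechinus.csv", newline="") to read_antechinus.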


Methods

This dataset was simulated in Wolfram Mathematica version using the following code, which has been uploaded to Dryad alongside the dataset (note that the details of generating this entirely hypothetical dataset are not important for understanding the worked example in the accompanying paper):

(* Functions that generate Poisson- and Bernoulli-distributed random integers; each returns a fixed value when its parameter falls outside the distribution's valid range *)

PoissonInteger[lambda_] := 
  If[lambda > 0, RandomInteger[PoissonDistribution[lambda]], 0];

BernoulliInteger[p_] := 
  If[0 < p < 1, RandomInteger[BernoulliDistribution[p]], 
   If[p >= 1, 1, 0]];

(* Sample size *)

n = 10000;

(* Simulate body size as a random normal variable *)

BodySize = Round[RandomReal[NormalDistribution[100, 5], n], .1];

(* Simulate torpor date as a random normal variable that is correlated with body size; round off to nearest integer *)

TorporDate = 
  Round[(BodySize - 100)*1/5 + 
    RandomReal[NormalDistribution[0, 4], n], 1];

(* Simulate survival to breeding as a Bernoulli random variable that depends on torpor date and body size *)

Survival = 
  BernoulliInteger /@ (-TorporDate/100 + (BodySize - 100)/150 + 0.6);

(* Simulate the number of mates as a Poisson random variable depending on body size; the number of mates is necessarily zero for individuals that do not survive to breed *)

Mates = Survival*(PoissonInteger /@ (BodySize/30));

(* Simulate fecundity as a Poisson random variable depending on body size and the number of mates *)

Fecundity = 
  Table[PoissonInteger[
    BodySize[[i]]*Mates[[i]]/(Mates[[i]] + 1)/20], {i, 1, n}];

(* Store results in a table *)

DataTable = 
  Join[{{"BodySize", "TorporDate", "Survival", "Mates", "Fecundity"}},
    Transpose[{BodySize, TorporDate, Survival, Mates, Fecundity}]];
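For readers without access to Mathematica, the simulation steps above can be approximated in plain Python. The following is a rough sketch, not the code used to generate Antechinus.csv: it uses a different random number generator, so it will not reproduce the published values, and it takes the fecundity draw to be Poisson, as the accompanying comment states. The function names, the seed argument, and the choice of Knuth's method for the Poisson draw are assumptions made for this illustration.

```python
import math
import random


def poisson_integer(lam, rng):
    """Poisson draw with mean lam; returns 0 when lam <= 0 (mirrors PoissonInteger)."""
    if lam <= 0:
        return 0
    # Knuth's multiplication method; adequate for the small means used here.
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def bernoulli_integer(p, rng):
    """Bernoulli draw with p clamped to [0, 1] (mirrors BernoulliInteger)."""
    if p >= 1:
        return 1
    if p <= 0:
        return 0
    return 1 if rng.random() < p else 0


def simulate(n=10000, seed=1):
    """Return n rows of (BodySize, TorporDate, Survival, Mates, Fecundity)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Body size: normal with mean 100 and SD 5, rounded to one decimal place.
        body_size = round(rng.gauss(100, 5), 1)
        # Torpor date: correlated with body size, rounded to the nearest integer.
        torpor_date = round((body_size - 100) / 5 + rng.gauss(0, 4))
        # Survival probability falls with torpor date and rises with body size.
        p_survive = -torpor_date / 100 + (body_size - 100) / 150 + 0.6
        survival = bernoulli_integer(p_survive, rng)
        # Non-survivors have zero mates.
        mates = survival * poisson_integer(body_size / 30, rng)
        # Expected fecundity is zero without mates and saturates as mates increase.
        fecundity = poisson_integer(body_size * mates / (mates + 1) / 20, rng)
        rows.append((body_size, torpor_date, survival, mates, fecundity))
    return rows


rows = simulate(n=2000, seed=1)
print(len(rows), rows[0])
```

Fixing the seed makes a run repeatable within Python; changing it gives an independent replicate of the same hypothetical study.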

Usage Notes

The following files are included:

- The code used to generate the dataset (written for Wolfram Mathematica version

- The code used to analyse the causal structure of selection in the dataset (written for R version 1.1.456).


Royal Society