Data from: Navigating “tip fog”: Embracing uncertainty in tip measurements
Data files
Mar 27, 2025 version files (1.24 GB)
- BeaulieuOMeara_tipfog.tar.gz (1.24 GB)
- README.md (4.63 KB)
Abstract
Nature is full of messy variation, which serves as the raw material for evolution. Overlooking this variation not only weakens our analyses but also risks selecting inaccurate models, generating false precision in parameter estimates, and creating artificial patterns. Furthermore, the complexity of uncertainty extends beyond traditional “measurement error,” encompassing various sources of variance. To address this, we propose the term “tip fog” to describe the variance between the value from the overall modeled evolutionary process and what is recorded, without implying a specific mechanism. We show why accounting for tip fog remains critical by showing its impact on continuous comparative models and discrete comparative and diversification models. We rederive methods to estimate this variance and use simulations to assess its feasibility and importance in a comparative context. Our findings reveal that ignoring tip fog substantially affects the accuracy of rate estimates, with higher tip fog levels showing greater biases from the true rates, as well as affecting which models are chosen. The findings underscore the importance of model selection and the potential consequences of neglecting tip fog, providing insights for improving the accuracy of comparative methods in evolutionary biology.
“Tip fog” is our term for what used to be called measurement error. Take a concrete example: squid tentacle length, for which we want to use the average value per species.
Several factors can cause the recorded value for a species (say, 10.42 cm) to differ from the “truth”:
- Sample variation: we measured only three squids in a species, not every individual.
- Measurement uncertainty: even if we measured every squid, tentacles are stretchy, so we can’t know their true length to the micron.
- Pure error: someone measured in inches rather than centimeters for one of the three measured specimens.
- Intraspecific variation: individuals differ in size.
In past code, we’ve called all of this “measurement error” (or its less melodious name, mserr), but these are distinct factors. One could call them “noise,” but “noise” is usually contrasted with “signal,” and some of these factors carry signal for important processes (especially intraspecific variation). Thus we needed a new term: “tip fog” it is.
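To make the four factors concrete, here is a minimal R sketch (not taken from the scripts in this archive) that builds a recorded species mean from three simulated specimens. Every number in it (the true mean of 10 cm, both standard deviations, and the choice of which specimen was recorded in inches) is an illustrative assumption.

```r
set.seed(1)

true_species_mean <- 10    # hypothetical "true" mean tentacle length (cm)
intraspecific_sd  <- 1.0   # assumed spread among individuals (cm)
measurement_sd    <- 0.2   # assumed per-measurement uncertainty (cm)
n_specimens       <- 3     # sample variation: only three squids measured

# Intraspecific variation: each sampled individual has its own true length
individual_lengths <- rnorm(n_specimens, mean = true_species_mean,
                            sd = intraspecific_sd)

# Measurement uncertainty: stretchy tentacles add noise to each reading
readings_cm <- individual_lengths + rnorm(n_specimens, mean = 0,
                                          sd = measurement_sd)

# Pure error: one specimen was recorded in inches but treated as centimeters
recorded <- readings_cm
recorded[1] <- readings_cm[1] / 2.54

recorded_species_mean <- mean(recorded)

# The gap between recorded and true value lumps all four sources together
tip_fog_deviation <- recorded_species_mean - true_species_mean
print(c(recorded = recorded_species_mean, deviation = tip_fog_deviation))
```

The single deviation printed at the end is what a comparative model sees at the tip; it cannot tell the four sources apart, which is why one umbrella term is useful.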
Code:
Each folder contains the code used to conduct the analyses presented in the paper. All that is required is R and the packages listed for each directory.
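Before running anything, it may help to check which of the packages named in this README are already installed. The sketch below is a convenience check, not part of the archived scripts; the package names are copied from the directory descriptions, with magrittr spelled as the package is actually named.

```r
# Packages named across the three directory descriptions in this README
required_pkgs <- c("corHMM", "TreeSim", "parallel", "magrittr", "tidyverse",
                   "viridis", "hisse", "diversitree", "OUwie", "dentist")

# requireNamespace() returns FALSE (rather than erroring) when a package is absent
is_installed <- vapply(required_pkgs, requireNamespace, logical(1),
                       quietly = TRUE)

missing_pkgs <- required_pkgs[!is_installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed packages are installed.")
}
```

This only reports what is missing; how each package should be installed is left to the reader, since this README does not state installation sources or versions.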
corHMM_sims
This directory provides the code used to conduct and summarize our simulations and analysis of discrete character evolution with and without tip fog. The R packages corHMM, TreeSim, parallel, magrittr, tidyverse, and viridis are required to run these scripts.
- sim.tree.Rsave – the tree used to generate patterns of discrete evolution.
- CorHMMFogSim.R – the script used to run the simulations. This script generates files with a unique prefix. For example, CorHMMSim_0.15.rep1.Rsave indicates that a tip fog value of 15% was used and that this was the first replicate. All simulation files used in our paper are provided in the simReps directory.
- simSumCorHMMFog.R – the script used to summarize the simulations in various ways. The files CorHMM_sim.raw_results.Rsave and CorHMM_simARD.raw_results.Rsave were generated from this script.
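The file-naming convention described above can be used to index the replicates by tip fog value. The following is a rough sketch rather than code from the archive: the regular expression is inferred from the single example name in this README, and the second file name is hypothetical.

```r
# First name is the example from this README; the second is hypothetical
sim_files <- c("CorHMMSim_0.15.rep1.Rsave", "CorHMMSim_0.15.rep2.Rsave")

# Assumed pattern: CorHMMSim_<tip fog>.rep<replicate>.Rsave
pattern <- "^CorHMMSim_([0-9.]+)\\.rep([0-9]+)\\.Rsave$"
matches <- regmatches(sim_files, regexec(pattern, sim_files))

# Element 1 of each match is the full string; 2 and 3 are the captured groups
sim_index <- data.frame(
  file      = sim_files,
  tip_fog   = as.numeric(vapply(matches, `[`, character(1), 2)),
  replicate = as.integer(vapply(matches, `[`, character(1), 3))
)
print(sim_index)
```

In practice, `sim_files` would come from something like `list.files("simReps")`, assuming the working directory is corHMM_sims and the archive has been unpacked there.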
HiSSE_sims
This directory provides the code used to conduct and summarize our simulations and analysis of trait-dependent diversification with and without tip fog. The R packages hisse, diversitree, parallel, magrittr, tidyverse, and viridis are required to run these scripts.
- sim.tree2.hisse.Rsave – the set of trees used to generate patterns of character-dependent diversification.
- HiSSEFogSim.R – the script used to run the simulations. This script generates files with a unique prefix. For example, HiSSESim_0.22.042.0.1.1.Rsave indicates that a speciation rate of 0.22 for state 0, a second speciation rate of 0.42, and a tip fog value of 10% were used, and that this was the first replicate. All simulation files used in our paper are provided in the simReps directory.
- simSumHiSSEFog.R – the script used to summarize the simulations in various ways. The file HiSSE_sim.raw_results.Rsave was generated from this script.
OUwie_sims
This directory provides the code used to conduct and summarize our simulations and analysis of continuous trait evolution with and without tip fog. The R packages OUwie, TreeSim, corHMM, parallel, magrittr, tidyverse, dentist, and viridis are required to run these scripts.
- sim.tree.dat.base.Rsave – the tree that contains the regime mapping used to simulate a continuous character.
- sim.tree.dat.base.means.Rsave – a pectinate tree that contains the regime mapping used to simulate a continuous character assuming only differences in the optima among regimes.
- OUFogSim3P_take2.R – the script used to run the simulations and analyze them. This script generates files with a unique prefix. For example, BMS_3P_sim.0.1_1.25_1.rep1.Rsave indicates a Brownian motion model assuming different rates per regime (BMS), with the three-point algorithm used to estimate model parameters, a tip fog value of 10%, and a 1.25x difference in rate between regimes 1 and 2, for the first replicate. All simulation files used in our paper are provided in the simReps directory.
- simSumOUwieFog.R – the script used to summarize the simulations in various ways. The files BMS_sim.raw_results_updated.Rsave, OUM_sim.raw_results_updated.Rsave, and OUMV_sim.raw_results_updated.Rsave were generated from this script. We also used this script to generate two likelihood surface analyses using the package dentist: OUM_f01.Rsave and OUM_f20.Rsave.
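The names of the R objects stored inside the .Rsave files are not listed in this README, so a cautious way to open any of them is to load the file into a separate environment and inspect what it contains. The helper below, `inspect_rsave`, is a hypothetical convenience function (not part of the archive), demonstrated on a stand-in file with made-up values so that it runs on its own.

```r
# Load an .Rsave file into its own environment and list what it contains,
# so nothing in the current workspace is overwritten by accident.
inspect_rsave <- function(path) {
  env <- new.env()
  loaded_names <- load(path, envir = env)  # load() returns the object names
  data.frame(
    object = loaded_names,
    class  = vapply(loaded_names,
                    function(nm) class(get(nm, envir = env))[1],
                    character(1), USE.NAMES = FALSE)
  )
}

# Self-contained demonstration with a stand-in file (values are made up)
demo_path <- tempfile(fileext = ".Rsave")
demo_results <- data.frame(tip_fog = c(0.1, 0.2), rate = c(1.0, 1.3))
save(demo_results, file = demo_path)

print(inspect_rsave(demo_path))
```

To use it on the archived results, replace `demo_path` with the path to a file such as OUM_f01.Rsave or BMS_sim.raw_results_updated.Rsave.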