Data from: Evaluating the accuracy of methods for detecting correlated rates of molecular and morphological evolution
Data files
Sep 28, 2023 version files (1.36 GB)
- README.md
- README.txt
- Supplementary_Data.zip
Abstract
Determining the link between genomic and phenotypic change is a fundamental goal in evolutionary biology. Insights into this link can be gained by using a phylogenetic approach to test for correlations between rates of molecular and morphological evolution. However, there has been persistent uncertainty about the relationship between these rates, partly because conflicting results have been obtained using various methods that have not been examined in detail. We carried out a simulation study to evaluate the performance of five statistical methods for detecting correlated rates of evolution. Our simulations explored the evolution of molecular sequences and morphological characters under a range of conditions. Of the methods tested, Bayesian relaxed-clock estimation of branch rates was able to detect correlated rates of evolution correctly in the largest number of cases. This was followed by correlations of root-to-tip distances, Bayesian model selection, independent sister-pairs contrasts, and likelihood-based model selection. As expected, the power to detect correlated rates increased with the amount of data, both in terms of tree size and number of morphological characters. Likewise, greater among-lineage rate variation in the data led to improved performance of all five methods, particularly for Bayesian relaxed-clock analysis when the rate model was mismatched. We then applied these methods to a data set from flowering plants and did not find evidence of a correlation in evolutionary rates between genomic data and morphological characters. The results of our study have practical implications for phylogenetic analyses of combined molecular and morphological data sets, and highlight the conditions under which the links between genomic and phenotypic rates of evolution can be evaluated quantitatively.
README
https://doi.org/10.5061/dryad.7wm37pw01
Description of the data and file structure
Supplementary data and text for "Evaluating the Accuracy of Methods for Detecting Correlated Rates of Molecular and Morphological Evolution".
Please refer to "Supplementary_Files_Directory.pdf" for a detailed guide to the folder structure.
Code/Software
We used RStudio (version 4.0.0) to execute the scripts listed under "Generate Phylograms" in the R Scripts for methods folder. These scripts require the R packages NELSI (version 0.21); ape (version 5.7.1); phangorn (version 2.11.1); geiger (version 2.0.7); and phylotools (version 0.2.2). Each script takes a chronogram as input to generate the molecular and morphological synthetic data sets used in this study, and produces a morphological and a molecular tree file (phylograms) and a morphological character matrix for 20 replicates. A molecular sequence alignment was then generated in Seq-Gen (version 1.3.4). The resultant morphological character matrix and molecular sequence alignment were analysed in IQ-TREE (version 2.0.6). The same data were used to generate XML files in BEAST2 (version 2.6.2), which were then analysed in BEAST2 (version 2.6.6). The MODEL_SELECTION (version 1.5.3) and MM (morph-models) (version 1.1.1) packages were loaded in BEAST2 for generating the XML files and analysing the data.
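The step from chronogram to phylogram can be pictured as multiplying each branch duration by a branch-specific rate. The sketch below is written in Python rather than the R of the deposited scripts, and it is only an illustration under stated assumptions: the uncorrelated lognormal rate model, the branch durations, the mean rates, and the function names `draw_branch_rates` and `to_phylogram` are all hypothetical and are not taken from the deposited scripts or from NELSI.

```python
import math
import random


def draw_branch_rates(n_branches, mean_rate, sigma, rng):
    """Draw independent lognormal rates whose arithmetic mean is mean_rate.

    Assumption: an uncorrelated lognormal rate model; sigma is the standard
    deviation of the log-rates and controls among-lineage rate variation.
    """
    # For a lognormal, E[rate] = exp(mu + sigma^2 / 2), so solve for mu.
    mu = math.log(mean_rate) - sigma ** 2 / 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]


def to_phylogram(durations, rates):
    """Branch length in expected changes = duration (time) * rate."""
    return [t * r for t, r in zip(durations, rates)]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical branch durations of a small chronogram (arbitrary time units).
durations = [10.0, 10.0, 5.0, 15.0, 15.0, 20.0]

# Molecular rates, one per branch.
mol_rates = draw_branch_rates(len(durations), mean_rate=0.01, sigma=0.5, rng=rng)

# Correlated scenario: morphological rates proportional to molecular rates.
morph_rates_correlated = [5.0 * r for r in mol_rates]

# Uncorrelated scenario: morphological rates drawn independently.
morph_rates_independent = draw_branch_rates(
    len(durations), mean_rate=0.05, sigma=0.5, rng=rng
)

mol_lengths = to_phylogram(durations, mol_rates)
morph_lengths_correlated = to_phylogram(durations, morph_rates_correlated)
morph_lengths_independent = to_phylogram(durations, morph_rates_independent)

print("molecular:", [round(x, 4) for x in mol_lengths])
print("morphological (correlated):", [round(x, 4) for x in morph_lengths_correlated])
print("morphological (independent):", [round(x, 4) for x in morph_lengths_independent])
```

In this toy setup, a larger `sigma` produces more among-lineage rate variation, which is the quantity the abstract links to improved detection of correlated rates.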
The methods for detecting correlated rates of evolution were implemented in RStudio, requiring the use of the following packages: adephylo (version 1.1.13); phylobase (version 0.8.10); jmuOutlier (version 2.2); diverge (version 2.0.4); data.table (version 1.14.6); tidyverse (version 1.3.2); readtext (version 0.81); treeio (version 1.22.0); dplyr (version 1.1.2). The R scripts are included in the supplementary data along with the results from all analyses.
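To make one of the five methods concrete, the sketch below illustrates the general idea behind a correlation of root-to-tip distances: sum the branch lengths from the root to each tip in the molecular and morphological phylograms, then correlate the two sets of path lengths. It is written in Python and is not the authors' R implementation. The two Newick strings are invented toy trees, the helper names `root_to_tip`, `pearson`, and `permutation_p` are hypothetical, and the choice of a Pearson coefficient with a one-sided permutation test is an assumption made for illustration.

```python
import random
import re
from math import sqrt

PUNCT = {"(", ")", ",", ";"}


def root_to_tip(newick):
    """Return {tip: summed branch length from the root} for a Newick string.

    Minimal parser: assumes unquoted labels without spaces or punctuation.
    """
    text = re.sub(r"\s+", "", newick)
    tokens = re.findall(r"[(),;]|[^(),;]+", text)
    stack = [[]]  # one list of [tip, distance] entries per open clade
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            stack.append([])
        elif tok == ")":
            clade = stack.pop()
            length = 0.0
            # An optional "label:length" token may follow the closing bracket.
            if i + 1 < len(tokens) and tokens[i + 1] not in PUNCT:
                value = tokens[i + 1].partition(":")[2]
                length = float(value) if value else 0.0
                i += 1
            for entry in clade:
                entry[1] += length  # add the stem branch to every descendant tip
            stack[-1].extend(clade)
        elif tok not in PUNCT:
            name, _, value = tok.partition(":")
            stack[-1].append([name, float(value) if value else 0.0])
        i += 1
    return {name: dist for name, dist in stack[0]}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p(x, y, n_perm=999, seed=1):
    """One-sided permutation p-value for a positive correlation."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(x, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Invented toy phylograms sharing one topology (not data from this study).
mol_tree = "((A:0.10,B:0.20):0.05,(C:0.30,D:0.40):0.10);"
morph_tree = "((A:1.0,B:2.0):0.5,(C:3.0,D:4.0):1.0);"

mol = root_to_tip(mol_tree)
morph = root_to_tip(morph_tree)
tips = sorted(mol)  # align the two trees by tip name
x = [mol[t] for t in tips]
y = [morph[t] for t in tips]

r = pearson(x, y)
p = permutation_p(x, y)
print("r =", round(r, 3), "permutation p =", round(p, 3))
```

Because tips share internal branches, root-to-tip paths are not statistically independent, so a simple test like this one should be read as a rough indication only.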