Dryad

The Dayhoff Exchange Score: A new metric to quantify site saturation in amino acid datasets prior to phylogenetic analysis

Data files

Aug 12, 2024 version files 8.82 GB
Dec 17, 2025 version files 14.52 GB
Dec 22, 2025 version files 14.52 GB

Abstract

Entropic site saturation is a persistent problem in phylogenetic analyses, where it can reduce the accuracy of topology reconstruction. It is fundamentally caused by large amounts of independent change along branches, which leaves the model unable to distinguish phylogenetic signal from noise. The Dayhoff Exchange Score (DE-score) is a new metric for assessing this form of site saturation within and between amino acid datasets. It provides both a whole-dataset overview and taxon-specific values that represent the contribution of a given taxon to the entropic site saturation of the whole dataset. We first assess the efficacy of this score at detecting increased entropic site saturation over 20,000 simulated datasets and compare it to the existing Slope R2 score. We then assess its efficacy in the face of potentially confounding factors: increasing taxon number, number of positions in the alignment, missing data, and noise. Finally, we use the DE-score to re-evaluate several previously published datasets to illustrate its efficacy.