Data from: Detecting macroevolutionary genotype-phenotype associations using error-corrected rates of protein convergence

Citation

Fukushima, Kenji; Pollock, David (2022), Data from: Detecting macroevolutionary genotype-phenotype associations using error-corrected rates of protein convergence, Dryad, Dataset, https://doi.org/10.5061/dryad.tx95x6b0v

Abstract

On macroevolutionary timescales, extensive mutations and phylogenetic uncertainty mask the signals of genotype-phenotype associations underlying convergent evolution. To overcome this problem, we extended the widely used framework of nonsynonymous-to-synonymous substitution rate ratios and developed a novel metric, ωC, which measures the error-corrected convergence rate of protein evolution. In simulations and real examples, ωC distinguishes natural selection from genetic noise and phylogenetic errors, and its accuracy allows an exploratory genome-wide search for adaptive molecular convergence without a phenotypic hypothesis or candidate genes. Using gene expression data, we explored over 20 million branch combinations in vertebrate genes and identified joint convergence of expression patterns and protein sequences, with amino acid substitutions at functionally important sites, providing hypotheses about undiscovered phenotypes. We further extended our method with a heuristic algorithm to detect highly repetitive convergence among computationally nontrivial higher-order phylogenetic combinations. Our approach allows bidirectional searches for genotype-phenotype associations, even in lineages that diverged hundreds of millions of years ago.

Funding

Japan Society for the Promotion of Science, Award: 18J00178

Alexander von Humboldt-Stiftung, Award: Sofja Kovalevskaja programme

Human Frontier Science Program, Award: RGY0082/2021

National Institutes of Health, Award: GM083127