Data from: Detecting macroevolutionary genotype-phenotype associations using error-corrected rates of protein convergence
Fukushima, Kenji; Pollock, David (2022), Data from: Detecting macroevolutionary genotype-phenotype associations using error-corrected rates of protein convergence, Dryad, Dataset, https://doi.org/10.5061/dryad.tx95x6b0v
On macroevolutionary timescales, extensive mutations and phylogenetic uncertainty mask the signals of genotype-phenotype associations underlying convergent evolution. To overcome this problem, we extended the widely used framework of nonsynonymous-to-synonymous substitution rate ratios and developed a novel metric, ωC, which measures the error-corrected convergence rate of protein evolution. In simulations and real examples, ωC distinguishes natural selection from genetic noise and phylogenetic errors, and its accuracy allows an exploratory genome-wide search for adaptive molecular convergence without phenotypic hypotheses or candidate genes. Using gene expression data, we explored over 20 million branch combinations in vertebrate genes and identified joint convergence of expression patterns and protein sequences, with amino acid substitutions at functionally important sites, providing hypotheses on undiscovered phenotypes. We further extended our method with a heuristic algorithm to detect highly repetitive convergence among higher-order phylogenetic combinations, which are computationally nontrivial to search. Our approach allows bidirectional searches for genotype-phenotype associations, even in lineages that have diverged for hundreds of millions of years.
Japan Society for the Promotion of Science, Award: 18J00178
Alexander von Humboldt-Stiftung, Award: Sofja Kovalevskaja programme
Human Frontier Science Program, Award: RGY0082/2021
National Institute of General Medical Sciences, Award: GM083127