Data from: Rapid divergence of a gamete recognition gene promoted macroevolution of Eutheria

Cite this dataset

Roberts, Emma (2023). Data from: Rapid divergence of a gamete recognition gene promoted macroevolution of Eutheria [Dataset]. Dryad.


Abstract

Speciation genes contribute disproportionately to species divergence, but few examples exist, especially in vertebrates. In mammals, the Zan gene encodes the sperm acrosomal protein zonadhesin that mediates species-specific adhesion to the egg’s zona pellucida. Here we identify Zan as a speciation gene in placental mammals. Genomic ontogeny revealed that Zan arose by repurposing a stem vertebrate gene that was lost in multiple lineages but retained in Eutheria on acquiring a function in egg recognition. A 112-species Zan sequence phylogeny, representing 17 of 19 placental Orders, resolved all species into monophyletic groups corresponding to recognized Orders and Suborders, with <5% unsupported nodes. Three other rapidly evolving germ cell genes (Adam2, Zp2, and Prm1), a paralogous somatic cell gene (TectA), and a mitochondrial gene commonly used for phylogenetic analyses (Cytb) all yielded trees with poorer resolution than the Zan tree and inferior topologies relative to a widely accepted mammalian supertree. Zan divergence through intense positive selection and domain duplications, together with accelerated divergence rates in the Myomorpha Suborder of Rodentia, produced dramatic species differences in the protein’s properties, with ordinal divergence rates generally reflecting the species richness of placental Orders, consistent with expectations for a speciation gene that acts across a wide range of taxa. Furthermore, Zan’s combined phylogenetic utility and divergence exceeded those of all other genes known to have evolved in Eutheria by positive selection, including the only other speciation gene, Prdm9. We conclude that species-specific egg recognition conferred by Zan’s functional divergence served as a mode of prezygotic reproductive isolation that promoted the extraordinary adaptive radiation and success of Eutheria.


Methods

We aligned authentic Zan nucleotide sequences encoding the zonadhesin protein's von Willebrand D0, D1, D2, D3, and approximately the first 25% of D4 domains (range: 330–1560 nucleotides each) using T-Coffee software in Meta-coffee mode. To confirm correct reading frames and detect premature stop codons, we translated the aligned sequences in MEGA X. We examined 88 maximum likelihood models using hierarchical likelihood ratio tests with Akaike Information Criterion correction in jModelTest 2.1.10 to detect the best-fit model of nucleotide substitution, and identified GTR+G+I as the most appropriate model. We selected a Zan-like gene from the Chinese soft-shelled turtle (Pelodiscus sinensis) as the outgroup in the Zan alignment. To perform likelihood analysis under a Bayesian inference model, we used MrBayes 3.2.6 with the following options: 2 independent runs with four chains, one cold and three heated (Metropolis-coupled Markov chain Monte Carlo numerical method); 10 million generations; and a sample frequency of every 100th generation from the last 750,000 generated. We then constructed a consensus tree (50% majority rule) from the remaining trees and plotted posterior probability values on the topology in FigTree 1.4.4. We applied the same methodology to an additional 35 rapidly evolving genes reported in the literature.
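
The reading-frame check described above was performed in MEGA X. For readers who want to repeat a comparable check on the alignment files without MEGA, the following is a minimal Python sketch, not the authors' procedure. It assumes the standard nuclear genetic code and that each aligned sequence starts in frame. The function names (`translate_aligned`, `premature_stops`) and the toy sequences are hypothetical and are not taken from the dataset.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_aligned(seq):
    """Translate an in-frame, possibly gapped nucleotide sequence.

    A full gap codon ('---') becomes '-'; any codon containing a partial
    gap or an ambiguous base becomes 'X'. Trailing bases that do not
    form a complete codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    protein = []
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        protein.append("-" if codon == "---" else CODON_TABLE.get(codon, "X"))
    return "".join(protein)


def premature_stops(seq):
    """Return 0-based codon indices of stop codons before the final codon."""
    protein = translate_aligned(seq)
    return [i for i, aa in enumerate(protein[:-1]) if aa == "*"]


if __name__ == "__main__":
    # Toy sequences for illustration only (hypothetical, not Zan data).
    toy = {
        "clean": "ATGGCC---GGATAA",
        "broken": "ATGTAAGCCGGA",
    }
    for name, seq in toy.items():
        in_frame = len(seq) % 3 == 0
        print(name, translate_aligned(seq), in_frame, premature_stops(seq))
```

A sequence whose length is not a multiple of three, or that reports any premature stop index, would be a candidate for manual inspection in the alignment.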

Usage notes

T-Coffee, MEGA X, jModelTest 2.1.10, MrBayes 3.2.6, and Microsoft Office are required to open the uploaded data files.


Funding

American Society of Mammalogists

Oklahoma Biological Survey

Texas Academy of Science

Texas Tech University