Dryad

Data from: From β- to α-proteobacteria: the origin and evolution of rhizobial nodulation genes nodIJ

Abstract

Although many α- and some β-proteobacterial species are symbiotic with legumes, the evolutionary origin of nitrogen-fixing nodulation remains unclear. Using large-scale phylogenetic profiling, we examined α- and β-proteobacteria with sequenced genomes and revealed the evolutionary origin of two nodulation genes. These genes, nodI and nodJ (nodIJ), play key roles in the secretion of Nod factors, which are recognized by legumes during nodulation. We found that only the nodulating β-proteobacteria, including the novel strains isolated in this study, possess both nodIJ and their paralogous genes (DRA-ATPase/permease genes). Contrary to the widely accepted scenario of the α-proteobacterial origin of rhizobia, our exhaustive phylogenetic analysis showed that the entire nodIJ clade is included in the clade of Burkholderiaceae DRA-ATPase/permease genes, i.e., the nodIJ genes originated from gene duplication in a lineage of the β-proteobacterial family. After duplication, the evolutionary rates of nodIJ were significantly accelerated relative to those of homologous genes, which is consistent with their novel function in nodulation. The likelihood analyses suggest that this accelerated evolution is associated not with changes in nonsynonymous/synonymous substitution rates or transition/transversion rates, but with changes in GC content. Although the low GC content of the nodulation genes has been assumed to reflect past horizontal transfer events from donor rhizobial genomes with low GC content, no rhizobial genome with such low GC content has yet been found. Our results encourage a reconsideration of the origin of nodulation and suggest new perspectives on the role of the GC content of bacterial genes in functional adaptation.