
Data from: The landscape of realized homologous recombination in pathogenic bacteria

Cite this dataset

Yahara, Koji et al. (2016). Data from: The landscape of realized homologous recombination in pathogenic bacteria [Dataset]. Dryad.


Recombination enhances the adaptive potential of organisms by allowing genetic variants to be tested on multiple genomic backgrounds. Its distribution in the genome can provide insight into the evolutionary forces that underlie traits such as the emergence of pathogenicity. Here we examined landscapes of realized homologous recombination of 500 genomes from ten bacterial species, and found that all species have ‘hot’ regions with elevated rates relative to the genome average. We examined the size, gene content and chromosomal features associated with these regions and the correlations between closely related species. The recombination landscape is variable and evolves rapidly. For example, in Salmonella, only short regions of around 1 kb in length are hot, while in the closely related species Escherichia coli, some hot regions exceed 100 kb, spanning many genes. Only Streptococcus pyogenes shows evidence for the positive correlation between GC content and recombination that has been reported for several eukaryotes. Genes with functions related to the cell surface/membrane are often found in recombination hot regions, but E. coli is the only species where genes annotated as “virulence associated” are consistently hotter. There is also evidence that some genes with “housekeeping” functions tend to be overrepresented in cold regions. For example, ribosomal proteins showed low recombination in all of the species. Among specific genes, transferrin binding proteins are recombination hot in all three of the species in which they were found, and are subject to inter-species recombination.

Usage notes