Data from: Efficient inference of recombination hot regions in bacterial genomes

Yahara, Koji; Didelot, Xavier; Ansari, M. Azim; Sheppard, Samuel K.; Falush, Daniel

Published Mar 04, 2014 on Dryad. https://doi.org/10.5061/dryad.kp14p

Data files

Mar 04, 2014 version files 6.14 MB

jejuni200_data.zip (6.14 MB)

Abstract

In eukaryotes, detailed surveys of recombination rates have shown variation at multiple genomic scales and the presence of "hotspots" of highly elevated recombination. In bacteria, studies of recombination rate variation are less developed, in part because few analysis methods take into account the clonal context within which bacterial evolution occurs. Here we focus on identifying "hot regions" of the genome where DNA is transferred frequently between isolates. We present a computationally efficient algorithm based on the recently developed "chromosome painting" algorithm, which characterizes patterns of haplotype sharing across a genome. We compare the average genome-wide painting, which principally reflects clonal descent, with the painting for each site, which additionally reflects the specific deviations at that site due to recombination. Using simulated data, we show that hot regions have consistently higher deviations from the genome-wide average than normal regions. We applied our approach to previously analysed Escherichia coli genomes and found that its results are highly correlated with the number of recombination events affecting each site inferred by ClonalOrigin, a method that is applicable only to small numbers of genomes. Furthermore, we analysed recombination hot regions in Campylobacter jejuni using 200 genomes. We identified three recombination hot regions, which are enriched for genes related to membrane proteins. Our approach and its implementation, which is downloadable from https://github.com/bioprojects/orderedPainting, will help to develop a new phase of population genomic studies of recombination in prokaryotes.
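The comparison described in the abstract, between each site's painting and the genome-wide average painting, can be illustrated with a minimal sketch. This is not the orderedPainting implementation or its distance statistic: the array layout (sites x recipients x donors), the sum-of-squares deviation, the quantile cut-off, and the function names `site_deviations` and `flag_hot_sites` are all assumptions made for illustration.

```python
import numpy as np


def site_deviations(paintings):
    """Per-site deviation from the genome-wide average painting.

    `paintings` is a hypothetical array of shape
    (n_sites, n_recipients, n_donors) holding, for each site, how much
    each recipient copies from each donor.
    """
    paintings = np.asarray(paintings, dtype=float)
    if paintings.ndim != 3:
        raise ValueError("expected shape (n_sites, n_recipients, n_donors)")
    # Genome-wide average painting (assumed to mainly reflect clonal descent).
    genome_avg = paintings.mean(axis=0)
    # Sum of squared differences per site: a simple stand-in statistic,
    # not the one used by the authors.
    diff = paintings - genome_avg
    return (diff ** 2).sum(axis=(1, 2))


def flag_hot_sites(deviations, quantile=0.99):
    """Indices of sites whose deviation exceeds the given quantile (assumed cut-off)."""
    deviations = np.asarray(deviations, dtype=float)
    threshold = np.quantile(deviations, quantile)
    return np.flatnonzero(deviations > threshold)


if __name__ == "__main__":
    # Toy data: three sites follow the same pattern, one site deviates.
    clonal = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    toy = np.array([clonal, clonal, swapped, clonal])
    dev = site_deviations(toy)
    print(dev)
    print(flag_hot_sites(dev, quantile=0.5))
```

On real data, the per-site paintings would come from a chromosome painting run, and runs of neighbouring high-deviation sites, rather than single sites, would be the candidates for hot regions.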

Usage notes

Allele sequences of the 200 C. jejuni isolates
