Adaptation by natural selection depends on the rates, effects and interactions of many mutations, making it difficult to determine what proportion of mutations in an evolving lineage are beneficial. Here we analysed 264 complete genomes from 12 Escherichia coli populations to characterize their dynamics over 50,000 generations. The populations that retained the ancestral mutation rate support a model in which most fixed mutations are beneficial, the fraction of beneficial mutations declines as fitness rises, and neutral mutations accumulate at a constant rate. We also compared these populations to mutation-accumulation lines evolved under a bottlenecking regime that minimizes selection. Nonsynonymous mutations, intergenic mutations, insertions and deletions are overrepresented in the long-term populations, further supporting the inference that most mutations that reached high frequency were favoured by selection. These results illuminate the shifting balance of forces that govern genome evolution in populations adapting to a new environment.
REL606, ancestral genome
This file contains the ancestral genome in GFF3 format.
REL606.gff3
REL606 non covered regions
This file contains the positions in the ancestral genome that, because of repeats, are not covered by the short-read sequencing technology used.
REL606.L20.G15.P0.M35.mask.gd
LTEE mutator status file
This file provides information about the point mutator status of each clone. Clones that have an increased point mutation rate have a value of 1 in column 5.
LTEE_Mutator_info.txt
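The column-5 flag described above can be read with a few lines of Python. This is a minimal sketch, not part of the deposited scripts: it assumes the file is tab-delimited and that the clone name sits in column 1, neither of which is stated here, and the function name point_mutator_clones is hypothetical. The example rows are made up for illustration.

```python
import csv
import io


def point_mutator_clones(handle, delimiter="\t", name_column=0, status_column=4):
    """Return names of clones whose column 5 (index 4) holds the value 1.

    The delimiter and the position of the clone name are assumptions;
    adjust them to match the actual layout of the status file.
    """
    mutators = []
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) <= status_column:
            continue  # skip blank or short lines
        if row[status_column].strip() == "1":
            mutators.append(row[name_column])
    return mutators


# Made-up rows for illustration only; not taken from LTEE_Mutator_info.txt.
example = io.StringIO(
    "cloneA\tx\tx\tx\t0\n"
    "cloneB\tx\tx\tx\t1\n"
)
print(point_mutator_clones(example))
```

To run it on the real file, pass an open file handle instead of the in-memory example, for instance `open("LTEE_Mutator_info.txt", newline="")`.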
LTEE mutation file
This file stores all mutations that were identified in the sequenced genomes of the LTEE, except those in masked regions or in the neighbourhood of IS elements. It is used for most of the mutation analyses.
oli.LTEE.final_masked.no_IS_adjacent.tab
MAE mutation file
This file stores all mutations that were identified in the sequenced genomes of the MAE (Mutation Accumulation Experiment), except those in masked regions or in the neighbourhood of IS elements.
oli.MAE.final_masked.no_IS_adjacent.tab
MAE fitness
This file contains the fitness estimates of the sequenced MAE clones.
MAE_fitness.csv
LTEE spectrum counts
This file describes the spectrum of mutations that have accumulated through time in each population of the LTEE.
spectrum_counts.csv
LTEE mutation counts (1)
This file describes, for each sequenced clone of the LTEE, the count of each mutation type recovered. It is used for the genome size figure.
count.LTEE.final_masked.csv
LTEE mutation counts (2)
This file describes, for each sequenced clone of the LTEE, the count of each mutation type recovered. It is produced by the program ComputeMutationThroughTimeDryad.pl and is used to produce most figures.
MutationTypesThroughTime.txt
MAE mutation counts
This file describes, for each sequenced clone of the MAE, the count of each mutation type recovered. It is produced by the program ComputeMutationThroughTimeDryad.pl and is used to produce the figures that use the MAE as a reference.
MutationTypesThroughTimeMAE.txt
LTEE point mutation matrix for phylogeny
This file contains an array in which, for each sequenced LTEE clone, the presence and absence of each recovered point mutation are stored as 0 or 1 respectively. It is produced by ComputeMutationThroughTimeDryad.pl and is used to produce Figure 2.
MutRArray.txt
REL606 ancestral genome composition
This file, produced by GenomeCompositionComputer.pl, reports the composition of the ancestral genome (synonymous, nonsynonymous, intergenic) for each type of point mutation (AT to CG, AT to GC ...). It is used to compute the expected number of nonsynonymous mutations from the observed number of synonymous mutations.
GenomeComposition.txt
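As an illustration of how such a composition table could yield an expected nonsynonymous count, the sketch below scales the observed synonymous count of each point mutation type by the ratio of nonsynonymous to synonymous opportunities for that type, then sums over types. This scaling is an assumption about the calculation, not a transcription of ScriptFiguresDryad.r, which may differ. The function name, the dictionary layout and all numbers are hypothetical.

```python
def expected_nonsynonymous(observed_synonymous, composition):
    """Expected nonsynonymous count, summed over point mutation types.

    Assumed relation, per mutation type:
        expected_N = observed_S * (nonsynonymous sites / synonymous sites)
    """
    total = 0.0
    for mutation_type, syn_count in observed_synonymous.items():
        sites = composition[mutation_type]
        total += syn_count * sites["nonsynonymous"] / sites["synonymous"]
    return total


# Made-up values for illustration only; not taken from GenomeComposition.txt.
composition = {
    "AT to GC": {"synonymous": 100, "nonsynonymous": 300},
    "AT to CG": {"synonymous": 200, "nonsynonymous": 400},
}
observed_synonymous = {"AT to GC": 2, "AT to CG": 5}

print(expected_nonsynonymous(observed_synonymous, composition))  # 16.0
```

Working per mutation type rather than pooling all sites lets a biased mutation spectrum in the observed synonymous data carry over into the expectation.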
ComputeMutation command file LTEE
This file is the command file used by ComputeMutationThroughTimeDryad.pl to compute the mutation counts and the mutation matrix. This version requests that all mutations of the LTEE be reported.
CmdfileLTEE.txt
ComputeMutation command file MAE
This file is the command file used by ComputeMutationThroughTimeDryad.pl to compute the mutation counts and the mutation matrix. This version requests that all mutations of the MAE be reported.
CmdfileMAE.txt
ComputeMutation command file LTEE mutator
This file is the command file used by ComputeMutationThroughTimeDryad.pl to compute the mutation counts and the mutation matrix. This version requests that only mutations occurring in point mutator clones of the LTEE be reported.
CmdfileLTEE_Mutator.txt
ComputeMutation command file LTEE non mutator
This file is the command file used by ComputeMutationThroughTimeDryad.pl to compute the mutation counts and the mutation matrix. This version requests that only mutations occurring in non-point-mutator clones of the LTEE be reported.
CmdfileLTEE_nonMutator.txt
Compute genome composition perl script
This perl script produces the GenomeComposition.txt file, which reports the number of synonymous, nonsynonymous and intergenic mutations for each of the 6 possible types of point mutation. See ReadMe.pdf for usage.
GenomeCompositionComputer.pl
Figures R script
This R script is used to create all figures and extended data figures. See ReadMe.pdf for usage.
ScriptFiguresDryad.r
Gstat perl script
This perl script computes the G statistic of convergence and performs the randomisation of point mutations used to estimate the p-value of the estimated G. See ReadMe.pdf for usage.
ConvergenceGstatDryad.pl
MAE mutator status file
This file provides information about the point mutator status of each clone of the MAE. Clones that have an increased point mutation rate have a value of 1 in column 5.
MAE_Mutator_info.txt
Compute mutation through time perl script
This perl script computes the spectrum of mutations found in each clone as well as the matrix used for the phylogeny. See ReadMe.pdf for usage.
ComputeMutationThroughTimeDryad.pl
ReadMe file
This file explains how to use the 4 scripts to produce all figures and G scores of the article.