Proteobacteria species tree in "Evolutionary origins and diversification of proteobacterial mutualists"

Protein sequences for phylogenetic reconstruction were selected based on conservation and lack of horizontal transfer among taxa. The following gene sequences were selected: dnaG, frr, gcp, infC, leuS, nusA, pgk, pheS, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplO, rplP, rplR, rplS, rplT, rplV, rpmA, rpoA, rpoB, rpsB, rpsC, rpsD, rpsE, rpsG, rpsH, rpsI, rpsJ, rpsK, rpsL, rpsM, rpsO, rpsQ, rpsS, secY, serS, smpB, tsf, and ychF.

Orthologs were identified through complementary methods. First, we searched for proteins annotated in the [Kyoto Encyclopedia of Genes and Genomes Orthology](http://www.genome.jp/kegg/ko.html) database. For organisms in our study that had no entries anywhere in the KEGG database (e.g., multiple strains of the same species), we identified orthologs through searches of the NCBI Protein database. These searches were supplemented with reciprocal BLAST searches, using protein sequences from closely related organisms as the initial queries. This step was intended to distinguish between multiple annotations of the same gene in an organism and to confirm the orthologous relationship of the proteins.

Orthologous sequences were downloaded from the Batch-Entrez website and aligned with MUSCLE using default settings. Alignments were concatenated using the BioPerl script concat_aln and trimmed with the program trimAl using the "strict" setting, resulting in a concatenated alignment of 7,828 amino acids. Missing proteins were represented by gaps.

MCMC phylogenies were reconstructed with MrBayes 3.2-cvs using a fixed-rate model of evolution selected by an MCMC sampler that explored multiple models. Three MCMC runs of 10^6 generations each converged on a stationary distribution (average standard deviation of split frequencies < 0.01).
In each run, one tree was sampled every 100 generations from the final 24,100 (post-burn-in) generations, and a majority-rule consensus tree was generated from the resulting pool of 723 trees (241 per run). The majority-rule consensus tree is shown in this file.
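
The concatenation step described above, in which a protein missing from a taxon is represented by gaps, can be illustrated with a minimal Python sketch. This is not the BioPerl concat_aln script used in the study; the function name `concatenate`, the dictionary-based alignment format, and the toy sequences and taxon names are all assumptions made for illustration.

```python
def concatenate(alignments, taxa):
    """Concatenate per-gene alignments into one sequence per taxon.

    alignments: list of dicts mapping taxon name -> aligned sequence
                (all sequences within one dict must have equal length).
    taxa:       list of every taxon that should appear in the output.
    A taxon absent from a gene alignment is padded with '-' characters.
    """
    parts = {taxon: [] for taxon in taxa}
    for alignment in alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must share a length")
        width = lengths.pop()
        for taxon in taxa:
            # Use the aligned sequence if present, otherwise a run of gaps.
            parts[taxon].append(alignment.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy data (hypothetical sequences): taxon2 lacks the second protein.
gene_a = {"taxon1": "MK-L", "taxon2": "MKAL"}
gene_b = {"taxon1": "GTR"}
supermatrix = concatenate([gene_a, gene_b], ["taxon1", "taxon2"])
```

Padding with gaps keeps every row of the concatenated alignment the same length, which downstream trimming and tree-building programs generally require.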
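
The idea behind a majority-rule consensus of pooled tree samples can be sketched as follows. This is a simplified illustration, not the MrBayes procedure: it assumes rooted trees written as nested tuples of taxon names and counts clades rather than the bipartitions of unrooted trees. The names `clades` and `majority_rule_clades` and the four-taxon toy trees are hypothetical.

```python
from collections import Counter


def clades(tree):
    """Return (leaf set, set of clades) for a nested-tuple rooted tree."""
    if isinstance(tree, str):
        # A single leaf contributes no internal clade.
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def majority_rule_clades(trees, threshold=0.5):
    """Map each clade found in more than `threshold` of trees to its frequency."""
    counts = Counter()
    for tree in trees:
        all_leaves, found = clades(tree)
        found.discard(all_leaves)  # drop the trivial clade holding every taxon
        counts.update(found)
    total = len(trees)
    return {c: n / total for c, n in counts.items() if n / total > threshold}


# Toy pool of three sampled trees; two of them share the same topology.
pool = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]
support = majority_rule_clades(pool)
```

In this toy pool the clades {A, B} and {C, D} each appear in two of three trees and are retained, while {A, C} and {B, D} fall below the 50% threshold. Applied to the study's pool, the same counting would run over all 723 sampled trees.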