Do the amino acid sequence identities of residues that make contact across protein interfaces covary during evolution? If so, such covariance could be used to predict contacts across interfaces and assemble models of biological complexes. We find that residue pairs identified by a pseudo-likelihood-based method as covarying across protein-protein interfaces in the 50S ribosomal subunit and 28 additional bacterial protein complexes of known structure are almost always in contact in the complex, provided that the number of aligned sequences is greater than the average of the lengths of the two proteins. We use this method to make subunit contact predictions for an additional 36 protein complexes of unknown structure, and present models based on these predictions for the tripartite ATP-independent periplasmic (TRAP) transporter, the tripartite efflux system, the pyruvate formate lyase-activating enzyme complex, and the methionine ABC transporter.
PDB file for TRAP complex
The predicted complex of the tripartite ATP-independent periplasmic (TRAP) transporter. It is composed of three proteins: two integral membrane proteins YIAM and YIAN, and one periplasmic protein YIAO. The PDB file was sampled using UniProt E. coli sequences of YIAM_ECOLI, YIAN_ECOLI and YIAO_ECOLI.
YIA_MNO.pdb
PDB file for D-methionine transport system
Predicted complex of the D-methionine transporter, an ATP-driven system that transports methionine. The PDB file was sampled using UniProt E. coli sequences of METI_ECOLI and METQ_ECOLI.
METI_METQ.pdb
PDB file for tripartite efflux system
Predicted complex of the tripartite efflux pump. This family of complexes spans both the inner and outer membranes and is widely used by bacteria to pump toxic compounds out of the cell. The PDB file was modeled using E. coli sequences of MDTP_ECOLI and MDTN_ECOLI.
MDTP_MDTN.pdb
PDB file for PFL complex
Predicted complex of the pyruvate formate-lyase-activating enzyme. Pyruvate formate-lyase (PFL) catalyzes the formation of acetyl-CoA and formate from pyruvate and CoA in the fermentation pathway. The PDB file was sampled using the E. coli sequences of PFLA_ECOLI and PFLB_ECOLI.
PFLA_PFLB.pdb
Ribosome 50S alignments
Multiple sequence alignments used in the ribosome 50S analysis. Each FASTA file begins with the T. thermophilus amino acid sequence pair, with other homologous sequences trimmed to match its length. For an interactive list, please visit: http://gremlin.bakerlab.org/cplx.php?mode=ribo
ribosome_50S_alignments.zip
E. coli paired alignments
Multiple sequence alignments used in the E. coli gene pair analysis. Each FASTA file begins with the E. coli amino acid sequence pair, with other homologous sequences trimmed to match its length. For an interactive list, please visit: http://gremlin.bakerlab.org/cplx.php?mode=ecoli
ecoli_paired_alignments.zip
PDB benchmark alignments
Multiple sequence alignments used for the PDB benchmark set. Each FASTA file begins with the PDB amino acid sequence pair, with other homologous sequences trimmed to match its length. For an interactive list, please visit: http://gremlin.bakerlab.org/cplx.php?mode=pdb
PDB_benchmark_alignments.zip
PDB benchmark alignments NDX
Multiple sequence alignments used for the part of the PDB benchmark set restricted to the NADH dehydrogenase complex. Each FASTA file begins with the T. thermophilus amino acid sequence pair, with other homologous sequences trimmed to match its length. For an interactive list, please visit: http://gremlin.bakerlab.org/cplx.php?mode=ndx
PDB_benchmark_alignments_NDX.zip
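Readers who download the alignment archives may want to check whether a given alignment meets the sequence-depth condition stated in the abstract (more aligned sequences than the average of the two protein lengths). The following is only a minimal sketch, not the authors' code. The function names `read_fasta` and `has_enough_sequences`, the toy alignment, and the toy protein lengths are all hypothetical, and the sketch assumes that each FASTA record holds the two proteins concatenated into one aligned row, which may not match the layout of the actual files.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text from a file-like object into (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def has_enough_sequences(n_sequences, len_a, len_b):
    """Depth condition from the abstract: N > (len_a + len_b) / 2."""
    return n_sequences > (len_a + len_b) / 2


# Toy paired alignment (hypothetical data): the first record is the reference
# pair, and the homologs have been trimmed to the same number of columns.
toy_alignment = StringIO(
    ">ref_pair\nMKTAYI\n"
    ">hom1\nMK-AYL\n"
    ">hom2\nMRTAYI\n"
    ">hom3\nMKSAYI\n"
)

records = read_fasta(toy_alignment)

# Assumed lengths of the two toy proteins; together they span all columns.
len_a, len_b = 3, 3
assert len_a + len_b == len(records[0][1])

enough = has_enough_sequences(len(records), len_a, len_b)
print(f"{len(records)} sequences, average protein length "
      f"{(len_a + len_b) / 2}: meets depth condition = {enough}")
```

To run this on a real file, replace the `StringIO` object with `open(path)` and supply the lengths of the two proteins in the pair, after confirming how the two sequences are laid out in each record.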