Domain organization of lentiviral and betaretroviral surface envelope glycoproteins modeled with AlphaFold
Data files
Nov 09, 2021 version files 2.72 MB
- CAEV-63_gp135.pdb
- HERV-K_SU.pdb
- JRSV_SU.pdb
- MMTV_SU.pdb
- README.txt
- Visna_KV1772_gp135.pdb
Abstract
The surface envelope glycoproteins of non-primate lentiviruses and betaretroviruses share sequence similarity with the inner proximal domain β-sandwich of the human immunodeficiency virus type 1 (HIV-1) gp120 glycoprotein that faces the transmembrane glycoprotein, as well as patterns of cysteine and glycosylation site distribution that point to a similar two-domain organization in at least some lentiviruses. Here, high-reliability models of the surface glycoproteins obtained with the AlphaFold algorithm are presented for the gp135 glycoprotein of the small ruminant lentiviruses caprine arthritis-encephalitis virus (CAEV) and visna virus, and for the betaretroviruses jaagsiekte sheep retrovirus (JSRV), mouse mammary tumor virus (MMTV) and consensus human endogenous retrovirus type K (HERV-K). The models confirm and extend the inner domain structural conservation in these viruses and identify two outer domains with a putative receptor binding site in the CAEV and visna virus gp135. The location of that site is consistent with patterns of sequence conservation and glycosylation site distribution in gp135. In contrast, a single domain is modeled for the JSRV, MMTV and HERV-K betaretrovirus envelope proteins; it is highly conserved structurally in the proximal region and structurally diverse in the apical regions likely to interact with cell receptors. The models presented here identify sites in small ruminant lentivirus and betaretrovirus envelope glycoproteins that are likely to be critical for virus entry and for virus neutralization by antibodies, and they will facilitate the functional and structural characterization of these glycoproteins.
Methods
The mature regions of the surface envelope glycoproteins of non-primate lentiviruses and betaretroviruses were modeled using a simplified AlphaFold 2 method without templates, accessible with graphics processing unit (GPU) support through the Colab server at https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb. GenBank accession numbers for the sequences used in modeling of CAEV-CO, CAEV-63, visna virus KV1772 gp135 and visna virus K1514 gp135, and of the FIV, EIAV, BIV, JDV, MMTV and JSRV surface envelope glycoproteins, are M60855, L06906, U21603, X01811 and M80216.
Usage notes
The B-factor column (column 11) of each PDB file contains the pLDDT reliability scores for the structural predictions.
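
As an illustration of how the pLDDT scores might be read back from the B-factor column, the following Python sketch parses the C-alpha ATOM records of a PDB file and reports a short summary. It is a minimal example, not part of the deposited data: the function names read_plddt and summarize are hypothetical, the fixed-width positions follow the standard PDB ATOM record layout (B-factor in character positions 61-66, which is the 11th field of a typical record), and the cutoff of 70 is an assumed threshold for "confident" residues rather than a value stated in this record.

```python
from statistics import mean


def read_plddt(pdb_lines):
    """Return {(chain, residue_number): pLDDT} from the C-alpha ATOM records.

    Assumes standard fixed-width PDB formatting, where the B-factor field
    (used here to store pLDDT) occupies character positions 61-66.
    """
    scores = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, TER, REMARK and other records
        if line[12:16].strip() != "CA":
            continue  # one value per residue: use the C-alpha atom only
        chain = line[21]
        residue_number = int(line[22:26])
        scores[(chain, residue_number)] = float(line[60:66])
    return scores


def summarize(scores, cutoff=70.0):
    """Summarize per-residue pLDDT values.

    The cutoff is an assumed threshold for a "confident" residue.
    """
    values = list(scores.values())
    if not values:
        raise ValueError("no C-alpha ATOM records found")
    confident = sum(value >= cutoff for value in values)
    return {
        "residues": len(values),
        "mean_plddt": mean(values),
        "fraction_confident": confident / len(values),
    }


if __name__ == "__main__":
    import sys

    # Usage (file name taken from the Data files list):
    #   python plddt_summary.py CAEV-63_gp135.pdb
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, summarize(read_plddt(handle)))
```

Reading by character position rather than splitting on whitespace is a deliberate choice here, since adjacent PDB fields can run together without a separating space.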