Dryad

Data from: Differentiating founder and chronic HIV envelope sequences

Cite this dataset

Murray, John M. et al. (2018). Data from: Differentiating founder and chronic HIV envelope sequences [Dataset]. Dryad. https://doi.org/10.5061/dryad.r19c2

Abstract

Significant progress has been made in characterizing broadly neutralizing antibodies against the HIV envelope glycoprotein Env, but an effective vaccine has proven elusive. Vaccine development would be facilitated if common features of early founder viruses required for transmission could be identified. Here we employ a combination of bioinformatic and operations research methods to determine the most prevalent features that distinguish 78 subtype B and 55 subtype C founder Env sequences from an equal number of chronic sequences. There were a number of equivalent optimal networks (based on the fewest covarying amino acid (AA) pairs or a measure of maximal covariance) that separated founders from chronics: 13 pairs for subtype B and 75 for subtype C. Every subtype B optimal solution contained the founder pairs 178–346 Asn-Val, 232–236 Thr-Ser, 240–340 Lys-Lys, 279–315 Asp-Lys, 291–792 Ala-Ile, 322–347 Asp-Thr, 535–620 Leu-Asp, 742–837 Arg-Phe, and 750–836 Asp-Ile; the most common optimal pairs for subtype C were 644–781 Lys-Ala (74 of 75 networks), 133–287 Ala-Gln (73/75), and 307–337 Ile-Gln (73/75). No pair was present in all optimal subtype C solutions, highlighting the difficulty in targeting transmission with a single vaccine strain. Relative to the size of its domain (0.35% of Env), the α4β7 binding site occurred most frequently among optimal pairs, especially for subtype C: 4.2% of optimal pairs (1.2% for subtype B). Early sequences from five subtype B pre-seroconverters each exhibited at least one clone containing one of the optimal features 553–624 (Ser-Asn), 724–747 (Arg-Arg), or 46–293 (Arg-Glu).
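
As a rough illustration of what a covarying AA-pair feature such as "178–346 Asn-Val" means in practice, the sketch below counts how often a given residue pair occurs in a set of aligned sequences. It is not the authors' method (the optimization that selects the pairs is not reproduced here). The function names, the toy sequences, and the use of 1-based positions in a shared alignment with one-letter residue codes are all assumptions made for this example.

```python
from typing import Sequence


def has_pair(aligned_seq: str, pos_a: int, res_a: str, pos_b: int, res_b: str) -> bool:
    """Return True if the sequence carries res_a at pos_a and res_b at pos_b.

    Positions are assumed to be 1-based indexes into an aligned sequence
    (hypothetical convention for this sketch); residues are one-letter codes.
    """
    if min(pos_a, pos_b) < 1 or max(pos_a, pos_b) > len(aligned_seq):
        return False  # position falls outside this sequence
    return aligned_seq[pos_a - 1] == res_a and aligned_seq[pos_b - 1] == res_b


def pair_frequency(seqs: Sequence[str], pos_a: int, res_a: str, pos_b: int, res_b: str) -> float:
    """Fraction of sequences in which the residue pair is present."""
    if not seqs:
        return 0.0
    hits = sum(has_pair(s, pos_a, res_a, pos_b, res_b) for s in seqs)
    return hits / len(seqs)


# Made-up 10-residue toy sequences, NOT data from this dataset.
toy_founders = ["MKNAVTSLQD", "MKNAVTSLQE", "MRNAVTSLQD"]
toy_chronics = ["MKDAVTSLQD", "MKNAITSLQD", "MKDAITSLQD"]

# Hypothetical pair: Asn (N) at position 3 together with Val (V) at position 5.
founder_freq = pair_frequency(toy_founders, 3, "N", 5, "V")
chronic_freq = pair_frequency(toy_chronics, 3, "N", 5, "V")
```

In this toy case the pair appears in every founder sequence and in no chronic sequence; a pair with a large difference between the two frequencies is the kind of feature that could help separate the groups.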

Usage notes