Polymerase trapping as mechanism of H5 highly pathogenic avian influenza virus genesis
Data files
Dec 22, 2025 version, files total 6.80 GB
- fastq.zip (6.80 GB)
- README.md (5.06 KB)
- sample_information.tsv (24.43 KB)
Abstract
Highly pathogenic avian influenza viruses (HPAIVs) derive from H5 and H7 low pathogenic avian influenza viruses (LPAIVs). Although insertion of a furin-cleavable multibasic cleavage site (MBCS) in the hemagglutinin gene was identified decades ago as the genetic basis for the LPAIV-to-HPAIV transition, the mechanisms underlying the occurrence of insertion are unknown. Here, we show that transient H5 RNA structures predicted to trap the influenza virus polymerase on purine-rich sequences drive nucleotide insertions, providing the first strong empirical evidence of RNA structure involvement in MBCS acquisition. Introduction of H5-like sequences and structures into an H6 hemagglutinin resulted in MBCS-yielding insertions. Our results show that nucleotide insertions that underlie H5 HPAIV emergence result from a previously unknown RNA-structure-driven diversity-generating mechanism, which could be shared with other RNA viruses.
Authors: Mathis Funk, Monique I. Spronken, Roy M. Hutchinson, Benoit Arragain, Pauline Juyoux, Theo M. Bestebroer, Anja C.M. de Bruin, Alexander P. Gultyaev, Ron A.M. Fouchier, Stephen Cusack, Aartjan J.W. te Velthuis, Mathilde Richard
This README file describes the data package accompanying the above publication.
Context of the study
Highly pathogenic avian influenza viruses (HPAIVs) cause severe disease and high fatality in poultry. HPAIVs spontaneously emerge from low pathogenic avian influenza viruses (LPAIVs). Although insertion of a furin-cleavable multibasic cleavage site in the hemagglutinin (HA) gene was identified decades ago as the genetic basis for the LPAIV-to-HPAIV transition, the exact mechanisms underlying this insertion have remained unknown.
We previously suggested that these insertions are driven by transient RNA template structures that form around the influenza polymerase as it replicates this region. Here, we performed circular sequencing of the cleavage site in wild-type HAs and in HAs in which putative polymerase-trapping RNA structures were altered by nucleotide substitutions in the vicinity of the cleavage site.
Experimental design
A virus-free influenza replication system was used to detect insertions without constraints at the protein level or selection biases. Mammalian cells (293T cells) or avian cells (DF1 cells) were transfected with plasmids coding for the influenza virus RNA-dependent RNA polymerase (RdRp), the nucleoprotein, and the hemagglutinin as viral RNA (vRNA) or complementary RNA (cRNA), to reconstitute viral ribonucleoproteins, the replication unit of influenza viruses. Two days after transfection, cells were lysed and total RNA was extracted. The hemagglutinin cleavage site was sequenced using a circular next-generation sequencing approach that accurately discriminates insertions made by the influenza RdRp from background insertions introduced during plasmid transcription by the human polymerase I, allowing detection of rare insertions with high confidence. Briefly, total RNA was circularized, and DNA was produced through reverse transcription and second-strand synthesis. Amplicons were then generated by polymerase chain reaction and sequenced on the Illumina platform.
Type of samples
Mutant hemagglutinins with modified RNA structures or sequences were designed using an in-house bioinformatic tool for predicting transient RNA structures, and their impact on insertion pattern and frequency was assessed. Each hemagglutinin construct was tested in two independent transfection experiments (replicate experiments 1 and 2).
Hemagglutinins of three avian influenza viruses were studied: IN (A/Indonesia/5/2005, H5N1), NL (A/mallard/Netherlands/3/1999, H5N1) and SW (A/mallard/Sweden/81/2002, H6N1).
Hemagglutinin cleavage sites were mutated from the low pathogenic avian influenza virus H5 or H6 consensus sequence (RETR or IETR, respectively) to REKR or RKKR.
In addition, substitutions were introduced in parts flanking the hemagglutinin cleavage site to either disrupt or stabilize predicted RNA structures.
To understand whether insertion patterns were specific to the hemagglutinin cleavage site, two other regions were studied in the NL hemagglutinin (NL-556 and NL-1275).
To confirm the role of a putative transient cRNA structure, unidirectional replication experiments were performed, in which either cRNA or vRNA was produced by the influenza virus RdRp (IN-cv (cRNA to vRNA) or IN-vc (vRNA to cRNA)).
Each sample was paired with a negative control to assess background insertions introduced during initial plasmid transcription by the human polymerase I. In these control samples, cells were transfected only with plasmids coding for the hemagglutinin vRNA or cRNA and the nucleoprotein (np samples, without influenza virus RdRp).
Type of data
Uploaded reads are single merged reads obtained from paired-end sequencing. Read pairs were merged using AdapterRemoval (v. 2.3.2) with the following command:
AdapterRemoval --file1 reads_1.fq --file2 reads_2.fq --basename output_paired --collapse --threads
Merged reads were not processed further, apart from the removal of contaminating sequences as stated in the Methods section of the associated article.
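As an illustration of how the merged reads could be inspected, the following minimal Python sketch iterates over a FASTQ file stored in a zip archive and tallies read lengths. It is not part of the published analysis. The function names `read_fastq` and `read_lengths_from_zip` are hypothetical, and the sketch assumes that the archive members are uncompressed, four-line-per-record FASTQ text files; if the members are gzip-compressed, the stream would first need to be wrapped with `gzip.open`.

```python
import io
import zipfile
from typing import Dict, Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of stream
        if not header.strip():
            continue  # tolerate blank lines between or after records
        sequence = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line, content ignored
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header[1:].strip(), sequence, quality


def read_lengths_from_zip(zip_path: str, member: str) -> Dict[int, int]:
    """Count merged-read lengths for one FASTQ member of a zip archive.

    Assumption: `member` is a plain-text FASTQ file inside the archive.
    """
    counts: Dict[int, int] = {}
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="ascii")
            for _, sequence, _ in read_fastq(text):
                counts[len(sequence)] = counts.get(len(sequence), 0) + 1
    return counts
```

Reading directly from the archive avoids extracting all 6.80 GB to disk; `zipfile.ZipFile(zip_path).namelist()` lists the available member names.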
Organization of the data
Reads are in the "fastq.zip" archive.
Metadata is found in the file named "sample_information.tsv".
The .tsv file is organized as follows:
- Column 1: sample number, corresponding to the number indicated at the beginning of each fastq file for sample identification, organized under categories corresponding to presentation of the data in the main text or supplementary data of the article.
- Column 2: sample name, corresponding to the name of each sample as described in the article.
- Column 3: next-generation sequencing platform used to sequence the sample.
- Column 4: name of the fastq file.
- Column 5: sample description as described in the article.
- Column 6: replicate number.
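The column layout above can be read by position, as in the following minimal Python sketch. The names `SampleRecord`, `load_samples` and `fastq_by_sample` are hypothetical and not part of the data package. Two assumptions are made: whether the file starts with a header row is not stated here, so this is exposed as the `has_header` parameter; and rows with fewer than six fields or an empty fastq file name are treated as category rows and skipped.

```python
import csv
from typing import Dict, List, NamedTuple, TextIO


class SampleRecord(NamedTuple):
    sample_number: str  # column 1
    sample_name: str    # column 2
    platform: str       # column 3
    fastq_file: str     # column 4
    description: str    # column 5
    replicate: str      # column 6


def load_samples(handle: TextIO, has_header: bool = False) -> List[SampleRecord]:
    """Parse the six-column metadata table by column position.

    Assumption: category rows lack a fastq file name in column 4
    and can therefore be skipped.
    """
    rows = csv.reader(handle, delimiter="\t")
    if has_header:
        next(rows, None)  # discard the header row, if the file has one
    records = []
    for row in rows:
        fields = [field.strip() for field in row]
        if len(fields) < 6 or not fields[3]:
            continue  # category row, blank line, or incomplete row
        records.append(SampleRecord(*fields[:6]))
    return records


def fastq_by_sample(records: List[SampleRecord]) -> Dict[str, Dict[str, str]]:
    """Map each sample name to {replicate number: fastq file name}."""
    grouped: Dict[str, Dict[str, str]] = {}
    for record in records:
        grouped.setdefault(record.sample_name, {})[record.replicate] = record.fastq_file
    return grouped
```

A typical call would be `load_samples(open("sample_information.tsv", newline=""))`; grouping by sample name then makes it straightforward to locate both replicate experiments of a construct.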
- Funk, Mathis; Spronken, Monique I.; Bestebroer, Theo M. et al. (2024). Transient RNA structures underlie highly pathogenic avian influenza virus genesis [Preprint]. Cold Spring Harbor Laboratory. https://doi.org/10.1101/2024.01.11.574333
