Data from: Targeted computational design of an interleukin-7 superkine with enhanced folding efficiency and immunotherapeutic efficacy

Lim, SeeKhai 1 ; Hu, Che-Ming Jack 1 ; Mou, Chung-Yuan 2

Research facility: Academia Sinica

Published Nov 18, 2025 on Dryad. https://doi.org/10.5061/dryad.9zw3r22v5

Data files

Nov 18, 2025 version files 168 MB

eLife-VOR-RA-2025-107671-sourcedata4.zip

167.99 MB
README.md

15.87 KB

Abstract

Interleukin-7 (IL-7) plays a central role in maintaining T cell development and immune homeostasis, and enhancing the cytokine’s immune-stimulatory functionality has broad therapeutic implications against various oncological malignancies. Herein, we show a computationally designed IL7 superkine, Neo-7, which exhibits enhanced folding efficiency and superior binding affinity to its cognate receptors. To streamline the protein candidate prediction and validation process, the loop region of IL7 was strategically targeted for redesign while most of the receptor-interacting regions were preserved. Leveraging advanced computational tools such as AlphaFold2, we show loop remodeling to rectify structural irregularities that allows for iterative stabilization of protein backbone and leads to identification of beneficial mutations conducive to receptor engagement. Neo-7 superkine shows improved thermostability and production yield, and it exhibits heightened immune-stimulatory and anticancer effect. These findings underscore the utility of a targeted computational approach for de novo cytokine development.

Dataset DOI: 10.5061/dryad.9zw3r22v5

Description of the data and file structure

File: eLife-VOR-RA-2025-107671-sourcedata3.zip

The source data is divided into different folders as follows:

Computational protein design

This folder contains the key models used as design blueprints and the outputs of the AlphaFold-based design. Readers are encouraged to use publicly available visualization software such as PyMOL (for .pse files) and BIOVIA Discovery Studio (for .dsv files) to view these files.

-Figure2-source data.pse = 3DI2 (3D structure for WT-IL7; PDB ID 3DI2) superimposed with Neo-7 Helix only (Helix section from 3DI2 used to design Neo-7) and Neo-7 skeleton (Neo-7 structure with connecting loops designed to connect the helices).

-Figure3A-source data.pse = Superimposition of Neo7-LD1-AF-singleseq (3D structure of Neo-7-LD1 modeled using AlphaFold in single-sequence mode), Neo-7 Helix only (helix section from 3DI2 used to design Neo-7), and Neo7-LD1-AF-default (3D structure of Neo-7-LD1 modeled using AlphaFold in default mode).

-Figure3B-source data.pse = Superimposition of Neo7-LD2 (3D structure of the Neo-7-LD2 design modeled using AlphaFold in single-sequence mode), Neo-7 cw Helix only (helix section from 3DI2 used to design Neo-7), and Neo7-LD2_R5K_T44I (3D structure of the Neo-7-LD2 design with R5K and T44I mutations modeled using AlphaFold in single-sequence mode).

-Figure 3C source data = AlphaFold model of human IL-7 in complex with human IL-7 receptor alpha.

-Figure 3D source data = Superimposition of Neo-7 structures (with or without an additional disulfide bridge) predicted by AlphaFold.

-Figure4A-source data 1.pdb = PDB file containing the 3D structure of the Neo-7 Q6P and T45I mutants in complex with IL-7 receptor alpha, modeled by AlphaFold-Multimer.

-Figure4A-source data 2.dsv = Discovery Studio file showing the molecular interactions of the Q6P mutation with IL-7 receptor alpha, modeled by AlphaFold-Multimer.

-Figure4A-source data 3.dsv = Discovery Studio file showing the molecular interactions of the T45I mutation with IL-7 receptor alpha, modeled by AlphaFold-Multimer.

Yeast display flow cytometry data:

Yeast display data of IL-7 and Neo-7 variants against IL7R-alpha and IL2R-Gamma (pre-binding to IL7R-alpha is required for IL2R-G binding). All data are provided in the standard flow cytometry file format (.fcs). Readers are encouraged to use FlowJo to open the workspace file for closer inspection of the raw flow cytometry data and the gating strategies.
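For readers without FlowJo, the fixed-width header at the start of an .fcs file can be inspected with a few lines of Python. The sketch below assumes the standard FCS 3.x header layout (a 6-byte version string, 4 spaces, then six right-justified 8-byte ASCII offsets); the function names are hypothetical, and the example bytes are synthetic rather than taken from this dataset. For very large files the header offsets may be 0, with the real values stored in the TEXT segment, which this sketch does not handle.

```python
def parse_fcs_header(header: bytes) -> dict:
    """Parse the fixed 58-byte FCS header into a version string and segment offsets."""
    if len(header) < 58:
        raise ValueError("FCS header must be at least 58 bytes")
    version = header[0:6].decode("ascii")
    if not version.startswith("FCS"):
        raise ValueError(f"not an FCS file: {version!r}")

    def field(start: int) -> int:
        # Each offset is an 8-byte, space-padded ASCII integer; blank means absent.
        text = header[start:start + 8].decode("ascii").strip()
        return int(text) if text else 0

    names = ["text_start", "text_end", "data_start",
             "data_end", "analysis_start", "analysis_end"]
    offsets = {name: field(10 + 8 * i) for i, name in enumerate(names)}
    return {"version": version, **offsets}


def read_fcs_header(path: str) -> dict:
    """Read and parse the header of an .fcs file on disk."""
    with open(path, "rb") as handle:
        return parse_fcs_header(handle.read(58))


# Synthetic header for illustration only (not from the dataset).
example = b"FCS3.0" + b"    " + b"".join(
    f"{n:>8}".encode("ascii") for n in (58, 1000, 1001, 5000, 0, 0)
)
print(parse_fcs_header(example))
```

The TEXT segment located by `text_start` and `text_end` holds the keyword/value pairs (channel names, event count) needed to decode the DATA segment; a dedicated FCS reader is the better choice for that step.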

Description:

Yeast display flow-cytometry data:

-m-alpha (murine IL-7 receptor alpha); m-gamma (murine IL-2 family common receptor gamma)

-X-axis = signal intensity corresponding to the expression/display level of the displayed protein

-Y-axis = signal intensity corresponding to the binding of the displayed protein to the indicated receptor.

-Subfolder1: Figure3E-flow cytometry raw data\Figure3E-sourcedata

-m-alpha: yeast display data of displayed IL7/Neo7 using mouse IL7 receptor alpha as antigen

--Sample naming : (064 = Neo7-LD1/ 065 = Neo7-LD1-R5K-T44I-I45T/ 075 50a = Neo7-LD2-R5K-T44I-I45T/ 076 50a = Neo7-LD2/ 095A = Neo7-LD1-Disulfide stapled-R5K-T44I-I45T/ 096A = Neo7-LD2-Disulfide stapled-R5K-T44I-I45T/ 100A = WT-IL7 with N-terminal Cys truncation/ 147 50 A = Wild type IL7/ yeast IL-7 libraryS0_Group_eby100 s = wild type control stained with antigens and secondary antibody/ yeast IL-7 libraryS0_Group_eby100 2only = wild type control stained with secondary antibody only/ yeast IL-7 libraryS0_Group_eby100 us = wild type unstained control)

-m-gamma : yeast display data of displayed IL7/Neo7 using mouse IL7 receptor alpha and mouse IL2 receptor gamma as antigen

-Figure4-source data mAlpha: yeast display data of displayed IL7/Neo7 using mouse IL7 receptor alpha as antigen

--Sample naming : [ 108A50=Neo7-LD2-Disulfide stapled-R5K-T44I-I45T-(Q6P)/ 109A50=Neo7-LD2-Disulfide stapled-R5K-T44I/ 11A50=Neo7-LD2-Disulfide stapled-R5K-T44I-(Q6P) ]

-Figure4-source data mGamma: yeast display data of displayed IL7/Neo7 using mouse IL2 receptor gamma (g-only) or mouse IL7 receptor alpha and mouse IL2 receptor gamma (g malpha + mgamma) as antigen.

--Sample naming : [108g only=Neo7-LD2-Disulfide stapled-R5K-T44I-I45T-(Q6P) stained with mouse IL2 receptor gamma only/ 109 g only=Neo7-LD2-Disulfide stapled-R5K-T44I stained with mouse IL2 receptor gamma only/ 110 g only=Neo7-LD2-Disulfide stapled-R5K-T44I-(Q6P) stained with mouse IL2 receptor gamma only/ 108 50a g=Neo7-LD2-Disulfide stapled-R5K-T44I-I45T-(Q6P) stained with mouse IL7 receptor alpha and mouse IL2 receptor gamma/ 109 50a g=Neo7-LD2-Disulfide stapled-R5K-T44I stained with mouse IL7 receptor alpha and mouse IL2 receptor gamma/ 110 50a g=Neo7-LD2-Disulfide stapled-R5K-T44I-(Q6P) stained with mouse IL7 receptor alpha and mouse IL2 receptor gamma]

In vitro data:

This section covers the Biacore SPR data and the cell-based data generated in the study. All data are recorded in Excel workbooks, and readers are encouraged to use Microsoft Excel to inspect them.

-Biacore_SPR_wdetails.xlsx

--Sample naming: NEO7-Q6P: Neo7-LD2-Disulfide stapled-R5K-T44I-I45T-(Q6P)/ NEO7-Q6PT45I: Neo7-LD2-Disulfide stapled-R5K-T44I-(Q6P)/ WT-IL7: Wild type IL7

Parameter	Unit	Meaning
ka	1/(M·s)	Association rate constant (binding speed)
kd	1/s	Dissociation rate constant (complex stability)
KD	M	Equilibrium dissociation constant (affinity = kd/ka)
Rmax	RU	Maximum theoretical response if all sites bound
Conc	M	Injected analyte concentration series
tc	s	Contact time during association phase
Flow	µL/min	Sample flow rate through sensor chip
kt	RU/(M·s)	Mass transport coefficient (diffusion rate to surface)
RI	RU	Refractive index shift from buffer mismatch
Chi²	RU²	Fit quality indicator (deviation between data and model)
U-value	—	Randomness of residuals (model validity check)
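The relation KD = kd/ka from the table can be checked directly against the fitted rate constants. The following is a minimal sketch: the function names are hypothetical, the numbers are placeholders rather than values from Biacore_SPR_wdetails.xlsx, and the steady-state response formula assumes a simple 1:1 binding model, which this README does not state was the model used for fitting.

```python
def equilibrium_dissociation_constant(ka: float, kd: float) -> float:
    """Return KD in M from ka in 1/(M*s) and kd in 1/s."""
    if ka <= 0:
        raise ValueError("ka must be positive")
    return kd / ka


def steady_state_response(conc: float, kd_eq: float, rmax: float) -> float:
    """Expected equilibrium response in RU under a 1:1 binding model (assumption)."""
    return rmax * conc / (kd_eq + conc)


# Placeholder values for illustration only.
ka = 1.0e5   # 1/(M*s)
kd = 1.0e-3  # 1/s
KD = equilibrium_dissociation_constant(ka, kd)
print(f"KD = {KD:.2e} M")

# At an analyte concentration equal to KD, half of Rmax is expected.
print(steady_state_response(conc=KD, kd_eq=KD, rmax=100.0))
```

A lower KD indicates tighter binding, so comparing the KD reported for each Neo-7 variant with that of WT-IL7 gives the fold change in affinity.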

In vitro cell-based data

--Tab Figure 5H 2E8 cell: 2E8 cells, data collected using CCK8 assay (Absorbance at 460 nm).
--Sample naming: NEO7-Q6P: Neo7-LD2-Disulfide stapled-R5K-T44I-I45T-(Q6P)/ NEO7-Q6PT45I: Neo7-LD2-Disulfide stapled-R5K-T44I-(Q6P)/ IL7-WT: Wild type IL7
--Tab Figure 6D-E spleenocyte : Data expressed as count of cells per 100 µL of cell culture.
--Sample naming: Isotype (FC only) control: Fc only recombinant protein/ FC-WT-Il7: Fc-wild type IL7 fusion/ Fc-NEO7-Q6P: Fc-Neo7-LD2-Disulfide stapled-R5K-T44I-I45T-(Q6P) fusion/ Fc-NEO7-Q6PT45I: Fc-Neo7-LD2-Disulfide stapled-R5K-T44I-(Q6P) fusion.

In vivo data:

The in vivo data cover all data obtained from mice. For GraphPad Prism 8 XML files, readers are recommended to view the data using GraphPad Prism (version 8 or above). For the RNA sequencing data, the source data for heat map generation are included in Excel format. The full RNA-seq report is included as an HTML document, which readers can open in Google Chrome.

In vivo data:

Figure 6F-I N7 variants on mice PBMC: data expressed as count of cells per µL of mouse peripheral blood

Figure7A tumor growth curve: tumor volumes are expressed in mm³, and body weights are expressed in grams.

Figure7B-C TILs analysis: data expressed as percentage of the indicated cell population per total cells analyzed.

RNA-seq report:

-DGE full table.xlsx: all genes analyzed for differential gene expression in the RNA-seq data

-Figure8B source data NEO7_RNASEQ_GENEONTOLOGY_LOG2FOLD.pzfx: Top 10 annotated gene ontology terms that are differentially expressed when comparing the Neo-7 group to the isotype control group, the Neo-7 group to the wild type IL7 group, and the wild type IL7 group to the isotype control group. Neo-7 corresponds to the following variant: Neo7-LD2-Disulfide stapled-R5K-T44I-(Q6P).

-Figure8C RNA seq Heatmap source data.xlsx : Source data for heat map generation using Z-scores.

--Sample name: Isotype (IST) -1/2/3 ; Neo7-Double mutation(N7)-1/2/3; Wild type interleukin-7(I7)-1/2/3

--Tab: DGE-WT-IST (differential gene expression raw data comparing WT-IL7 and isotype control); DGE-N7-WT (differential gene expression raw data comparing Neo-7 and WT-IL7)

Column	Meaning
`row.names`	Gene identifier (e.g., gene symbol or Ensembl ID)
`baseMean`	Mean normalized expression of the gene across all samples
`log2FoldChange`	Estimated log₂ fold change between conditions (e.g., treated vs control)
`lfcSE`	Standard error of the log₂ fold change
`pvalue`	Wald test p-value for the null hypothesis (log₂FC = 0)
`padj`	Adjusted p-value (Benjamini–Hochberg FDR correction)
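The relationship between `pvalue` and `padj` in the table above can be illustrated with a plain-Python Benjamini-Hochberg adjustment followed by a simple significance filter. This is a sketch under stated assumptions: the function names, the example rows, and the cutoffs (`padj` < 0.05, |log₂FC| ≥ 1) are hypothetical and not taken from the dataset. The column names resemble the usual DESeq2 results layout, although this README does not name the tool; if DESeq2 was used, its independent filtering step can leave some `padj` values missing, so values recomputed this way may not match the deposited ones exactly.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(rows, padj_cutoff=0.05, min_abs_log2fc=1.0):
    """Return identifiers of rows passing both cutoffs; rows with missing values are skipped."""
    hits = []
    for row in rows:
        padj = row.get("padj")
        lfc = row.get("log2FoldChange")
        if padj is None or lfc is None:
            continue
        if padj < padj_cutoff and abs(lfc) >= min_abs_log2fc:
            hits.append(row["row.names"])
    return hits


# Hypothetical rows for illustration only.
example_rows = [
    {"row.names": "GeneA", "log2FoldChange": 2.1, "pvalue": 0.0005},
    {"row.names": "GeneB", "log2FoldChange": -0.3, "pvalue": 0.01},
    {"row.names": "GeneC", "log2FoldChange": -1.8, "pvalue": 0.02},
    {"row.names": "GeneD", "log2FoldChange": 1.2, "pvalue": 0.6},
]
for row, padj in zip(example_rows, benjamini_hochberg([r["pvalue"] for r in example_rows])):
    row["padj"] = padj
print(significant_genes(example_rows))
```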

Gene ontology enrichment analysis from DAVID:

Column	Meaning
Sublist	The subset of genes analyzed (e.g., upregulated, downregulated, or user-defined list).
Category	The database or ontology category, such as GOTERM_BP_DIRECT, KEGG_PATHWAY, INTERPRO, etc.
Term	The specific biological term or pathway name (e.g., “cell cycle,” “oxidative phosphorylation”).
RT	(Optional field — in DAVID this sometimes means “Record Type” or an internal code used for referencing the database term.)
Count	Number of genes from your sublist associated with this term.
%	The percentage of your input genes associated with this term (`Count / total genes × 100`).
P-Value	Raw enrichment p-value (usually from a hypergeometric or Fisher’s exact test).
Benjamini	Benjamini–Hochberg–corrected p-value (false discovery rate, FDR). This column is the one to use for assessing significance (typically < 0.05).

--Tab: DGE-N7-IST (differential gene expression raw data comparing Neo-7 and isotype control); DGE-FC-WT-N7 (differential gene expression comparison of isotype control, wild type IL7 and Neo-7 using log2 fold change); Diff-gene (transformation of data from the DGE-FC-WT-N7 tab for Z-score calculation; the calculation formula is shown below); Cell cycle (genes related to cell cycle); EFF (genes related to immune effector effects); EXINB (genes related to immune exhaustion inhibition); metabolism (genes related to T-cell metabolism); heatmap (heat map generated based on RNA expression profile); heatmap all (organized heat map generated based on RNA expression profile).

Z-scores were calculated from TPM-normalized RNA-seq data as Z = (X − μ)/σ, where:

X = TPM of the gene in a given sample
μ = mean TPM of the gene across all samples
σ = standard deviation of the gene's TPM values across all samples
Z>0: higher-than-average expression of that gene in that sample
Z<0: lower-than-average expression
Z=0: expression equals the dataset mean for that gene
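
The per-gene Z-score definition above can be sketched in a few lines of Python. The function name and the TPM values are hypothetical. The README does not say whether the sample or the population standard deviation was used; this sketch assumes the sample standard deviation (`statistics.stdev`), so results may differ slightly from the deposited values if the other convention was applied.

```python
from statistics import mean, stdev


def gene_zscores(tpm_values):
    """Return Z = (X - mu) / sigma for one gene's TPM values across samples."""
    if len(tpm_values) < 2:
        raise ValueError("need at least two samples")
    mu = mean(tpm_values)
    sigma = stdev(tpm_values)  # sample standard deviation (assumption)
    if sigma == 0:
        # A gene with identical TPM in every sample has no variation to scale.
        return [0.0 for _ in tpm_values]
    return [(x - mu) / sigma for x in tpm_values]


# Hypothetical TPM values for a single gene across three samples.
print(gene_zscores([10.0, 20.0, 30.0]))
```

Because each gene is scaled by its own mean and standard deviation, Z-scores are comparable across samples within a gene but do not indicate which genes are more highly expressed in absolute terms.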

Column	Meaning
Geneid	Unique gene identifier from annotation file (e.g., GTF).
Symbol	Standard gene symbol (e.g., IL7R).
Description	Brief gene function/annotation.
EntrezID	NCBI Entrez Gene ID.
Counts_*	Raw read counts per sample (integer values before normalization).
TPM_*	“Transcripts Per Million” — normalized expression for comparability across samples.
log2FC_Neo7_DM_Isotype_control	Log₂ fold change between Neo-7 DM and Isotype Control groups.
log2FC_Neo7_DM_WT_IL7	Log₂ fold change between Neo-7 DM and WT IL-7 groups.
log2FC_WT_IL7_Isotype_control	Log₂ fold change between WT IL-7 and Isotype Control groups.
padj_Neo7_DM_Isotype_control	Adjusted p-value (FDR-corrected) for Neo-7 DM vs Isotype Control.
padj_Neo7_DM_WT_IL7	Adjusted p-value for Neo-7 DM vs WT IL-7.
padj_WT_IL7_Isotype_control	Adjusted p-value for WT IL-7 vs Isotype Control.
pvalue_Neo7_DM_Isotype_control	Raw p-value (unadjusted) for Neo-7 DM vs Isotype Control.
pvalue_Neo7_DM_WT_IL7	Raw p-value for Neo-7 DM vs WT IL-7.
pvalue_WT_IL7_Isotype_control	Raw p-value for WT IL-7 vs Isotype Control.

multiqc_report.html: The full RNA-seq report in HTML format. Readers can open this file in any web browser, such as Google Chrome.

Code/software

-PyMOL session file (.pse): PyMOL (free for academic use)

-BIOVIA Discovery Studio file (.dsv): BIOVIA Discovery Studio (free version visualizer)

-Microsoft Excel file (.xlsx): Microsoft Excel

-GraphPad Prism 8 XML file: GraphPad Prism (version 8 and above)

-Flow cytometry (.fcs) and FlowJo workspace (.wsp) files: FlowJo

-Chrome HTML file: any browser

Access information

Other publicly accessible locations of the data:

Data was derived from the following sources: