Dataset for genome-wide profiling of autophagy dynamics under nutrient availability in Saccharomyces cerevisiae

Chica, Nathalia1; Andersen, Aram N.1; Orellana-Muñoz, Sara1; Garcia, Ignacio1; Nguéa P, Aurélie1; Nakken, Sigve1; Ayuda-Durán, Pilar1; Håkensbakken, Linda1; Schultz, Sebastian W.1; Rødningen, Eline1; Putnam, Christopher D.2; Zucknick, Manuela3; Rusten, Tor Erik1,3; Enserink, Jorrit M.1

Research facility: Oslo University Hospital

Published Dec 06, 2025 on Dryad. https://doi.org/10.5061/dryad.cfxpnvxdh

Abstract

This dataset provides the complete quantitative data supporting all main and extended figures of the study “Time-resolved functional genomics using deep learning reveals global hierarchical control of autophagy”. It is part of the AutoDRY resource, a systems-level map of the genetic network controlling activation and inactivation of autophagy in response to nitrogen availability in yeast. The dataset includes time-resolved measurements of autophagy across 5,919 mutants, derived from high-content fluorescent imaging and automated image analysis using deep learning. Data were further processed through UMAP latent-space embedding and Bayesian factor analysis to infer contributions to autophagosome formation and clearance. The files include autophagy quantifications, statistical analyses, network analyses, and cross-omics integration using random forest modeling. Experimental validation data, including quantitative Pho8Δ60 assays, GFP-Atg8 and Ape1 processing, and qPCR analyses, are also provided. Together, these data enable full reproducibility of the study’s analyses and support reuse for further computational analysis or comparative studies of autophagy regulation.

File List and Descriptions

Data_File_S1.xlsx

Genome-wide libraries, strains, and primer sequences used in this study

This file contains multiple tabs that are divided into:

Genome-wide libraries and sub-libraries

YKO_A_library: Yeast haploid deletion collection (KO). List of nonessential deletion alleles screened in this study and control strains - crossed with JEY11511 (BG Y7092)

DAmP_A_library: Yeast haploid DAmP collection (D). List of essential DAmP alleles screened in this study and control strains - crossed with JE11511 (BG Y7092)

Mixed_library: Rearranged plates (Mix). List of nonessential and essential deletion strains rearranged from rows A of the YKO-A and DAmP-A plates to a new plate and position

Recovery_library: Recovery plates (Rec). List of KO and DAmP strains whose cell counts were too low during the initial screening and that were re-inoculated and subjected to an extra round of screening

Repetition_library: Repetition plates (Rep). List of mutants with a statistically significant perturbation in autophagy or potential false negatives/positives that were subjected to a second measurement

Rapamycin_library: Rapamycin plates (Rapa). List of mutants subjected to rapamycin treatment and control treatment

Other libraries, strains, primers, and plasmids

● Strains_experimental purpose: List of alleles that were deleted in this study to create the reproducibility set and screen controls, and to perform autophagy assays

● Primers: Sequences of oligonucleotides used in this study

● Plasmids: List of plasmids used in this study

Genome-wide libraries are provided in tab-delimited format with the following columns

● ORF name

● Gene Name

● Plate

● Col

● Row

● Type

● Background: yeast strain background

● Source: Number in the JE collection or plate in the original KO-A or DAmP-A libraries

● Position source: Position in the original KO-A or DAmP-A libraries

● Conditions: screening conditions tested

Strains

● Number

● Genotype

● Source

● Experimental Purpose

Primers

● Name

● Sequence

● Provider

Plasmids

● Name

● Source

Data_File_S2.xlsx

Autophagy_predictions_Statistics_Clustering

This file contains multiple tabs:

Autophagy % DNN predictions: autophagy responses for the genome-wide (GW) screen

● Time: time point; TimeR: adjusted real time of image capture

● P1_30, P1_22: expected probability of autophagy, x100 (%)

● P1_binary_30, P1_binary_22: autophagy classification %

● P_err_30, P_err_22: expected standard deviation, x100

● LO_30, LO_22: expected log-odds

● LO_err_30, LO_err_22: standard deviation in log-odds

● NCells: cell counts
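The probability (P1) and log-odds (LO) columns are related through the logit transform. The following is a minimal sketch of that conversion, assuming natural-log odds and P1 expressed in percent; the README does not state the log base, and because LO is an expected log-odds it need not equal the logit of the expected probability exactly. The function names are illustrative and not part of the dataset.

```python
import math

def percent_to_log_odds(p_percent, eps=1e-9):
    """Convert a probability in percent to natural-log odds (assumed base)."""
    # Clamp away from 0 and 1 so the logarithm stays finite.
    p = min(max(p_percent / 100.0, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def log_odds_to_percent(log_odds):
    """Inverse transform: natural-log odds back to a probability in percent."""
    return 100.0 / (1.0 + math.exp(-log_odds))
```

For example, a P1 value of 50 corresponds to log-odds of 0 under these assumptions.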

Autophagy % curvefits: double sigmoidal curvefits of autophagy response % for the GW screen

Goodness-of-fit measures: RMSE and log-likelihood of double sigmoidal curvefits for the GW screen

Parameter statistics: perturbation statistics for response kinetic parameters computed from double sigmoidal models

● Value: estimated kinetic parameter value

● Perturbation: change in parameter values from plate medians

● p_value, p_value_2t: one-tailed and two-tailed p-values

● FDR_BH: FDR-adjusted p-values using the Benjamini-Hochberg procedure

● w: HMP weights

● HMP: harmonic mean p-value per mutant

● FDR_min: minimum FDR-adjusted p-value per mutant
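For readers reusing these columns, the two multiple-testing summaries can be sketched as follows. This is an illustration of the standard Benjamini-Hochberg adjustment and of a weighted harmonic mean p-value, not the authors' code: how the weights w were chosen and whether the HMP was further calibrated are not specified here, and the function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def harmonic_mean_p(p_values, weights=None):
    """Weighted harmonic mean of p-values; equal weights if none are given."""
    if weights is None:
        weights = [1.0] * len(p_values)
    return sum(weights) / sum(w / p for w, p in zip(weights, p_values))
```

Under this reading, FDR_min for a mutant would be the minimum of its FDR_BH values across parameters.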

Parameter controls: kinetic parameter value, perturbations from plate medians, and plate-wise adjusted errors for the plate controls

Mutant profiles: clustered matrix of significant mutants, with classification of mutant profiles

Parameter groups: clustering results of kinetic parameters

Data_File_S3.xlsx

GSEA GO_kinetic parameter and HMP enrichment

This file contains multiple tabs:

Parameter GSEA_GO-BP results: GSEA results for GO-BP terms over the perturbation statistics of each parameter

Parameter GSEA_BO-BP_matrix: matrix of signed, -log-transformed enrichment p-values

HMP GSEA_GO_results_emap: GSEA results for GO terms over the HMP scores with enrichment map coordinates and autophagy perturbation profile composition per node

HMP GSEA_GO_results_emap: enrichment map network coordinates

Data_File_S4.xlsx

SAFE analysis of the global yeast genetic similarity and the STRING network (v10) cutoff 990

This file contains multiple tabs that are divided into:

Node and Edge column data of the genetic similarity network by Costanzo et al., 2016 were obtained from the Cytoscape session 'Yeast Genetic Interactions' (version 3.9.1)

Costanzo2016_node: This includes SAFE enrichment score per node (gene)

Costanzo2016_edge: This includes gene-pair interactions and the correlation value for each interaction

SAFE analysis on the yeast genetic similarity network (Costanzo et al., 2016)

Costanzo_SAFE_node

Costanzo_SAFE_domain

Costanzo_SAFE_attribute

Costanzo_SAFE_autophagy profiles

Node and Edge column data from the STRING network based on a combined score ≥ 990 in the STRING v10 database

STRING 990_node: This includes SAFE enrichment score per node (gene)

STRING 990_edge: This includes gene-pair interactions and the correlation value for each interaction

SAFE analysis on the yeast STRING network 990 (v10)

STRING 990_SAFE_node

STRING 990_SAFE_domain

STRING 990_SAFE_attribute

STRING 990_SAFE_autophagy profiles

Costanzo and STRING SAFE annotations are provided in tab-delimited format with the following columns

Node

● Node label

● Node label ORF

● Domain (predominant): This indicates the predominant domain (group) defined by SAFE score

● Neighborhood score [max=1, min=0] (predominant): This indicates the confidence level of node annotation towards the predominant group from 0 to 1

● Total number of enriched domains: This represents the count of enriched GO term attributes corresponding to each domain (group) for each node.

● Number of enriched attributes per domain: The domain numbers are separated by commas.

Domain

● Domain number: This matches the number of the predominant domain group in the "Node" tab.

● Domain name: Indicates a GO term that most accurately characterizes each group within the predominant domain.

● RGB code: This highlights the domain in the network.

Attribute

● Attribute Id: GO term number

● Attribute name: GO term name

● Domain Id: This matches the domain number in the "Domain" tab

Profiles

● name

● 1: Ultrasensitive

● 2: Hyposensitive

● 3: Hyperactive

● 4: Insufficient activation

● 5: Failed response

● 6: Null response

● ATG core

● FUSION core

Distance stats

Mean and SE of -log10 HMP, along with number of significant mutants, per path distance to core in different networks

Shortest path

Mean and SE of shortest path length per profile in different networks

Int. cor. Stats

Statistics of autophagy phenotype correlations within groups of gene pairs that share an edge in the network or between genes that do not share an edge

Fisher's stats

Number of positive and negative SGA-PCC similarities per correlation interval for pairs of significant genes (HMP ≤ 0.001), with enrichment scores and corresponding p-values from the directional Fisher's exact test.

SGA-PCC_GSEA stats

Statistics of two-sided t-tests of NES scores per mutant profile over ATG or Fusion SGA-PCCs

SGA-PCC_GSEA group stats

GSEA statistics per mutant profile against ATG and Fusion gene SGA-PCCs

Distance per profile

Frequency distribution and enrichment of mutant profiles per causal path length to the ATG core in the Costanzo 2016 (SGA-PCC) network. pvalue indicates the significance of a two-sided Fisher’s exact test

Data_File_S5.xlsx

Autophagy perturbations across interacting protein modules spanning three regulatory network hierarchies

This file contains multiple tabs that are divided into:

Autophagy regulatory network

Autophagy_node

Autophagy_edge

RNA metabolism and translation

RNA_translation_node

RNA_translation_edge

Gene expression regulation

Gene_expression_node

Gene_expression_edge

Each module network is provided in tab-delimited format with the following columns

● Node

● Shared name/ Name

● ORF

● Profile: Autophagy perturbations identified in our study

● Cluster: Curated complexes identified through either the Complex Portal or the application of the MCODE algorithm

● HMP: (-log10)

● Perturbation_starvation: (signed -log10 p-value)

● Perturbation_replenishment: (signed -log10 p-value)

● Perturbation_overall: (signed -log10 p-value)

● slope1: (signed -log10 p-value)

● slope1Param: (signed -log10 p-value)

● slope2: (signed -log10 p-value)

● slope2Param: (signed -log10 p-value)

● A_start: (signed -log10 p-value)

● A_max: (signed -log10 p-value)

● A_final: (signed -log10 p-value)

● T50_1: (signed -log10 p-value)

● T50_2: (signed -log10 p-value)

● T_lag_1: (signed -log10 p-value)

● T_lag_2: (signed -log10 p-value)

● T_final_1: (signed -log10 p-value)

● T_final_2: (signed -log10 p-value)

● Dynamic_range_1: (signed -log10 p-value)

● Dynamic_range_2: (signed -log10 p-value)
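The signed -log10 p-value columns combine the direction of a perturbation with its significance. A minimal sketch of that encoding, assuming the sign is taken from the direction of the perturbation relative to the plate median (an assumption; the README does not spell out the sign convention) and using a hypothetical function name:

```python
import math

def signed_neg_log10_p(p_value, perturbation, min_p=1e-300):
    """Return -log10(p) carrying the sign of the perturbation."""
    # Sign is +1, -1, or 0 depending on the direction of the perturbation.
    sign = (perturbation > 0) - (perturbation < 0)
    # Floor the p-value so that p = 0 does not raise a math domain error.
    return sign * -math.log10(max(p_value, min_p))
```

With this encoding, large positive values indicate significant increases and large negative values indicate significant decreases.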

Edge

● interaction/shared interaction

● name/shared name: gene-pair interaction

Data_File_S6.xlsx

Latent space statistics

This file contains multiple tabs:

UMAP statistics and entropy: summary statistics of UMAP dynamics and noise and average prediction uncertainty per mutant

● NCells: average number of cells per time point

● UMAP_30_SD, UMAP_22_SD: average time-wise bivariate standard deviation for UMAP populations

● UMAP_30_flux, UMAP_22_flux: average population UMAP flux

● UMAP_30_flux_norm, UMAP_22_flux_norm: SD-normalized average population UMAP flux

● Entropy_30, Entropy_22: average binary entropy for DNN predictions
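The entropy columns summarize prediction uncertainty. The binary entropy of a predicted probability can be sketched as below, assuming base-2 logarithms (bits) and a simple average over cells; neither the log base nor the exact averaging scheme is stated in this README, and the function names are illustrative.

```python
import math

def binary_entropy(p):
    """Binary (Bernoulli) entropy in bits for a probability p in [0, 1]."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # A fully certain prediction carries no entropy.
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def mean_binary_entropy(probabilities):
    """Average binary entropy over a collection of per-cell probabilities."""
    return sum(binary_entropy(p) for p in probabilities) / len(probabilities)
```

Entropy is highest (1 bit) at p = 0.5 and falls to 0 as predictions approach 0 or 1.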

LS and UMAP statistics: population standard deviation and displacement (flux) per time point for latent space vectors and UMAP coordinates

Data_File_S7.xlsx

Autophagy Bayes factors

This file contains multiple tabs:

Bayes factors overall

Bayes factors -N

Bayes factors +N

Bayes factors per time

Column information

● log_BF_WT.ATG1_30, log_BF_WT.VAM6_30, log_BF_VAM6.ATG1_30, log_BF_WT.ATG1_22, log_BF_WT.VAM6_22, log_BF_VAM6.ATG1_22: Time-invariant Bayes factors computed from time-wise reference kernel densities, averaged overall or per phase (-N, +N)

● log_BFt_WT.ATG1_30, log_BFt_WT.VAM6_30, log_BFt_VAM6.ATG1_30, log_BFt_WT.ATG1_22, log_BFt_WT.VAM6_22, log_BFt_VAM6.ATG1_22: Time-wise Bayes factors computed from fixed reference density kernels, averaged overall or per phase (-N, +N), or averaged per time point

● NCells: total number of cells

● P1_adj_30, P1_adj_22, P1_binary_adj_30, P1_binary_adj_22: plate median adjusted average autophagy % response or average binarized % class predictions

● Red, Green, GxR: mean fluorescent signal or co-occurrence per mutant early in the starvation protocol

● qRed, qGreen: outlier cutoff for mean fluorescent signal per mutant early in the starvation protocol

● Red.late, Green.late, GxR.late: mean fluorescent signal or co-occurrence per mutant late in the starvation protocol

● Bayes factors overall: Additional columns

● qoutliers_red, qoutliers_green: mutant flagged as an outlier if Red < qRed or Green < qGreen, respectively (outlier cutoffs), in the Bayes factors overall tab
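Conceptually, a column such as log_BF_WT.ATG1 compares how well two reference densities explain a mutant's cells in UMAP space. The sketch below illustrates the idea with a hand-written isotropic Gaussian kernel density and a mean log-likelihood ratio. It is an assumption-laden illustration only: the study's kernels were estimated with the ks package in R, and the bandwidth, log base, and averaging used there are not specified in this README. All names are hypothetical.

```python
import math

def gaussian_kde_2d(points, bandwidth):
    """Build a 2-D density function from reference points (isotropic Gaussian kernel)."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = sum(
            math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * bandwidth ** 2))
            for px, py in points
        )
        return norm * total

    return density

def mean_log_bayes_factor(cells, density_a, density_b, floor=1e-300):
    """Mean log-likelihood ratio (state A over state B) across single cells."""
    # Floor the densities so that log(0) cannot occur far from the references.
    logs = [
        math.log(max(density_a(x, y), floor)) - math.log(max(density_b(x, y), floor))
        for x, y in cells
    ]
    return sum(logs) / len(logs)
```

A positive value means the cells resemble reference state A more than reference state B; swapping the two densities flips the sign.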

Data_File_S8.xlsx

Bayes factors GSE and z-scores

This file contains multiple tabs:

BF GSE overall: parametric GO enrichment statistics (using piano) for BFt z-scores

BF GSE phase comp.: parametric GO enrichment statistics (using piano) for BF z-scores per phase or differential z-score between phases (starvation - replenishment)

BF z-scores overall: z-scores of log Bayes factors averaged between models 30 and 22, overall

BF z-scores phase comp.: z-scores for log Bayes factors averaged between models 30 and 22, per phase and differential between phases (starvation - replenishment)

Data_File_S9.xlsx

Nearest neighbour and GO selection

This file contains multiple tabs:

Nearest neighbour selection: Dataset of summary statistics from the nearest neighbor enrichment of autophagy phenotypes using different yeast networks

● HMP, FDR_min_pert: mutant autophagy phenotype statistics

● HMP_Fisher: summary statistics for nearest neighbor autophagy phenotype enrichment

● pmin_Fisher, pmin.adj_Fisher: minimum p-value and adjusted p-value for nearest neighbor autophagy phenotype enrichment

● Data, Set, Count: Network, phenotype set, and nearest neighbor count with the most significant enrichment

GO selection: Autophagy GO term annotation of mutants

● HMP, FDR_min_pert: mutant autophagy phenotype statistics

● N, Description: Number and name of GO terms per mutant

Data_File_S10.xlsx

Autophagy_reproducibility_validation

This file contains multiple tabs:

Repetition responses: Autophagy response % for repetition screen mutants with corresponding GW screen measurements

Validation responses: Autophagy response % for validation screen mutants with corresponding GW screen measurements

Validation statistics: Double sigmoidal regression statistics for validation screen mutants

Data_File_S11.xlsx

Rapamycin screen results

This file contains multiple tabs:

Autophagy % DNN predictions: autophagy response % for rapamycin screen mutants under the different treatment conditions

UMAP Bayes factors: overall time-wise UMAP Bayes factors for rapamycin screen mutants under the different treatment conditions.

Data_File_S12.xlsx

WB, Pho8d60 statistics, EM area, APE1 assay

This file contains multiple tabs:

GFP measures: GFP cleavage % computation for all experiments. Band % is computed for two replicate recordings (different exposure times) and averaged.

GFP quant.: GFP cleavage % quantifications across biological replicas

GFP stats._RTGs: GFP cleavage statistics for RTG experiments using mixed effects models with replica ID as the grouping variable. The table shows analysis results from two experiments, distinguished by Experiment_id. Contrasts are between specific conditions, and the baseline reference is indicated.

GFP_BF comp.: comparing average GFP cleavage (0-8 hrs) with overall Bayes factors from the GW screen.

GFP_BF time comp.: comparing time-wise GFP cleavage with Bayes factors.

GFP_parameter comp.: comparing average GFP cleavage (0-8 hrs) with response kinetic parameters from the GW screen.

Rps6_Act1 measures: Rps6-p and Act1 band volume integrals. For Rps6 the values were averaged across two replicate recordings (different exposure times).

Rps6 quant.: quantification of normalized Rps6-p signal across biological replicas. GFP cleavage % for equivalent experiments are also shown.

Rps6_BF comp.: comparing Rps6-p with GFP cleavage and overall Bayes factors from the GW screen.

Pho8d60 measures WTvsMut: computation of normalized Pho8d60 activities for all experiments.

Pho8d60 stats. WTvsMut: statistics of RTG effects on normalized Pho8d60 activities using ANOVA and TukeyHSD test. The adjusted p-values are shown for pairwise comparisons between the mutants and the negative control (WT) for each treatment phase.

Pho8d60 measures DKOvsRTG1: computation of normalized Pho8d60 activities for WT, RTG1 deletion, ATG1 deletion and DKO mutants.

Pho8d60 stats DKOvsRTG1: statistics of RTG and ATG1 effects on normalized Pho8d60 activities using ANOVA and TukeyHSD test. The adjusted p-values are shown for pairwise comparisons between the mutants and the negative control (WT) for each treatment phase.

EM AB area: autophagic body (AB) area quantifications in electron micrographs.

Ape1 WB quant.: quantifications of normalized Ape1 band volume integrals relative to Act1 controls across biological replicas. Quantifications of Ape1 processing % are also shown. For each metric, a mean and standard error are computed.

Data_File_S13.xlsx

RT-qPCR statistics

This file contains multiple tabs:

Input data: computation of normalized expression values for all experiments and application of ComBat batch correction.

Statistics pairwise: t-test statistics for pairwise comparisons between mutants and negative controls (WT) for each time point.

Statistics per phase: statistics for comparing overall mutant effects per treatment phase (+N T0, -N, and +N).

Data_File_S14.xlsx

RF_importance scores

This file contains multiple tabs:

Importance scores: sumGain values for features predicting TF autophagy responses. Input represents the input dataset that was used; Data indicates whether the sample labels were randomized (control).

SHAP scores: average SHAP values computed for input values representing up-regulation or down-regulation of gene expression for a specific feature.

SHAP stats: summary statistics of SHAP values, including the mean and standard error for upregulated, downregulated, and differential gene expression across the different input datasets, autophagy responses, and stages.

Data_File_S15.zip

Support data (data folder)

This folder contains the .RData files containing compressed data for various analyses performed in the study.

Training data: Curated reference dataset used for optimization and training of DNN models. The dataset has been scaled. The subset of the dataset used for training the DNN models is indicated as a logical column vector, and the state labels for this subset are given in a separate column (0 = no autophagy; 1 = autophagy). Vectors of the feature names, and mean and sd scale factors are also included.

Training features and scale factors: Contains vectors of the feature names, and mean and sd scale factors.

Network data: Contains datasets with the network data used in the study.

● df_BioGRID_PPI: protein-protein interactions. Oughtred et al. 2016

● df_complexes: parsed complexome annotation data. Meldal et al. 2021

● df_SGA_PCC: complete genetic similarity matrix from Costanzo et al. 2016

● df_SGA_IDs: dataset for mapping alleles to ORFs in Costanzo et al. 2016

● df_STRING_v10: STRING version 10 with protein-protein interactions predicted by a combined_score ≥ 200. Szklarczyk et al. 2015

● ORF2SYMBOL

● UNIPROT2ORF

Note: NaN in df_SGA_PCC indicates values that were not available in the data as originally provided by Costanzo et al. 2016.

Cross-omics benchmarking datasets: Contains cleaned and matched input omics datasets from various sources and output variables based on autophagy perturbation parameter statistics (signed -log p-values) or Bayes factors:

● X_gex: scaled gene expression data from Kemmeren et al. 2014

● X_prot: differential protein expression levels (log2-fold) from Messner et al. 2023

● X_met: metabolomics dataset of amino acid levels from Mülleder et al. 2016

● X_SGA_PCC: genetic similarity dataset from Costanzo et al. 2016

● X_STRING_PCC: STRING v10 protein similarity dataset computed from the PCC correlations across the binary interaction matrix of STRING interactions with a combined_score ≥ 200. Szklarczyk et al. 2015

● Y_parm: transformed kinetic perturbation p-values

● Y_BF: time-wise Bayes factors (BFt) per mutant overall or per phase

Hackett_imputed data paired: Contains cleaned and matched datasets of adjusted expression dynamics (X; Hackett et al. 2020) and autophagy % or BFt perturbation dynamics, annotated for TFs and matching pseudotimes.

Data_File_S16.zip

Deep learning tools: DNNs for predicting autophagy, pUMAP embedding of autophagic features, and density kernels for UMAP reference states (data folder)

This folder contains the following objects:

TensorFlow HDF5 files of DNNs for classifying autophagic cell states from image features (p=31):

● TF_Model_WT_ID22_May_2021.h5

● TF_Model_WT_ID30_May_2021.h5

TensorFlow HDF5 files of DNN-based fast pUMAP embedding of latent space feature vectors. The models take the last hidden layers from the corresponding autophagy classifiers as inputs and transform them into two-dimensional UMAP coordinates:

● TF_pUMAP_Model_WT_ID22_August_2021.h5

● TF_pUMAP_Model_WT_ID30_August_2021.h5

UMAP density kernels: .RData files containing lists of density kernels for UMAP reference states (WT, VAM6, and ATG1). Density kernels are provided for the output of each pUMAP model, using both fixed and time-wise reference distributions. For likelihood computation, use the dkde function from the ks package in R, with x set to the corresponding UMAP coordinates and fhat set to the specific density kernel.

Data_File_S17.zip

Genome-wide library of single cell autophagy phenotype data (data folder)

This folder contains the .RData files containing compressed single-cell screen data. Each data file contains the results for each 96-well plate divided into four R data frames:

● X.data: Plate, mutant, treatment information, time and well coordinates for every single-cell entry corresponding to rows in the other data frames.

● X: Scaled single-cell image features used as input for the DNN models.

● Y: Predicted autophagy (P1) per cell for each DNN model (22 and 30)

● Y.umap: Predicted UMAP coordinates representing the latent autophagy feature space per cell for each DNN model (22 and 30)

Data_File_S18.xlsx

Vesicle tracking and quantification

This file contains multiple tabs:

RTG measures: table of FIJI object quantifications for vesicles and cells, which have been tracked and assigned unique vesicle and cell trace IDs, respectively. Every vesicle has been assigned to a specific cell. Overlap coefficients per cell and vesicle have been computed.

RTG stat. area: t-test statistics for pairwise comparisons of RTG1 deletion effects on log area (a.u.) in WT and VAM3 deletion mutants.

RTG stat. counts: statistics for autophagosome counts per time point in WT, RTG1, VAM3, and DKO mutants.

R_code_Chica_et_al.zip

This zip contains R scripts for data analysis and figures presented in Chica et al.,
