Dataset for genome-wide profiling of autophagy dynamics under nutrient availability in Saccharomyces cerevisiae
Data files
Dec 06, 2025 version (files: 9.42 GB total)
- Chica_et_al_Data_description_Readme_20251105.rtf (342.56 KB)
- Data_File_S1.xlsx (1.13 MB)
- Data_File_S10.xlsx (1.86 MB)
- Data_File_S11.xlsx (1.51 MB)
- Data_File_S12.xlsx (295.06 KB)
- Data_File_S13.xlsx (175.02 KB)
- Data_File_S14.xlsx (5.23 MB)
- Data_File_S15.zip (931.97 MB)
- Data_File_S16.zip (49.18 MB)
- Data_File_S17.zip (8.29 GB)
- Data_File_S18.xlsx (64.50 MB)
- Data_File_S2.xlsx (52.67 MB)
- Data_File_S3.xlsx (1.25 MB)
- Data_File_S4.xlsx (4.49 MB)
- Data_File_S5.xlsx (298.21 KB)
- Data_File_S6.xlsx (4.01 MB)
- Data_File_S7.xlsx (13.42 MB)
- Data_File_S8.xlsx (2.14 MB)
- Data_File_S9.xlsx (595.88 KB)
- R_code_Chica_et_al.zip (168.17 KB)
- README.md (22.82 KB)
Abstract
This dataset provides the complete quantitative data supporting all main and extended figures of the study “Time-resolved functional genomics using deep learning reveals global hierarchical control of autophagy”. It is part of the AutoDRY resource, a systems-level map of the genetic network controlling activation and inactivation of autophagy in response to nitrogen availability in yeast. The dataset includes time-resolved measurements of autophagy across 5,919 mutants, derived from high-content fluorescent imaging and automated image analysis using deep learning. Data were further processed through UMAP latent-space embedding and Bayesian factor analysis to infer contributions to autophagosome formation and clearance. The files include autophagy quantifications, statistical analyses, network analyses, and cross-omics integration using random forest modeling. Experimental validation data, including quantitative Pho8Δ60 assays, GFP-Atg8 and Ape1 processing, and qPCR analyses, are also provided. Together, these data enable full reproducibility of the study’s analyses and support reuse for further computational analysis or comparative studies of autophagy regulation.
File List and Descriptions
Data_File_S1.xlsx
Genome-wide libraries, strains, and primer sequences used in this study
This file contains multiple tabs that are divided into:
Genome-wide libraries and sub-libraries
YKO_A_library: Yeast haploid deletion collection (KO). List of nonessential deletion alleles screened in this study and control strains - crossed with JEY11511 (BG Y7092)
DAmP_A_library: Yeast haploid DAmP collection (D). List of essential DAmP alleles screened in this study and control strains - crossed with JE11511 (BG Y7092)
Mixed_library: Rearranged plates (Mix). List of nonessential and essential deletion strains rearranged from the rows A of YKO-A and DAmP-A to a new plate and position
Recovery_library: Recovery plates (Rec). List of KO and DAmP strains with too low cell counts during the initial screening, and that were re-inoculated and subjected to an extra round of screening
Repetition_library: Repetition plates (Rep). List of mutants with a statistically significant perturbation in autophagy or potential false negatives/positives that were subjected to a second measurement
Rapamycin_library: Rapamycin plates (Rapa). List of mutants subjected to rapamycin treatment and control treatment
Other libraries, strains, primers, and plasmids
● Strains_experimental purpose: Lists of alleles that were deleted in this study to create the reproducibility set and screen controls, and to perform autophagy assays
● Primers: Sequences of oligonucleotides used in this study
● Plasmids: List of plasmids used in this study
Genome-wide libraries are provided in tab-delimited format with the following columns:
● ORF name
● Gene Name
● Plate
● Col
● Row
● Type
● Background: yeast strain background
● Source: Number in the JE collection or plate in the original KO-A or DAmP-A libraries
● Position source: Position in the original KO-A or DAmP-A libraries
● Conditions: screening conditions tested
Strains
● Number
● Genotype
● Source
● Experimental Purpose
Primers
● Name
● Sequence
● Provider
Plasmids
● Name
● Source
Data_File_S2.xlsx
Autophagy_predictions_Statistics_Clustering
This file contains multiple tabs:
Autophagy % DNN predictions: autophagy responses for the genome-wide (GW) screen
● Time: time point; TimeR: adjusted real time of image capture
● P1_30, P1_22: expected probability of autophagy, x100 (%)
● P1_binary_30, P1_binary_22: autophagy classification %
● P_err_30, P_err_22: expected standard deviation, x100
● LO_30, LO_22: expected log-odds
● LO_err_30, LO_err_22: standard deviation in log-odds
● NCells: cell counts
Autophagy % curvefits: double sigmoidal curvefits of autophagy response % for the GW screen
Goodness-of-fit measures: RMSE and log-likelihood of double sigmoidal curvefits for the GW screen
Parameter statistics: perturbation statistics for response kinetic parameters computed from double sigmoidal models
● Value: estimated kinetic parameter value
● Perturbation: change in parameter values from plate medians
● p_value, p_value_2t: one-tailed and two-tailed p-values
● FDR_BH: FDR-adjusted p-values using the Benjamini-Hochberg procedure
● w: HMP weights
● HMP: harmonic mean p-value per mutant
● FDR_min: minimum FDR-adjusted p-value per mutant
Parameter controls: kinetic parameter value, perturbations from plate medians, and plate-wise adjusted errors for the plate controls
Mutant profiles: clustered matrix of significant mutants, with classification of mutant profiles
Parameter groups: clustering results of kinetic parameters
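The HMP and FDR_BH columns in the Parameter statistics tab rest on two standard procedures: the weighted harmonic mean p-value and the Benjamini-Hochberg adjustment. The following is a minimal sketch of both, not the authors' code. The function names and example values are illustrative, and equal weights are assumed when no weights (the w column) are supplied.

```python
def harmonic_mean_p(pvalues, weights=None):
    """Weighted harmonic mean p-value: sum(w) / sum(w / p)."""
    if weights is None:
        # Assumption: equal weights when none are given.
        weights = [1.0 / len(pvalues)] * len(pvalues)
    return sum(weights) / sum(w / p for w, p in zip(weights, pvalues))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    print(harmonic_mean_p([0.01, 0.04]))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Under this sketch, FDR_min would correspond to the smallest adjusted value among a mutant's parameters.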
Data_File_S3.xlsx
GSEA GO_kinetic parameter and HMP enrichment
This file contains multiple tabs:
Parameter GSEA_GO-BP results: GSEA results for GO-BP terms over each parameter perturbation statistics
Parameter GSEA_BO-BP_matrix: matrix of signed and -log-transformed enrichment p-values
HMP GSEA_GO_results_emap: GSEA results for GO terms over the HMP scores with enrichment map coordinates and autophagy perturbation profile composition per node
HMP GSEA_GO_results_emap: enrichment map network coordinates
Data_File_S4.xlsx
SAFE analysis of the global yeast genetic similarity and the STRING network (v10) cutoff 990
This file contains multiple tabs that are divided into:
Node and Edge column data of the genetic similarity network by Costanzo et al., 2016 were obtained from the Cytoscape session 'Yeast Genetic Interactions' (version 3.9.1)
Costanzo2016_node: This includes SAFE enrichment score per node (gene)
Costanzo2016_edge: This includes gene-pair interactions and the correlation value for each interaction
SAFE analysis on the yeast genetic similarity network (Costanzo et al., 2016)
Costanzo_SAFE_node
Costanzo_SAFE_domain
Costanzo_SAFE_attribute
Costanzo_SAFE_autophagy profiles
Node and Edge column data from the STRING network based on a combined score ≥ 990 of the STRING v10 database
STRING 990_node: This includes SAFE enrichment score per node (gene)
STRING 990_edge: This includes gene-pair interactions and the correlation value for each interaction
SAFE analysis on the yeast STRING network 990 (v10)
STRING 990_SAFE_node
STRING 990_SAFE_domain
STRING 990_SAFE_attribute
STRING 990_SAFE_autophagy profiles
Costanzo and STRING SAFE annotations are provided in tab-delimited format with the following columns:
Node
● Node label
● Node label ORF
● Domain (predominant): This indicates the predominant domain (group) defined by SAFE score
● Neighborhood score [max=1, min=0] (predominant): This indicates the confidence level of node annotation towards the predominant group from 0 to 1
● Total number of enriched domains: This represents the count of enriched GO term attributes corresponding to each domain (group) for each node.
● Number of enriched attributes per domain: The domain numbers are separated by commas.
Domain
● Domain number: This matches the number of the predominant domain group in the tab "Node."
● Domain name: Indicates a GO term that most accurately characterizes each group within the predominant domain.
● RGB code: This highlights the domain in the network.
Attribute
● Attribute Id: GO term number
● Attribute name: GO term name
● Domain Id: This matches the domain number in the tab "Domain"
Profiles
● name
● 1: Ultrasensitive
● 2: Hyposensitive
● 3: Hyperactive
● 4: Insufficient activation
● 5: Failed response
● 6: Null response
● ATG core
● FUSION core
Distance stats
Mean and SE of -log10 HMP, along with number of significant mutants, per path distance to core in different networks
Shortest path
Mean and SE of shortest path length per profile in different networks
Int. cor. Stats
Statistics of autophagy phenotype correlations within groups of gene pairs that share an edge in the network, or between genes that do not share an edge
Fisher's stats
Number of positive and negative SGA-PCC similarities per correlation interval for pairs of significant genes (HMP ≤ 0.001) with enrichment scores and corresponding p-values for direction Fisher's exact test.
SGA-PCC_GSEA stats
Statistics of two-sided t-tests of NES scores per mutant profile over ATG or Fusion SGA-PCCs
SGA-PCC_GSEA group stats
GSEA statistics per mutant profile against ATG and Fusion gene SGA-PCCs
Distance per profile
Frequency distribution and enrichment of mutant profiles per causal path length to the ATG core in the Costanzo 2016 (SGA-PCC) network. pvalue indicates the significance of a two-sided Fisher’s exact test
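The Distance stats, Shortest path, and Distance per profile tabs summarize path lengths from genes to the core genes in each network. As an illustration only, the sketch below computes the shortest path length from every node to the nearest core node with a multi-source breadth-first search. It assumes an undirected, unweighted edge list; the study's own path definition may differ, and the gene labels other than ATG1 are placeholders.

```python
from collections import deque


def distance_to_core(edges, core):
    """Shortest path length from each node to the nearest core node (BFS)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # Core nodes present in the network start at distance 0.
    dist = {gene: 0 for gene in core if gene in neighbors}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist  # nodes unreachable from the core are absent


if __name__ == "__main__":
    toy_edges = [("ATG1", "GENE_A"), ("GENE_A", "GENE_B"), ("GENE_C", "GENE_D")]
    print(distance_to_core(toy_edges, {"ATG1"}))
```

Nodes in components that do not contain a core gene are omitted from the result rather than assigned an infinite distance.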
Data_File_S5.xlsx
Autophagy perturbations across interacting protein modules spanning three regulatory network hierarchies
This file contains multiple tabs that are divided into:
Autophagy regulatory network
Autophagy_node
Autophagy_edge
RNA metabolism and translation
RNA_translation_node
RNA_translation_edge
Gene expression regulation
Gene_expression_node
Gene_expression_edge
Each module network is provided in tab-delimited format with the following columns:
● Node
● Shared name/ Name
● ORF
● Profile: Autophagy perturbations identified in our study
● Cluster: Curated complexes identified through either the Complex Portal or the application of the MCODE algorithm
● HMP: (-log10)
● Perturbation_starvation: (signed -log10 p-value)
● Perturbation_replenishment: (signed -log10 p-value)
● Perturbation_overall: (signed -log10 p-value)
● slope1: (signed -log10 p-value)
● slope1Param: (signed -log10 p-value)
● slope2: (signed -log10 p-value)
● slope2Param: (signed -log10 p-value)
● A_start: (signed -log10 p-value)
● A_max: (signed -log10 p-value)
● A_final: (signed -log10 p-value)
● T50_1: (signed -log10 p-value)
● T50_2: (signed -log10 p-value)
● T_lag_1: (signed -log10 p-value)
● T_lag_2: (signed -log10 p-value)
● T_final_1: (signed -log10 p-value)
● T_final_2: (signed -log10 p-value)
● Dynamic_range_1: (signed -log10 p-value)
● Dynamic_range_2: (signed -log10 p-value)
Edge
● interaction/shared interaction
● name/shared name: gene-pair interaction
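Most node columns in this file are signed -log10 p-values. A minimal sketch of that transformation follows, assuming the sign is taken from the direction of the perturbation (positive for an increase, negative for a decrease). The function name is illustrative and is not part of the deposited R code.

```python
import math


def signed_neg_log10_p(p_value, perturbation):
    """Return -log10(p) carrying the sign of the perturbation."""
    if perturbation == 0:
        return 0.0
    magnitude = -math.log10(p_value)
    return math.copysign(magnitude, perturbation)


if __name__ == "__main__":
    print(signed_neg_log10_p(0.001, -2.5))  # strong decrease
    print(signed_neg_log10_p(0.01, 0.4))    # moderate increase
```

Large positive or negative values therefore indicate significant perturbations in opposite directions, while values near zero indicate no detectable effect.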
Data_File_S6.xlsx
Latent space statistics
This file contains multiple tabs:
UMAP statistics and entropy: summary statistics of UMAP dynamics and noise and average prediction uncertainty per mutant
● NCells: average number of cells per time point
● UMAP_30_SD, UMAP_22_SD: average time-wise bivariate standard deviation for UMAP populations
● UMAP_30_flux, UMAP_22_flux: average population UMAP flux
● UMAP_30_flux_norm, UMAP_22_flux_norm: SD-normalized average population UMAP flux
● Entropy_30, Entropy_22: average binary entropy for DNN predictions
LS and UMAP statistics: population standard deviation and displacement (flux) per time point for latent space vectors and UMAP coordinates
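The Entropy_30 and Entropy_22 columns report the average binary entropy of the DNN predictions. The sketch below shows how such a value can be computed from per-cell autophagy probabilities. It assumes entropy in bits (log base 2); the base used in the study is not stated here, and the function names are illustrative.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli prediction with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully certain prediction carries no entropy
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def mean_entropy(probabilities):
    """Average binary entropy over a collection of predictions."""
    return sum(binary_entropy(p) for p in probabilities) / len(probabilities)


if __name__ == "__main__":
    print(mean_entropy([0.5, 1.0, 0.9]))
```

Entropy peaks at p = 0.5, so a high average marks mutants whose cells the classifier finds ambiguous.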
Data_File_S7.xlsx
Autophagy Bayes factors
This file contains multiple tabs:
Bayes factors overall
Bayes factors -N
Bayes factors +N
Bayes factors per time
Column information
● log_BF_WT.ATG1_30, log_BF_WT.VAM6_30, log_BF_VAM6.ATG1_30, log_BF_WT.ATG1_22, log_BF_WT.VAM6_22, log_BF_VAM6.ATG1_22: Time-invariant Bayes factors computed from time-wise reference kernel densities, averaged overall or per phase (-N, +N)
● log_BFt_WT.ATG1_30, log_BFt_WT.VAM6_30, log_BFt_VAM6.ATG1_30, log_BFt_WT.ATG1_22, log_BFt_WT.VAM6_22, log_BFt_VAM6.ATG1_22: Time-wise Bayes factors computed from fixed reference density kernels, averaged overall or per phase (-N, +N), or averaged per time point
● NCells: total number of cells
● P1_adj_30, P1_adj_22, P1_binary_adj_30, P1_binary_adj_22: plate median adjusted average autophagy % response or average binarized % class predictions
● Red, Green, GxR: mean fluorescent signal or co-occurrence per mutant early in the starvation protocol
● qRed, qGreen: outlier cutoff for mean fluorescent signal per mutant early in the starvation protocol
● Red.late, Green.late, GxR.late: mean fluorescent signal or co-occurrence per mutant late in the starvation protocol
● Bayes factors overall: Additional column
● qoutliers_red, qoutliers_green: mutant flagged as outlier if Red < qRed (outlier cutoff) and if Green < qGreen (outlier cutoff) in Bayes factors overall
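The log_BF columns compare how well two reference densities (for example WT and ATG1) explain a mutant's UMAP coordinates. As a rough sketch of the idea, the function below averages per-cell log-likelihood ratios given density values that have already been evaluated. The averaging rule, the natural logarithm, and the numerical floor are assumptions made for illustration, not details confirmed by the dataset description.

```python
import math


def mean_log_bayes_factor(dens_a, dens_b, floor=1e-300):
    """Average per-cell log likelihood ratio log(f_a(x) / f_b(x)).

    dens_a, dens_b: density values of the same cells under reference
    states A and B. `floor` (hypothetical) guards against log(0).
    """
    logs = [
        math.log(max(a, floor)) - math.log(max(b, floor))
        for a, b in zip(dens_a, dens_b)
    ]
    return sum(logs) / len(logs)


if __name__ == "__main__":
    # Toy densities: the cells are better explained by state A.
    print(mean_log_bayes_factor([0.2, 0.4], [0.1, 0.1]))
```

A positive value favors the first reference state and a negative value favors the second.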
Data_File_S8.xlsx
Bayes factors GSE and z-scores
This file contains multiple tabs:
BF GSE overall: parametric GO enrichment statistics (using piano) for BFt z-scores
BF GSE phase comp.: parametric GO enrichment statistics (using piano) for BF z-scores per phase or differential z-score between phases (starvation - replenishment)
BF z-scores overall: z-scores of log Bayes factors averaged between models 30 and 22, overall
BF z-scores phase comp.: z-scores for log Bayes factors averaged between models 30 and 22, per phase and differential between phases (starvation - replenishment)
Data_File_S9.xlsx
Nearest neighbour and GO selection
This file contains multiple tabs:
Nearest neighbour selection: Dataset of summary statistics from the nearest neighbor enrichment of autophagy phenotypes using different yeast networks
● HMP, FDR_min_pert: mutant autophagy phenotype statistics
● HMP_Fisher: summary statistics for nearest neighbor autophagy phenotype enrichment
● pmin_Fisher, pmin.adj_Fisher: minimum p-value and adjusted p-value for nearest neighbor autophagy phenotype enrichment
● Data, Set, Count: Network, phenotype set, and nearest neighbor count with the most significant enrichment
GO selection: Autophagy GO term annotation of mutants
● HMP, FDR_min_pert: mutant autophagy phenotype statistics
● N, Description: Number and name of GO terms per mutant
Data_File_S10.xlsx
Autophagy_reproducibility_validation
This file contains multiple tabs:
Repetition responses: Autophagy response % for repetition screen mutants with corresponding GW screen measurements
Validation responses: Autophagy response % for validation screen mutants with corresponding GW screen measurements
Validation statistics: Double sigmoidal regression statistics for validation screen mutants
Data_File_S11.xlsx
Rapamycin screen results
This file contains multiple tabs:
Autophagy % DNN predictions: autophagy response % for rapamycin screen mutants under the different treatment conditions
UMAP Bayes factors: overall time-wise UMAP Bayes factors for rapamycin screen mutants under the different treatment conditions.
Data_File_S12.xlsx
WB, Pho8d60 statistics, EM area, APE1 assay
This file contains multiple tabs:
GFP measures: GFP cleavage % computation for all experiments. Band % is computed for two replicate recordings (different exposure times) and averaged.
GFP quant.: GFP cleavage % quantifications across biological replicas
GFP stats._RTGs: GFP cleavage statistics for RTG experiments using mixed effects models with replica ID as grouping variable. The table shows analysis results from two experiments under Experiment_id. Contrasts between specific conditions; the baseline reference is indicated.
GFP_BF comp.: comparing average GFP cleavage (0-8 hrs) with overall Bayes factors from the GW screen.
GFP_BF time comp.: comparing time-wise GFP cleavage with Bayes factors.
GFP_parameter comp.: comparing average GFP cleavage (0-8 hrs) with response kinetic parameters from the GW screen.
Rps6_Act1 measures: Rps6-p and Act1 band volume integrals. For Rps6 the values were averaged across two replicate recordings (different exposure times).
Rps6 quant.: quantification of normalized Rps6-p signal across biological replicas. GFP cleavage % for equivalent experiments are also shown.
Rps6_BF comp.: comparing Rps6-p with GFP cleavage and overall Bayes factors from the GW screen.
Pho8d60 measures WTvsMut: computation of normalized Pho8d60 activities for all experiments.
Pho8d60 stats. WTvsMut: statistics of RTG effects on normalized Pho8d60 activities using ANOVA and TukeyHSD test. The adjusted p-values are shown for pairwise comparisons between the mutants and the negative control (WT) for each treatment phase.
Pho8d60 measures DKOvsRTG1: computation of normalized Pho8d60 activities for WT, RTG1 deletion, ATG1 deletion and DKO mutants.
Pho8d60 stats DKOvsRTG1: statistics of RTG and ATG1 effects on normalized Pho8d60 activities using ANOVA and TukeyHSD test. The adjusted p-values are shown for pairwise comparisons between the mutants and the negative control (WT) for each treatment phase.
EM AB area: autophagic body (AB) area quantifications in electron micrographs.
Ape1 WB quant.: quantifications of normalized Ape1 band volume integrals relative to Act1 controls across biological replicas. Quantifications of Ape1 processing % are also shown. For each metric, a mean and standard error are computed.
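The GFP measures tab reports a GFP cleavage % per lane, averaged over two replicate recordings. The sketch below uses the conventional definition, free GFP as a percentage of free GFP plus GFP-Atg8 band signal. That definition is an assumption here, and the function and argument names are illustrative.

```python
def gfp_cleavage_percent(free_gfp, gfp_atg8):
    """Free GFP as a percentage of the total GFP signal in one lane."""
    return 100.0 * free_gfp / (free_gfp + gfp_atg8)


def averaged_cleavage(recordings):
    """Average cleavage % over replicate recordings of the same blot.

    recordings: iterable of (free_gfp, gfp_atg8) band intensities,
    e.g. one pair per exposure time.
    """
    values = [gfp_cleavage_percent(free, fused) for free, fused in recordings]
    return sum(values) / len(values)


if __name__ == "__main__":
    print(averaged_cleavage([(30.0, 70.0), (50.0, 50.0)]))
```

Computing the percentage per recording before averaging keeps exposure-dependent intensity differences from dominating the result.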
Data_File_S13.xlsx
RT-qPCR statistics
This file contains multiple tabs:
Input data: computation of normalized expression values for all experiments, including ComBat batch correction.
Statistics pairwise: t-test statistics for pairwise comparisons between mutants and negative controls (WT) for each time point.
Statistics per phase: statistics for comparing overall mutant effects per treatment phase (+N T0, -N, and +N).
Data_File_S14.xlsx
RF_importance scores
This file contains multiple tabs:
Importance scores: sumGain values for features predicting TF autophagy responses. Input represents the input dataset that was used; Data indicates whether the sample labels were randomized (control).
SHAP scores: average SHAP values computed for input values representing up-regulation or down-regulation of gene expression for a specific feature.
SHAP stats: summary statistics of SHAP values, including mean and standard error for upregulated, downregulated, and differential gene expression across the different input datasets, autophagy responses, and stages.
Data_File_S15.zip
Support data (data folder)
This folder contains the .RData files containing compressed data for various analyses performed in the study.
Training data: Curated reference dataset used for optimization and training of DNN models. The dataset has been scaled. The subset of the dataset used for training of the DNN models is indicated as a logical column vector, and the state labels for this subset are indicated in a separate column (0 = no autophagy; 1 = autophagy). Vectors of the feature names, and mean and sd scale factors are also included.
Training features and scale factors: Contains vectors of the feature names, and mean and sd scale factors.
Network data: Contains datasets with the network data used in the study.
● df_BioGRID_PPI: protein-protein interactions. Oughtred et al. 2016
● df_complexes: parsed complexome annotation data. Meldal et al. 2021
● df_SGA_PCC: complete genetic similarity matrix from Costanzo et al. 2016
● df_SGA_IDs: dataset for mapping alleles to ORFs in Costanzo et al. 2016
● df_STRING_v10: STRING version 10 with protein-protein interactions predicted by a combined_score ≥ 200. Szklarczyk et al. 2015
● ORF2SYMBOL
● UNIPROT2ORF
Note: NaN in df_SGA_PCC indicates values that are not available, as originally provided by Costanzo et al. 2016.
Cross-omics benchmarking datasets: Contains cleaned and matched input omics datasets from various sources and output variables based on autophagy perturbation parameter statistics (signed -log p-values) or Bayes factors:
● X_gex: scaled gene expression data from Kemmeren et al. 2014
● X_prot: differential protein expression levels (log2-fold) from Messner et al. 2023
● X_met: metabolomics dataset of amino acid levels from Mülleder et al. 2016
● X_SGA_PCC: genetic similarity dataset from Costanzo et al. 2016
● X_STRING_PCC: STRING v10 protein similarity dataset computed from the PCC correlations across the binary interaction matrix of STRING interactions with a combined_score ≥ 200. Szklarczyk et al. 2015
● Y_parm: transformed kinetic perturbation p-values
● Y_BF: time-wise Bayes factors (BFt) per mutant overall or per phase
Hackett_imputed data paired: Contains cleaned and matched datasets of adjusted expression dynamics (X; Hackett et al. 2020) and autophagy % or BFt perturbation dynamics, annotated for TFs and matching pseudotimes.
Data_File_S16.zip
Deep learning tools: DNNs for predicting autophagy, pUMAP embedding of autophagic features, and density kernels for UMAP reference states (data folder)
This folder contains the following objects:
Tensorflow HDF5 files of DNNs for classifying autophagic cell states from image features (p=31):
● TF_Model_WT_ID22_May_2021.h5
● TF_Model_WT_ID30_May_2021.h5
Tensorflow HDF5 files of DNN-based fast pUMAP embedding of latent space feature vectors. The models take the last hidden layers from the corresponding autophagy classifiers as inputs and transform them into two-dimensional UMAP coordinates:
● TF_pUMAP_Model_WT_ID22_August_2021.h5
● TF_pUMAP_Model_WT_ID30_August_2021.h5
UMAP density kernels: .RData files containing lists of density kernels for UMAP reference states (WT, VAM6, and ATG1). Density kernels are provided for the output of each pUMAP model, using both fixed and time-wise reference distributions. For likelihood computation, use the dkde function from the ks package in R, with x set to the corresponding UMAP coordinates and fhat set to the specific density kernel.
Data_File_S17.zip
Genome-wide library of single cell autophagy phenotype data (data folder)
This folder contains the .RData files containing compressed single-cell screen data. Each data file contains the results for each 96-well plate divided into four R data frames:
● X.data: Plate, mutant, treatment information, time and well coordinates for every single-cell entry corresponding to rows in the other data frames.
● X: Scaled single-cell image features used as input for the DNN models.
● Y: Predicted autophagy (P1) per cell for each DNN model (22 and 30)
● Y.umap: Predicted UMAP coordinates representing the latent autophagy feature space per cell for each DNN model (22 and 30)
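Because the four data frames are row-aligned, per-cell predictions in Y can be summarized using the metadata in X.data. The sketch below illustrates that pattern on rows that have already been exported from the .RData files. Reading those files requires R or a compatible reader and is not shown. The field names "mutant" and "time" and the optional classification threshold are hypothetical and should be replaced with the actual column names and cutoff.

```python
from collections import defaultdict


def percent_autophagy(meta_rows, p1_values, threshold=None):
    """Mean autophagy % per (mutant, time) from row-aligned inputs.

    meta_rows: one dict per cell (stand-in for X.data).
    p1_values: predicted autophagy probability per cell (stand-in for Y).
    threshold: if given, cells are binarized before averaging.
    """
    if len(meta_rows) != len(p1_values):
        raise ValueError("metadata and predictions must have the same number of rows")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for meta, p1 in zip(meta_rows, p1_values):
        key = (meta["mutant"], meta["time"])
        sums[key] += p1 if threshold is None else float(p1 >= threshold)
        counts[key] += 1
    return {key: 100.0 * sums[key] / counts[key] for key in sums}


if __name__ == "__main__":
    meta = [
        {"mutant": "WT", "time": 0},
        {"mutant": "WT", "time": 0},
        {"mutant": "WT", "time": 1},
    ]
    print(percent_autophagy(meta, [0.2, 0.4, 0.9]))
    print(percent_autophagy(meta, [0.2, 0.4, 0.9], threshold=0.5))
```

The length check matters because a misaligned export would silently assign predictions to the wrong cells.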
Data_File_S18.xlsx
Vesicle tracking and quantification
This file contains multiple tabs:
RTG measures: table of FIJI object quantifications for vesicles and cells, which have been tracked and assigned unique vesicle and cell trace IDs, respectively. Every vesicle has been assigned to a specific cell. Overlap coefficients per cell and vesicle have been computed.
RTG stat. area: t-test statistics for pairwise comparisons of RTG1 deletion effects on log area (a.u.) in WT and VAM3 deletion mutants.
RTG stat. counts: statistics for autophagosome counts per time point in WT, RTG1, VAM3, and DKO mutants.
R_code_Chica_et_al.zip
This zip contains R scripts for the data analysis and figures presented in Chica et al., covering:
- High-content fluorescent imaging
- Automatic image analysis using FIJI/ImageJ
- Deep learning analysis
- Statistical analysis
- Network analysis
- UMAP latent-space and Bayes factors analysis
- Experimental validations and autophagy flux analysis using Western blotting analysis, quantitative Pho8Δ60 assay, and qPCR analysis
- Cross-omics analysis using Random forest
- Chica, Nathalia; Andersen, Aram N.; Orellana-Muñoz, Sara et al. (2024). Time-resolved functional genomics using deep learning reveals a global hierarchical control of autophagy [Preprint]. Cold Spring Harbor Laboratory. https://doi.org/10.1101/2024.04.06.588104
