Kinetic modules in biochemical networks / Upstream Algorithm

Langary, Damoun; Kueken, Anika; Nikoloski, Zoran

Published Mar 05, 2025 on Dryad. https://doi.org/10.5061/dryad.7pvmcvf4v

Data files

Mar 05, 2025 version files (1.89 GB)

- README.md (10.26 KB)
- Upstream_Algorithm_Dryad.zip (1.89 GB)

Abstract

Modules represent fundamental building blocks of cellular networks, and are thought to facilitate robustness of phenotypes against perturbations. While reaction kinetics shape the concentrations of components and the reaction rates, their use in the identification of modules requires knowledge of parameter values. Here we demonstrate that kinetic modules can be efficiently identified based on steady-state reaction rate couplings in large-scale biochemical networks endowed with mass action kinetics without knowledge of parameter values. We then link the kinetic modules of metabolic networks with robustness of metabolite concentrations to perturbations. Analyzing 34 metabolic network models of 26 organisms, we demonstrate that the ordered binding enzyme mechanism leads to increased concentration robustness compared to random binding. Our findings pave the way for the use of modules in synthetic biology and biotechnological applications.

The repository contains code, data, and (intermediate) results that allow the identification of:

- balanced complexes
- concordant complexes
- kinetic modules

The identification of kinetic modules allows one to find metabolites with absolute concentration robustness and metabolite pairs with absolute concentration ratio robustness in large-scale metabolic networks.

Dependencies

- Matlab (tested with 2023b / 2024a)
- R (tested with R-4.3.0), packages igraph and R.matlab
- COBRA toolbox (https://opencobra.github.io/cobratoolbox/stable/index.html)

To compare results with full coupling based on stoichiometry:
- F2C2 tool (https://pubmed.ncbi.nlm.nih.gov/22524245/) (used with glpk)

Main functions and scripts

Extract Upstream_Algorithm_Dryad.zip and change into the folder Upstream_Algorithm_Dryad/Upstream_Algorithm/. Add the required tools (e.g. cobratoolbox) to your current directory. To run the code, either download the models from Zenodo, or create the folder Models/original/ and place in it the .xml files of the models that you want to analyze.

To run the whole pipeline

- Execute run_kinetic_module_workflow.m.
- The script also adds the folder Code/ and its subfolders to Matlab's search path.
- The script executes the workflow for the Arabidopsis Core Model located in Models/original.
- To run the code for another model, modify the variables files and f as described in the script.

Models

The original metabolic models, downloaded from the BiGG database or from their original publications, are available in the related supplementary information on Zenodo.

The kinetic model of E. coli can be downloaded from its original publication, DOI: 10.1038/ncomms13806
An overview of the original publications is provided in Overview_model_original_publications.xlsx

Model reactions were split assuming a fixed order of substrate binding or random binding
The resulting models after preprocessing and splitting can be found in Models/models_with_elementary_steps

Code

Functions to run individual steps, e.g. the identification of balanced and concordant complexes, can be found in the folder Code/.

The file run_kinetic_module_workflow.m executes the functions given under Code/. Of these, run_preprocessing.m is called first and in turn executes the functions provided in the folder Code/preprocessing/. The folder Code/balanced_concordant_complexes/ includes functions related to the identification of balanced and concordant complexes. The folder Code/kinetic_modules/ includes the functions used to identify and summarize the kinetic modules in metabolic networks.

Functions inside the subfolders of Code/ do not depend on each other and can be used separately, with one exception: Code/preprocessing/get_AY_matrix.r depends on Code/preprocessing/deficiency_sparse.

Results

The folder Results/ includes the following subfolders and files:

- concordant/
  - The files include the model structure after splitting, the set of balanced complexes (B), and the set of concordant complexes (CC).
  - For details on model variables, check the comments in the example workflow or in the individual functions.
- Figures/
  - Code to generate the result figures.
- MetDouble/ and MetSingle/
  - Metabolite pairs with absolute concentration ratio robustness and metabolites with absolute concentration robustness, respectively.
  - MetDouble_* and MetSingle_* also consider enzymes and enzyme-substrate complexes.
  - M_MetDouble_* and M_MetSingle_* include free metabolites only (obtained by running the script Results/write_M_MetSingle_M_MetDouble.m).
  - If the generated .csv file in the folder MetSingle/ or MetDouble/ is empty, no metabolite with absolute concentration robustness or metabolite pair with absolute concentration ratio robustness was found for the respective network.
- Overview/
  - Combined, the files yield Supplementary Table 1 of the related manuscript (Table_ED-1.xlsx). Table_ED-1.xlsx includes the following information:
    - general information (columns A-H, no color)
    - overview of results related to kinetic modules and their size (columns I-P, green color)
    - overview of results related to absolute concentration (ratio) robustness (columns Q-X, orange color)
    - overview of results from stoichiometric coupling (columns Y-AD, blue color)

Column	Information
A	Model number for counting
B	Binding type considered (ordered or random)
C	Model name
D	Organism
E	Number of complexes in the model
F	Number of reactions in the model
G	Number of free metabolites in the model
H	Number of free metabolites, enzymes, and enzyme-metabolite complexes in the model
I	Number of kinetic modules
J	Number of kinetic modules with one complex
K	Maximum size (number of complexes) of kinetic module
L	Average size of kinetic modules
M	Standard deviation of size of kinetic modules
N	% of complexes from all model complexes found in largest kinetic module
O	Number of reactions whose substrate is in the largest kinetic module
P	% of reactions with substrate in largest kinetic module from all model reactions
Q	Number of components with absolute concentration robustness
R	% of components with absolute concentration robustness from all components (see column H)
S	Number of component pairs with absolute concentration ratio robustness
T	% of component pairs with absolute concentration ratio robustness from all unique combinations of two components
U	Number of free metabolites with absolute concentration robustness
V	% of free metabolites with absolute concentration robustness from all free metabolites (see column G)
W	Number of free metabolite pairs with absolute concentration ratio robustness
X	% of free metabolite pairs with absolute concentration ratio robustness
Y	Number of modules identified from stoichiometric coupling
Z	Number of modules with one complex identified from stoichiometric coupling
AA	Maximum size of module identified from stoichiometric coupling
AB	Average size of modules identified from stoichiometric coupling
AC	Standard deviation of size of modules identified from stoichiometric coupling
AD	Ratio of the size of the largest kinetic module to the size of the largest module identified from stoichiometric coupling
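
As an illustration of how the module-size columns I to N relate to one another, the following minimal Python sketch computes them from a list of module sizes. It is not part of the repository (which uses Matlab and R). The function name `summarize_modules`, the toy sizes, and the choice of the sample standard deviation are assumptions made for this example, and it assumes that every complex belongs to exactly one module.

```python
from statistics import mean, stdev

def summarize_modules(module_sizes, n_complexes):
    """Summarize module sizes in the spirit of columns I-N (illustrative only)."""
    largest = max(module_sizes)
    return {
        "n_modules": len(module_sizes),                          # column I
        "n_singletons": sum(1 for s in module_sizes if s == 1),  # column J
        "max_size": largest,                                     # column K
        "mean_size": mean(module_sizes),                         # column L
        # Assumption: sample standard deviation; the source does not say which one is used.
        "sd_size": stdev(module_sizes) if len(module_sizes) > 1 else 0.0,  # column M
        "pct_in_largest": 100.0 * largest / n_complexes,         # column N
    }

# Hypothetical toy network: 10 complexes partitioned into 4 modules.
summary = summarize_modules([6, 2, 1, 1], n_complexes=10)
```

In this toy case the largest module holds 6 of the 10 complexes, so the value corresponding to column N is 60%.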

The script CreateResultTable.m combines the Overview_*.csv results for the individual networks into one joint table.
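
The combining step can be pictured with the following Python sketch. It is not a port of CreateResultTable.m. It assumes that each Overview_*.csv file has a header row and comma-separated values, which the source does not state; the function name `combine_overviews`, the column names, and the toy files are hypothetical.

```python
import csv
import glob
import os
import tempfile

def combine_overviews(folder, pattern="Overview_*.csv"):
    """Stack the rows of all matching CSV files, tagging each row with its source file."""
    rows = []
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = os.path.basename(path)
                rows.append(row)
    return rows

# Toy demonstration with two hypothetical single-row files.
with tempfile.TemporaryDirectory() as tmp:
    for name, n_modules in [("Overview_modelA.csv", 12), ("Overview_modelB.csv", 7)]:
        with open(os.path.join(tmp, name), "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["model", "n_modules"])    # hypothetical header
            writer.writerow([name[9:-4], n_modules])   # strip "Overview_" and ".csv"
    combined = combine_overviews(tmp)
```

Keeping the source file name with each row makes it possible to trace an entry of the joint table back to the network it came from.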

- Reactions_Giant/
  - List of reaction indices that belong to the giant kinetic module.
- stoichiometric coupling
  - Matrices obtained from F2C2.
- Files *.RData in Results/
  - Results of running the Upstream algorithm in Code/kinetic_modules/code_kineticModule_analysis.R.
- The script write_M_MetSingle_M_MetDouble.m in the Results/ folder filters the free metabolites from the set of all identified components (enzymes, enzyme-substrate complexes, free metabolites) with absolute concentration robustness, or pairs with absolute concentration ratio robustness.
- The script kinetic_E_coli_GEM_essential_giant.m checks whether the largest kinetic module found in the kinetic model of E. coli shows a reduced number of essential reactions (Fisher exact test) in comparison to the reactions that are not part of that module.
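
For readers unfamiliar with the Fisher exact test used in the essentiality check, the following Python sketch shows a one-sided version on a 2x2 table. It is not the code of kinetic_E_coli_GEM_essential_giant.m. Whether the script uses a one-sided or two-sided test is not stated here, so the one-sided form is an assumption; the function name `fisher_exact_less` is hypothetical and the counts are invented and not taken from the dataset.

```python
from math import comb

def fisher_exact_less(a, b, c, d):
    """One-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X <= a) under the hypergeometric null, i.e. the probability of
    seeing at most `a` essential reactions inside the module by chance.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    lowest = max(0, row1 + col1 - total)  # smallest feasible count in cell a
    denom = comb(total, row1)
    return sum(comb(col1, k) * comb(total - col1, row1 - k)
               for k in range(lowest, a + 1)) / denom

# Hypothetical counts, not taken from the dataset:
# rows = in module / not in module; columns = essential / non-essential.
p_value = fisher_exact_less(1, 9, 8, 2)
```

A small p-value in this setting would indicate that the module contains fewer essential reactions than expected if essentiality were independent of module membership.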

Note: intermediate results of a few models are missing, as they were too large to be uploaded.

Access information

Other publicly accessible locations of the data:

https://github.com/ankueken/Upstream_Algorithm