Matlab code to calibrate a structured-PDE model to data from in vitro experiments
Cite this dataset
Villa, Chiara (2023). Matlab code to calibrate a structured-PDE model to data from in vitro experiments [Dataset]. Dryad. https://doi.org/10.5061/dryad.1rn8pk118
Abstract
Matlab code for the calibration of a PDE model of evolutionary dynamics of a well-mixed population of aggressive breast cancer cells from in vitro data on MCF7-sh-WISP2 cell line and bootstrapping for uncertainty quantification. For more details, see the associated publication: "Evolutionary dynamics of glucose-deprived cancer cells: insights from experimentally-informed mathematical modelling", by L. Almeida, J. Denis, N. Ferrand, T. Lorenzi, A. Prunet, M. Sabbah, C. Villa (corresponding author, author of code), 2024. The manuscript is published in the Journal of the Royal Society Interface, https://doi.org/10.1098/rsif.2023.0587
README: Matlab code to calibrate a structured-PDE model to data from in vitro experiments (MLE+bootstrapping)
https://doi.org/10.5061/dryad.1rn8pk118
Matlab code used to calibrate (and simulate) the partial differential equation model proposed in [1] with the experimental data described in the same manuscript, as well as to perform uncertainty quantification via a bootstrapping algorithm. See [1] and its Supplementary Material for more details.
[1] Evolutionary dynamics of glucose-deprived cancer cells: insights from experimentally-informed mathematical modelling, L. Almeida, J. Denis, N. Ferrand, T. Lorenzi, A. Prunet, M. Sabbah, C. Villa, Journal of the Royal Society Interface, 21(210), 20230587, 2024. http://doi.org/10.1098/rsif.2023.0587
\
Also available on ArXiv [arXiv:2301.08023] and HAL [hal-03947209v2]
Description of the data and file structure
File 'main_Calibration': main file to conduct main calibration of model with data. Calls 'fun_ExpData', 'fun_ParDomain' and 'fun_iterMLE'.
File 'fun_ExpData': function to store experimental data (2 data sets, averages and standard deviation).
File 'fun_ParDomain': function to store lower and upper bounds of parameter domains in which to conduct calibration. Please make sure that the order in which they are stored reflects that in which they are read whenever the function is called.
File 'fun_iterMLE': function called to find maximum likelihood estimate (MLE) in a given parameter range iteratively, that is by: calling 'fun_MLE', shrinking the domain of the parameters around the MLE found, and repeating up to Niter times. Neval can be modified to choose the maximum number of evaluations each time 'fun_MLE' is called. Recommended for large parameter domains (otherwise set Niter=1 in 'main_Calibration').
File 'fun_MLE': function called to find maximum likelihood estimate (MLE) in a given parameter range using the Matlab builtin function 'bayesopt'. Contains the function minimised by bayesopt, 'fun_mLikelyhood', storing the definition of the likelihood. The function 'fun_mLikelyhood' calls 'fun_SimPDE'.
File 'fun_SimPDE': function to simulate the PDE model for a given parameter set and returning the summary statistics. Please ensure that the type and order of summary statistics returned matches those they are compared with in 'fun_mLikelyhood' at the end of the 'fun_MLE' file.
File 'main_UQ': main code to run in order to conduct uncertainty quantification using bootstrapping. Calls 'fun_ExpData', 'fun_ParDomain' and 'fun_iterMLE', in a similar fashion to 'main_Calibration'. Moreover calls 'fun_resample' to resample data for bootstrapping (or correspondent saved data in .mat files provided), and 'fun_SimPDE' for postprocessing (plotting calibrated model predictions and empirical 95% confidence interval against data used to calibrate the model). Also calls inbuilt Matlab functions 'ksdensity' and 'randsample' during post-processing.
Further details of each file (methodology, input requirements, etc) are included at the top of each file.
Data files ('.mat' extension):
- 'OPS_ALT_rescue': Optimal Parameter Set obtained applying the above procedure to the main model presented in [1]. The vector 'y' stores the values in the OPS of parameters [gammag,gammal,alphaG,alphaL,d,kappaG,kappaL,etaL,beta,lambdaL,yL,yH,zeta,nh,mh], and 'yval' stores the corresponding value of the Weighted Sum of Squared Residuals R (see also column 3 of Table S1 of Supplementary Material in [1]).
- 'OPS_NODIFF_0920': Optimal Parameter Set obtained applying the above procedure to the model without diffusion (\Phi=0) in [1]. The vector 'y' stores the values in the OPS of parameters [gammag,gammal,alphaG,alphaL,d,kappaG,kappaL,etaL,lambdaL,yL,yH,nh,mh], and 'yval' stores the corresponding value of the Weighted Sum of Squared Residuals R (see also column 4 of Table S1 of Supplementary Material in [1]).
- 'OPS_NODRIFT_0920': Optimal Parameter Set obtained applying the above procedure to the model without drift (\Psi=0) in [1]. The vector 'y' stores the values in the OPS of parameters [gammag,gammal,alphaG,alphaL,d,kappaG,kappaL,etaL,beta,yL,yH,zeta,nh,mh], and 'yval' stores the corresponding value of the Weighted Sum of Squared Residuals R (see also column 3 of Table S1 of Supplementary Material in [1]).
- 'OPS_ALT_rescue': Optimal Parameter Set obtained applying the above procedure to the main model presented in [1] when calibrated also using data from the Rescue experiment. The vector 'y' stores the values in the OPS of parameters [gammag,gammal,alphaG,alphaL,d,kappaG,kappaL,etaL,beta,lambdaL,yL,yH,zeta,nh,mh], and 'yval' stores the corresponding value of the Weighted Sum of Squared Residuals R (see also column 2 of Table S3 of Supplementary Material in [1]).
- 'ResampledData': saved matrix containing resampled data sets for bootstrapping, as produced by the function 'fun-resample' called in the file 'main_UQ'. Using 'load('ResampledData.mat')' instead of '[perm_U] = fun_resample(Ud1,Ud2);' in the code saves computational time.
- 'ResampledMLE_par': stores matrix with OPS obtained performing bootstrapping on the main model in [1]. This was obtained in the 'Iterate and get bootstrapping MLE' section of the 'main_UQ' file. Commenting this out and ensuring all lines with 'load('ResampledMLE_par.mat')' are not commented saves computational time when running the post-processing part of the file.
Sharing/Access information
Links to other publicly accessible locations of the data:
Code/Software
The code is provided in Matlab. Requires Statistics and Machine Learning Toolbox and Bayesian Optimization Toolbox.
If you use this software (available on the linked Zenodo repository under the GNU GPL license) in your work then please cite [1] and this repository.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details, https://www.gnu.org/licenses/.
Methods
Experimental data on MCF7-sh-WISP2 cells are used to carry out model calibration through a likelihood-maximisation method. In summary, the optimal parameter set (OPS) is obtained, through an iterative process, by minimising the weighted sum of squared residuals, employing the average and standard deviation of summary statistics from the experimental data, exploiting the in-built Matlab function bayesopt, which is based on Bayesian Optimisation. At each iteration, the PIDE-ODE system that constitutes the model is solved numerically using the FTCS method. Uncertainty quantification of the OPS was carried out by means of a bootstrapping algorithm, based on random sampling of data with replacement and particularly suited when only a few data are available. See the Supplementary Material of the associated publication for more details.
Funding
European Research Council, Award: grant agreement No 740623, European Union’s Horizon 2020 research and innovation
Henri Poincaré Institute