Data from: A transfer learning-based hybrid surrogate modeling framework for efficient multi-objective seismic design of long-span cable-stayed bridges
Data files
Mar 12, 2026 version, 3.98 MB in total
- README.md (16.38 KB)
- Seismic_Bridge_Design_Dataset.zip (3.96 MB)
Abstract
This dataset contains the complete set of computational models, source code, and reference data supporting the research presented in the associated article titled “A transfer learning-based hybrid surrogate modeling framework for efficient multi-objective seismic design of long-span cable-stayed bridges.” It is organized into five main components. First, it includes finite element models consisting of SAP2000 models (version V10) for two long-span cable-stayed bridges, which serve as the high-fidelity simulation basis. Second, it provides neural network surrogate models with MATLAB (R2024b) source code used to construct and train three types of surrogate models: Backpropagation Neural Network (BPNN), Radial Basis Function Network (RBFN), and Generalized Regression Neural Network (GRNN); a pre-trained RBFN model for Bridge M1 is also included. Third, the dataset contains a transfer learning module implemented in MATLAB, which enables adaptation of the surrogate model trained for Bridge M1 to Bridge M2. Fourth, it includes a hybrid optimization framework in MATLAB for conducting multi-objective seismic design optimization while integrating the surrogate models. Fifth, the dataset provides parameter sampling data with MATLAB scripts for performing Latin Hypercube Sampling on Fluid Viscous Damper (FVD) parameters. All code is executable with clearly defined dependencies, and the included PEER ground motion information table supports replication of the seismic input data. Overall, this dataset enables full reproduction of the study’s numerical experiments and offers a reusable computational framework for researchers and engineers working on efficient seismic performance assessment and design optimization of long-span bridges.
https://doi.org/10.5061/dryad.5mkkwh7k9
Data Files
Seismic_Bridge_Design_Dataset.zip – Contains all MATLAB scripts, SAP2000 model files (including open formats), training data, and optimization results.
Description of the Data and File Structure
The dataset is organized into five main folders. All missing data are represented as NaN or handled appropriately in the code.
Folder 1: 1 SAP 2000 FEM model
This folder contains the SAP2000 v10 finite element models of two cable-stayed bridges (M1 and M2) and an Excel file with information about the ground motions used in the nonlinear time history analyses.
To improve accessibility, the model files are provided in three formats:
- SAP2000 proprietary format (.SDB) – can be opened directly with SAP2000 v10.
- SAP2000 text format (.s2k) – a plain text file that contains the full model definition. This file can be viewed and edited with any text editor and can be imported into SAP2000 or other structural analysis software that supports the SAP2000 .s2k format.
- Excel format (.xls) – an export of the model data in a tabular layout, readable by Microsoft Excel and open-source spreadsheet applications (e.g., LibreOffice Calc, Google Sheets). This format allows users to inspect model parameters (geometry, material properties, loads, etc.) without specialized software.
The following files are included:
M1.SDB, M1.s2k, M1.xls
SAP2000 model of the reference bridge (M1). It includes:
- Geometry: main span 1088 m, tower height 297.7 m, deck width, etc.
- Material properties for concrete and steel.
- Load cases: dead load, live load, and three earthquake ground motions (El Centro, Kobe, Northridge) with PGA scaled to values between 0.1g and 0.5g.
- The model is designed to be run in batch mode via the SAP2000 OAPI (Open Application Programming Interface) from MATLAB for parametric studies.
M2.SDB, M2.s2k, M2.xls
SAP2000 model of the target bridge (M2), which has a different structural configuration from M1. It is used to generate a small set of high‑fidelity data for transfer learning calibration. The model includes the same types of load cases as M1 but with modified geometry and material properties.
Motion-information.xls
Microsoft Excel file containing metadata and parameters of the earthquake ground motions used in the analyses. It does not contain the actual acceleration time histories but rather information such as:
- Earthquake name (e.g., El Centro, Kobe, Northridge)
- Recording station
- Original PGA (g)
- Duration (s)
- Other relevant characteristics (e.g., predominant period)
This file helps users understand the seismic inputs applied to the bridge models. The actual time history files are not included in the dataset; users must obtain them separately or use the information provided to select compatible records.
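Because the table lists each record's original PGA while the load cases use PGA levels between 0.1g and 0.5g, a record obtained separately has to be rescaled before use. The following is a minimal sketch of linear amplitude scaling, written in Python purely for illustration (the dataset's own scripts are MATLAB). The function name and the sample values are hypothetical, and the scaling procedure used by the authors may differ.

```python
def scale_to_target_pga(accel_g, target_pga_g):
    """Linearly scale an acceleration history (in g) so its peak equals target_pga_g."""
    peak = max(abs(a) for a in accel_g)
    if peak == 0:
        raise ValueError("record has zero peak acceleration")
    factor = target_pga_g / peak
    return [a * factor for a in accel_g], factor

# Hypothetical short record in g; a real record has thousands of samples.
record = [0.00, 0.12, -0.25, 0.18, -0.05]
scaled, factor = scale_to_target_pga(record, 0.3)
```

The same factor applies to every sample, so the frequency content and duration of the record are unchanged; only the amplitude is adjusted.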
Folder 2: 2 Neural network model
This folder contains MATLAB scripts and data used to train three types of neural network surrogates (BPNN, GRNN, RBFN) on the dataset generated from the M1 bridge.
BPNN.m
MATLAB script that trains a Back-Propagation Neural Network (BPNN) surrogate.
- Inputs (7 design variables): PGA (g), damping coefficient C (kN·(s/m)^α), velocity exponent α (-), main span length L (m), tower height H (m), first two natural periods T₁ (s) and T₂ (s).
- Outputs (4 seismic responses): maximum deck displacement D_max (m), base moment M_base (kN·m), maximum damper force F_dmax (kN), and damper stroke δ_d (m).
- The script uses the newff function and trains the network with the Levenberg–Marquardt algorithm (trainlm). The trained network and normalization parameters are saved as trained_model1.mat.
GRNN.m
MATLAB script that trains a Generalized Regression Neural Network (GRNN) surrogate using newgrnn. The input/output structure is identical to that of BPNN.
RBFN.m
MATLAB script that trains a Radial Basis Function Network (RBFN) surrogate. It can use either newrb (incremental) or newrbe (exact). The best-performing RBFN model is saved as trained_model1.mat.
trained_model1.mat
MAT-file containing the trained neural network model (by default, the RBFN model) and the normalization parameters:
- net: the neural network object.
- ps_input: a structure with settings for input normalization (mapminmax).
- ps_output: a structure with settings for output normalization.
M1-DataSet-in.xlsx
Excel file containing the input-output dataset used to train the M1 surrogate. The data were generated by running parametric analyses on the M1 SAP2000 model.
- Columns 1–7: input design variables (PGA, C, α, L, H, T₁, T₂).
- Columns 8–11: output seismic responses (D_max, M_base, F_dmax, δ_d).
- Units are as described above. The dataset contains several thousand samples covering the design space.
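The ps_input and ps_output structures store the settings of a min-max normalization (mapminmax), which by default maps each variable linearly onto [-1, 1]. The sketch below illustrates that mapping and its inverse in Python, purely for illustration (the dataset's own scripts are MATLAB). The function names, the dictionary layout, and the sample rows are hypothetical. Samples are stored as rows here, whereas mapminmax treats each row as one variable, and the handling of constant columns is a choice made for this sketch only.

```python
def fit_minmax(rows, lo=-1.0, hi=1.0):
    """Record per-column min and max so the same scaling can be reused later."""
    cols = list(zip(*rows))
    return {"xmin": [min(c) for c in cols],
            "xmax": [max(c) for c in cols],
            "lo": lo, "hi": hi}

def apply_minmax(rows, ps):
    """Map each column linearly from [xmin, xmax] to [lo, hi]."""
    out = []
    for row in rows:
        scaled = []
        for x, xmin, xmax in zip(row, ps["xmin"], ps["xmax"]):
            if xmax == xmin:
                # Constant column: mapped to the midpoint (sketch-specific choice).
                scaled.append(0.5 * (ps["lo"] + ps["hi"]))
            else:
                scaled.append((ps["hi"] - ps["lo"]) * (x - xmin) / (xmax - xmin) + ps["lo"])
        out.append(scaled)
    return out

def reverse_minmax(rows, ps):
    """Invert apply_minmax to recover values in physical units."""
    out = []
    for row in rows:
        restored = []
        for y, xmin, xmax in zip(row, ps["xmin"], ps["xmax"]):
            if xmax == xmin:
                restored.append(xmin)
            else:
                restored.append((y - ps["lo"]) * (xmax - xmin) / (ps["hi"] - ps["lo"]) + xmin)
        out.append(restored)
    return out

# Hypothetical (PGA, C, alpha) rows, for illustration only.
X = [[0.1, 2000.0, 0.3], [0.3, 13500.0, 0.65], [0.5, 25000.0, 1.0]]
ps_input = fit_minmax(X)
Xn = apply_minmax(X, ps_input)
X_back = reverse_minmax(Xn, ps_input)
```

The important point is that the settings fitted on the training data are reused unchanged when new inputs are passed to the surrogate and when its outputs are converted back to physical units.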
Folder 3: 3 Transfered surrogate model
This folder contains the MATLAB implementation of the transfer learning procedure that adapts the M1 surrogate to the M2 bridge using a limited number of new FEA results from M2.
main_transfer_learning_complete.m
Main script that performs the complete transfer learning calibration.
Workflow:
- Loads the pre-trained M1 RBFN model from trained_model1.mat (assumed to be located in the parent directory or copied here).
- Loads the M2 FEA data from M2_FEA_results.mat (this file is not included in the dataset; users must generate it by running the M2 SAP2000 model with the design points from the sampling scripts).
- Analyzes the M2 data, detects constant features, and computes normalization parameters.
- Evaluates the direct transfer performance (i.e., applying the M1 model directly to M2 data).
- Generates augmented data (500 samples) by perturbing the existing M2 points.
- Extracts the RBF centers and spread from the pre-trained network and re-trains the output layer using ridge regression with an optimized regularization parameter.
- Performs cross-validation to select the best regularization parameter.
- Saves the calibrated model as M2_calibrated_model_complete.mat and the performance comparison as M2_performance_results.mat.
- Produces diagnostic plots (scatter plots of predictions vs. true values) saved as M2_calibration_results.png.
Input required: M2_FEA_results.mat (structure with fields X (N×7) and Y (N×4), or a matrix with 11 columns where the first seven are inputs and the last four are outputs).
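The core of the workflow above is that the RBF centers and spread stay fixed while only the output-layer weights are refitted by ridge regression, with the regularization parameter chosen by cross-validation. The sketch below illustrates that idea in Python for a single output, purely for illustration (the dataset's own script is MATLAB). All names and numbers are hypothetical. It uses a generic Gaussian basis, leave-one-out cross-validation, and a penalty that also covers the bias term; these are assumptions and may not match the choices in main_transfer_learning_complete.m.

```python
import math

def rbf_features(X, centers, spread):
    """Gaussian hidden-layer activations plus a constant bias column."""
    feats = []
    for x in X:
        row = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / spread ** 2)
               for c in centers]
        feats.append(row + [1.0])
    return feats

def solve_linear(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][k] * w[k] for k in range(i + 1, n))) / M[i][i]
    return w

def ridge_output_weights(Phi, y, lam):
    """Closed-form ridge solution w = (Phi^T Phi + lam I)^-1 Phi^T y."""
    m = len(Phi[0])
    A = [[sum(p[i] * p[j] for p in Phi) + (lam if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    b = [sum(p[i] * t for p, t in zip(Phi, y)) for i in range(m)]
    return solve_linear(A, b)

def predict(Phi, w):
    """Linear output layer: weighted sum of the hidden activations."""
    return [sum(p * wi for p, wi in zip(row, w)) for row in Phi]

def loo_error(Phi, y, lam):
    """Mean leave-one-out squared error for one candidate regularization value."""
    err = 0.0
    for i in range(len(Phi)):
        w = ridge_output_weights(Phi[:i] + Phi[i + 1:], y[:i] + y[i + 1:], lam)
        err += (predict([Phi[i]], w)[0] - y[i]) ** 2
    return err / len(Phi)

# Hypothetical normalized inputs, targets, centers and spread (not from the dataset).
X = [[-1.0, -1.0], [-0.5, 0.2], [0.0, 0.0], [0.6, -0.3], [1.0, 1.0]]
y = [0.2, 0.5, 0.6, 0.9, 1.4]
centers = [[-0.5, -0.5], [0.5, 0.5]]
Phi = rbf_features(X, centers, spread=1.0)

candidates = [1e-6, 1e-3, 1e-1, 10.0]
best_lam = min(candidates, key=lambda lam: loo_error(Phi, y, lam))
w = ridge_output_weights(Phi, y, best_lam)
```

With four outputs, the same weight solve would be repeated once per output column. Keeping the hidden layer fixed means only a small linear system has to be solved, which is why a limited number of M2 results can be enough for calibration.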
Folder 4: 4 Hybrid Surrogate optimization Framework
This folder contains the code for multi-objective optimization of FVD parameters (C, α) using the enhanced surrogate model obtained after transfer learning.
main_FVD_optimization.m
MATLAB script that performs NSGA-II multi-objective optimization to minimize the two conflicting objectives: D_max and F_dmax, subject to constraints on M_base, δ_d, and F_dmax.
Workflow:
- Loads the enhanced surrogate model from trained_model_enhanced.mat (which should be the output of the transfer learning script).
- Sets the bridge parameters (PGA, L, H, T₁, T₂) and constraint limits (allowable moment, allowable stroke, damper force capacity).
- Defines the objective functions and constraints using the surrogate model.
- Runs gamultiobj with a smart initial population that includes both extreme and intermediate points.
- After optimization, analyzes the Pareto front, computes performance metrics, and selects a recommended optimal solution using a normalized distance approach.
- Classifies the Pareto solutions into three design types:
  - High-performance design (minimizes D_max – seismic priority)
  - Economical design (minimizes F_dmax – cost priority)
  - Balanced design (compromise between the two)
- Saves all results in a new folder optimization_results_enhanced/, including:
  - results_enhanced.mat: complete MATLAB data structure.
  - pareto_solutions_enhanced.xlsx: all Pareto-optimal points with their design variables and performance.
  - classified_designs.xlsx: the three representative design points.
  - Figures: optimization_results_enhanced.png and design_classification.png (also as .fig files).
  - plot_data/ subfolder containing the raw data for the figures in CSV format and a MATLAB script redraw_optimization_plots.m to regenerate the plots from the saved data.
trained_model_enhanced.mat
MAT-file containing the enhanced surrogate model (RBFN) after transfer learning. It includes the same fields as trained_model1.mat: net, ps_input, ps_output.
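The normalized distance approach mentioned in the workflow of main_FVD_optimization.m can be pictured as follows: each objective is scaled to [0, 1] over the Pareto front, and the point closest to the ideal corner (0, 0) is recommended. The sketch below is written in Python purely for illustration (the dataset's own script is MATLAB). The front values and names are hypothetical, and treating the recommended point as the balanced design is an assumption rather than a documented rule of the script.

```python
import math

def normalized_distances(front):
    """Distance of each point to the ideal point after scaling every objective
    to [0, 1] over the front (all objectives are minimized)."""
    n_obj = len(front[0])
    lo = [min(p[k] for p in front) for k in range(n_obj)]
    hi = [max(p[k] for p in front) for k in range(n_obj)]
    dists = []
    for p in front:
        total = 0.0
        for k in range(n_obj):
            if hi[k] > lo[k]:  # skip objectives that do not vary over the front
                total += ((p[k] - lo[k]) / (hi[k] - lo[k])) ** 2
        dists.append(math.sqrt(total))
    return dists

# Hypothetical Pareto front as (D_max in m, F_dmax in kN) pairs.
front = [(0.40, 9000.0), (0.50, 6000.0), (0.65, 4500.0), (0.90, 4000.0)]
d = normalized_distances(front)

# Recommended point; used here as the balanced design (an assumption).
recommended = min(range(len(front)), key=lambda i: d[i])
high_performance = min(range(len(front)), key=lambda i: front[i][0])  # smallest D_max
economical = min(range(len(front)), key=lambda i: front[i][1])        # smallest F_dmax
```

Scaling before measuring distance matters because D_max (metres) and F_dmax (kilonewtons) differ by several orders of magnitude; without it the force objective would dominate the choice.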
Folder 5: 5 Sampling
This folder contains MATLAB scripts for two sampling strategies used to generate the calibration points for transfer learning.
sample_20p.m
Implements a three‑stage strategy to select 20 (PGA, C, α) combinations that are most informative for model calibration.
Stages:
- Boundary exploration (6 points): selects extreme points covering all five PGA levels (0.1g, 0.2g, 0.3g, 0.4g, 0.5g) and the corners of the C‑α space.
- Uncertainty sampling (7 points): picks points with the highest predicted model uncertainty, based on the distance to already selected points (the farther, the more uncertain).
- Response diversity sampling (7 points): picks points in regions where the model response is expected to be most sensitive (near a sensitive region defined around C = 12000 kN·(s/m)^α and α = 0.5).
The script ensures that the final set of 20 points has a balanced distribution across the five PGA levels.
Outputs:
- Excel file: Calibration_Points_20_yyyymmdd_HHMMSS.xlsx with columns: Point_ID, PGA_g, C_kN_s_m_alpha, Velocity_Exponent_alpha, Selection_Stage.
- MAT file: Calibration_Points_20_yyyymmdd_HHMMSS.mat containing the same data along with the parameter ranges and selection details.
- A figure showing the selected points in 3D (PGA, C, α) and the PGA distribution.
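The uncertainty sampling stage uses distance to the already selected points as a proxy for model uncertainty. One common way to realize this is greedy farthest-point selection, sketched below in Python purely for illustration (the dataset's own script is MATLAB). The function name and the small grid are hypothetical, the coordinates are assumed to be normalized to [0, 1] beforehand, and the exact rule in sample_20p.m may differ.

```python
import math

def farthest_point_selection(candidates, selected, n_new):
    """Greedily add the candidate whose nearest already selected point is farthest away."""
    if not selected:
        raise ValueError("at least one already selected point is needed")
    chosen = list(selected)
    remaining = [c for c in candidates if c not in chosen]
    picked = []
    for _ in range(min(n_new, len(remaining))):
        best = max(remaining, key=lambda c: min(math.dist(c, s) for s in chosen))
        picked.append(best)
        chosen.append(best)
        remaining.remove(best)
    return picked

# Hypothetical normalized (C, alpha) grid with the four corners already selected.
grid = [(c, a) for c in (0.0, 0.5, 1.0) for a in (0.0, 0.5, 1.0)]
corners = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
new_points = farthest_point_selection(grid, corners, n_new=2)
```

In this example the first point added is the grid center, since it lies farthest from all four corners. Each new point is added to the reference set before the next pick, so later picks avoid clustering around earlier ones.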
sample_200p.m
Generates 200 points in the C–α space using Latin Hypercube Sampling (LHS) with the maximin criterion to achieve good space-filling properties. The PGA is fixed at 0.3g (the design earthquake level) for simplicity, but the user can modify the script to vary PGA if needed.
Workflow:
- If the Statistics and Machine Learning Toolbox is available, it uses lhsdesign; otherwise, a manual LHS implementation is used.
- Scales the normalized samples to the actual ranges: C ∈ [2000, 25000], α ∈ [0.3, 1.0].
- Computes quality metrics: minimum point spacing, correlation between C and α, and marginal uniformity.
- Saves the design points in an Excel file LHS_Samples_N200_yyyymmdd_HHMMSS.xlsx with columns Sample_ID, C, α.
- Also saves a MAT file with the same data plus the quality metrics.
- Produces a 2D scatter plot with marginal histograms to visualize the distribution.
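A manual LHS of the kind mentioned in the workflow can be sketched as follows: each variable's range is divided into N equal strata, one value is drawn per stratum, and the strata are paired at random across variables. The Python sketch below is for illustration only (the dataset's own script is MATLAB). The function names, the seed, and the simple "best of several designs" stand-in for the maximin criterion are assumptions and are not taken from sample_200p.m.

```python
import math
import random

def latin_hypercube(n, bounds, seed=0):
    """Draw n samples so that every one of the n strata of each variable is hit once."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        strata = list(range(n))
        rng.shuffle(strata)  # random pairing of strata across variables
        columns.append([lo + (s + rng.random()) / n * (hi - lo) for s in strata])
    return list(zip(*columns))

def min_spacing(points, bounds):
    """Smallest pairwise distance after scaling each variable to [0, 1]."""
    unit = [[(v - lo) / (hi - lo) for v, (lo, hi) in zip(p, bounds)] for p in points]
    return min(math.dist(a, b) for i, a in enumerate(unit) for b in unit[i + 1:])

def maximin_lhs(n, bounds, tries=5, seed=0):
    """Keep the design with the largest minimum spacing among a few random LHS designs."""
    designs = [latin_hypercube(n, bounds, seed + t) for t in range(tries)]
    return max(designs, key=lambda d: min_spacing(d, bounds))

# Ranges from the description above: C in [2000, 25000], alpha in [0.3, 1.0].
bounds = [(2000.0, 25000.0), (0.3, 1.0)]
samples = maximin_lhs(200, bounds)
```

Each entry of samples is a (C, α) pair. The minimum spacing computed by min_spacing corresponds to the first of the quality metrics listed above.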
Code/Software Requirements
- MATLAB R2022b or later is required to run all scripts. The following toolboxes are necessary:
  - Global Optimization Toolbox, which itself requires the Optimization Toolbox (for gamultiobj)
  - Statistics and Machine Learning Toolbox (for lhsdesign, pdist, randsample, etc.)
  - Deep Learning Toolbox (for neural network functions such as newff, newgrnn, newrb, sim, mapminmax)
  - Parallel Computing Toolbox (optional, can speed up the genetic algorithm).
- SAP2000 version 10 is required to open and run the finite element model files in their native format (.SDB). However, the model definitions are also provided in .s2k (text) and .xls (Excel) formats:
  - The .s2k files can be viewed and edited with any text editor; they can also be imported into SAP2000 or other software that supports the SAP2000 text format.
  - The .xls files can be opened with Microsoft Excel or open‑source spreadsheet applications (e.g., LibreOffice Calc, Google Sheets) to inspect model parameters.
  To perform parametric analyses from MATLAB using the original SAP2000 models, the OAPI must be enabled, and the .SDB files must be used. Basic knowledge of SAP2000 OAPI is assumed for users who wish to generate new data.
- All custom MATLAB functions are included in the scripts; no external libraries or additional toolboxes are needed beyond those listed.
Usage Instructions
- Reproduce the M1 surrogate training:
  - Open MATLAB, navigate to folder "2 Neural network model".
  - Run RBFN.m (or BPNN.m / GRNN.m) to train the network. This will generate trained_model1.mat.
  - The script automatically reads M1-DataSet-in.xlsx and performs normalization and training.
- Generate M2 calibration data:
  - If you have SAP2000 v10, load M2.SDB from folder "1 SAP 2000 FEM model". Use the OAPI interface (a separate MATLAB script is not included; users must write their own or use the provided sampling points to run analyses) to execute parametric runs with the design points generated from sample_20p.m or sample_200p.m.
  - If you do not have SAP2000, you can still inspect the model definition using the provided .s2k or .xls files. To generate new analysis results, you would need to recreate the model in another finite element package that supports the .s2k format or use the exported Excel data as a reference.
  - Collect the results (input variables and output responses) and save them as a MAT file named M2_FEA_results.mat with either:
    - a structure containing fields X (N×7) and Y (N×4), or
    - a matrix of size N×11 where the first 7 columns are inputs and the last 4 are outputs.
  - Place this file in folder "3 Transfered surrogate model".
- Perform transfer learning:
  - Ensure that trained_model1.mat (from step 1) is present in folder "3 Transfered surrogate model" (or copy it there).
  - Run main_transfer_learning_complete.m. This will produce:
    - M2_calibrated_model_complete.mat: the calibrated surrogate model.
    - M2_performance_results.mat: performance metrics comparing the direct M1 model and the calibrated model.
    - M2_calibration_results.png: a scatter plot of predicted vs. true values for the four outputs.
  - The script also displays detailed progress and results in the MATLAB command window.
- Run multi‑objective optimization:
  - Copy the calibrated model (M2_calibrated_model_complete.mat) to folder "4 Hybrid Surrogate optimization Framework" and rename it to trained_model_enhanced.mat (or modify the script to load the correct file).
  - Run main_FVD_optimization.m. The optimization may take several minutes depending on the population size (80) and number of generations (100).
  - All results will be saved in the subfolder optimization_results_enhanced/. Examine the figures and tables to interpret the Pareto front and the recommended design.
- Generate new sampling points (optional):
  - To create a custom set of 20 calibration points, edit sample_20p.m to modify the PGA levels, parameter ranges, or the random seed, then run the script.
  - To generate a 200-point LHS design, run sample_200p.m. The output can be used for uncertainty quantification or as input to the SAP2000 parametric runs.
Access Information
- DOI: 10.5061/dryad.5mkkwh7k9
- Related publication: Han, Z.; She, D.; Liu, J. A Transfer Learning-Based Hybrid Surrogate Modeling Framework for Efficient Multi-Objective Seismic Design of Long-Span Cable-Stayed Bridges. Buildings 2026, 16, 904. https://doi.org/10.3390/buildings16050904
For questions, issues, or requests for collaboration, please contact the corresponding author at hanzf@hfuu.edu.cn.
How to cite this dataset:
Han, Z.; She, D.; Liu, J. (2026). Data from: A transfer learning-based hybrid surrogate modeling framework for efficient multi-objective seismic design of long-span cable-stayed bridges. Dryad Digital Repository. https://doi.org/10.5061/dryad.5mkkwh7k9
