Training dataset for Nile delta shoreline change prediction until 2050 using machine learning
Data files
Sep 09, 2025 version files (1.44 MB total)
- datalong_cumu_NSM_new_inc_ang_latlong_no2022.csv (1.43 MB)
- README.md (9.34 KB)
Abstract
This dataset contains shoreline position and environmental forcing data used to train, validate, and apply a shallow artificial neural network to forecast shoreline change in the Nile Delta. Shoreline positions were digitized from Landsat imagery (1992–2017) at 125-meter transect spacing across six geomorphologically distinct shoreline segments. Environmental variables include wave direction, wave period, swell height (ERA5), sea level rise (CMIP6/IPCC AR6), land subsidence, and land cover change (ESA CCI, ArcGIS Living Atlas). The 1992–2017 data were used for model training and the 2022 data for independent validation; the trained model was then applied to forecast shoreline positions for 2030, 2040, and 2050. Feature selection using Spearman correlation and permutation importance identified the most influential predictors. This dataset supports the reproduction of the results reported in "Forecasting Nile Delta Shoreline Change Until 2050 Using a Shallow Neural Network" and may also be useful for other coastal forecasting and risk assessment studies.
Dataset DOI: 10.5061/dryad.wh70rxx20
Description of the data and file structure
These data describe the Nile Delta shoreline. The satellite images were collected from the USGS Landsat database. Wave direction, wave period, and swell height were obtained from the ERA5 reanalysis archive, which provides historical climate information at a monthly resolution across multiple. Land cover data from 1992 to 2019 were derived from the ESA Climate Change Initiative and were extended to the year 2050 using projected layers from ArcGIS Living Atlas. Sea level rise and land subsidence data were sourced from the CMIP6 ensemble projections and the IPCC Sixth Assessment Report.
Files and variables
File: datalong_cumu_NSM_new_inc_ang_latlong_no2022.csv
Description: Model Training Variables
Variables
| Feature | Unit | Description |
|---|---|---|
| TransID | meters | Unique spatial identifier of each transect. Multiplied by the transect spacing (125 m), it gives the distance along the shoreline from the start point at East Alexandria, at the western end of the delta (e.g., TransID 5 means 5 × 125 meters from the start point). The same TransID denotes the same fixed location at different times. |
| Segment | unitless | A categorical variable representing one of the six shoreline segments, providing geographical information for similarly correlated transects. |
| Period | Year | This is the end year of any specific time interval for each set of data, representing the temporal changes over the same transects. |
| Azimuth | degrees | Represents the angular measurement of the shoreline's orientation from north at each transect. |
| epr_corr | unitless | Correlation of the End Point Rate (EPR) between adjacent transects, included to highlight spatial dependencies. High values mean adjacent transects have similar EPRs; low values mean an abrupt change or switch in the EPR. |
| wv_dir | degrees | Wave Direction represents the mean direction of ocean surface waves, accounting for both wind-sea and swell waves given in degrees true, indicating where the waves are coming from (e.g., 0 degrees from the north, 90 degrees from the east) at each transect, which can relate to longshore transport. |
| wv_per | seconds | Wave Period represents the time between two successive waves for both locally generated wind-sea waves and distant swell waves, which can influence the number of sediment transport events. |
| sw_height | meters | Swell Height represents the average height of surface ocean waves, including both wind-sea waves and swell waves, collected every other month in each given year. Wind-sea waves are directly influenced by local winds, whereas swell waves are generated by winds at distant locations and times. It indicates the vertical distance between wave crests and troughs. |
| sl_rise | millimeters | Sea level rise difference value at each year and location relative to the 1992 sea level. |
| subsidence | millimeters | Subsidence difference value at each year and location relative to the 1992 sea level. |
| cropland | km² | Cropland area change within the delta. |
| nat_veg | km² | Natural vegetation area change within the delta. |
| urban | km² | Urban area change within the delta. |
| bare | km² | Bare area change within the delta. |
| lat | degrees | Latitude value of the data point at the shoreline. |
| long | degrees | Longitude value of the data point at the shoreline. |
| cumulative_NSM | meters | Cumulative Net Shoreline Movement (NSM): the cumulative net change in shoreline position relative to the 1992 shoreline. |
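
As a quick check on the table above, the following is a minimal Python sketch (standard library only) that verifies a CSV header against the listed feature names and converts a TransID into an alongshore distance using the 125 m spacing. It assumes that the CSV column headers match the feature names in the table exactly and that TransID is the multiplier shown in the table's example; both are assumptions, not confirmed details of the file. The function names and the two sample rows are hypothetical and made up for illustration only.

```python
import csv
import io

TRANSECT_SPACING_M = 125  # transect spacing stated in the TransID description

# Assumption: the CSV header uses the feature names from the variables table.
EXPECTED_COLUMNS = [
    "TransID", "Segment", "Period", "Azimuth", "epr_corr", "wv_dir",
    "wv_per", "sw_height", "sl_rise", "subsidence", "cropland",
    "nat_veg", "urban", "bare", "lat", "long", "cumulative_NSM",
]


def missing_columns(header):
    """Return the expected column names that are absent from a CSV header."""
    present = set(header)
    return [name for name in EXPECTED_COLUMNS if name not in present]


def alongshore_distance_m(trans_id):
    """Distance from the start point in meters, computed as TransID * 125."""
    return float(trans_id) * TRANSECT_SPACING_M


# Synthetic two-row sample (made-up values, NOT taken from the dataset).
sample = io.StringIO(
    "TransID,Period,cumulative_NSM\n"
    "5,1997,-12.5\n"
    "6,1997,3.0\n"
)
reader = csv.DictReader(sample)
rows = list(reader)

print("Missing columns:", missing_columns(reader.fieldnames))
print("Distances (m):", [alongshore_distance_m(r["TransID"]) for r in rows])
```

For the real file, the same two functions can be applied to a `csv.DictReader` opened on `datalong_cumu_NSM_new_inc_ang_latlong_no2022.csv`; an empty list from `missing_columns` would indicate that the header matches the table.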
Code/software
"datalong_cumu_NSM_new_inc_ang_latlong_no2022.csv" is a plain CSV file and can be opened with Google Sheets or any other spreadsheet or CSV-aware software.
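
The file can also be read programmatically. The sketch below, in Python with the standard library only, reads two numeric columns and computes their Spearman rank correlation, the statistic the abstract names as one of the feature selection tools. It is an illustration under stated assumptions and not the authors' code: the column names are assumed to match the variables table, the helper names (`load_pair`, `average_ranks`, `spearman`) are hypothetical, and the demonstration values are made up rather than taken from the dataset.

```python
import csv
import math


def load_pair(path, x_name, y_name):
    """Read two numeric columns from a CSV file, skipping rows with blanks."""
    xs, ys = [], []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            if row[x_name] != "" and row[y_name] != "":
                xs.append(float(row[x_name]))
                ys.append(float(row[y_name]))
    return xs, ys


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(vx * vy)


# Made-up demonstration values (NOT taken from the dataset).
demo_sl_rise = [0.0, 12.0, 25.0, 40.0, 61.0]
demo_nsm = [0.0, -3.5, -9.0, -8.0, -20.0]
print("Spearman (demo values):", round(spearman(demo_sl_rise, demo_nsm), 3))
```

With the real file, a call such as `spearman(*load_pair("datalong_cumu_NSM_new_inc_ang_latlong_no2022.csv", "sl_rise", "cumulative_NSM"))` would give the rank correlation between one predictor and the target, provided the header names match. Permutation importance, the other tool named in the abstract, needs a trained model and is not covered by this sketch.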
Access information
Other publicly accessible locations of the data:
• USGS Landsat database
• IPCC Sixth Assessment Report
• CMIP6 ensemble projections
• ECMWF ERA5 reanalysis archive
Data was derived from the following sources:
- USGS Landsat database, IPCC Sixth Assessment Report, CMIP6 ensemble projections, ECMWF ERA5 reanalysis archive.
