Data from: Fishing ban halts seven decades of biodiversity decline in the Yangtze River
Data files
Feb 04, 2026 version files (305.99 MB):
- 20260131_Yangtze_Fishing_Ban_Raw_data_and_R_code.zip (305.97 MB)
- README.md (18.77 KB)
Abstract
This dataset includes the attachments related to the research conducted by Xiong et al., titled "Fishing ban halts seven decades of biodiversity decline in the Yangtze River". Our study evaluated the effectiveness of the Yangtze fishing ban in improving biodiversity. We detected trends in biological indicators before and after the fishing ban. We also used Generalized Least-Squares-based Structural Equation Models to identify the drivers of these biological trends. The results of the current study bring hope for biodiversity recovery in other large rivers globally.
Dataset DOI: 10.5061/dryad.wdbrv1635
Description of the data and file structure
This dataset compiles the raw data and R analysis scripts related to a research project on the Yangtze River fishing ban.
The project integrates raw field data with a suite of advanced statistical analyses and visualizations in R.
It is structured into 11 thematic modules, each addressing a specific research question, from basic community metrics (diversity, size structure) to advanced topics such as functional traits, spatial distribution, rare species conservation, and causal modeling.
It constitutes a finalized and cohesive analytical pipeline, designed to ensure full reproducibility and readiness for publication.
All data and scripts are organized by research theme into main folders, facilitating the reproduction of study results and further investigation.
Files and variables
Overall data set file: 20260131_Yangtze_Fishing_Ban_Raw_data_and_R_code.zip
Note:
• .RData = workspace image (stores R objects)
• .RDataTmp = temporary backup created during auto-saving (can be ignored)
• .Rhistory = command history (stores previously run code)
Folder1: Raw data
Description:
Name: Raw data.xlsx
Raw data.xlsx contains the integrated raw field and laboratory data from a comprehensive ecological survey.
This is primary observed and measured data, not an output from analytical scripts.
The dataset includes the following sheets:
Sheet name: Sampling location
It includes Site ID, Longitude, and Latitude for each site.
Sheet name: Abundance (ind)
It includes Site ID, Netting area, Time (YYYYMM, the same as below), and abundance of each of the fish species collected from each site.
Sheet name: Biomass (g)
It includes Site ID, Netting area, Time, and biomass of each of the fish species collected from each site.
Sheet name: Fish Body Characteristics Index
It includes Time, Species name, Weight, Number, Mean weight, and Total length of the fish collected. Number is the number of individuals in a record. When Number is not 1, only the total weight of the batch was recorded and no individual total length is available, because only a subset of the collected specimens was measured individually.
Sheet name: Average Fulton Index
It includes Site ID, Time, and Fulton index for fish species collected.
The Fulton index is a condition factor calculated as K = (W / L³) × 100, where W is weight and L is length.
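The formula above can be sketched in a few lines of Python. This is only an illustration, not the authors' R code. It assumes weight in grams and length in centimeters, which is the usual convention for Fulton's K; the README does not state the units used for this sheet. The fish values and the function name `fulton_k` are hypothetical.

```python
def fulton_k(weight_g: float, total_length_cm: float) -> float:
    """Fulton's condition factor K = (W / L^3) * 100.

    Assumes W in grams and L in centimeters (an assumption, not stated
    in the README for this sheet).
    """
    if total_length_cm <= 0:
        raise ValueError("total length must be positive")
    return weight_g / total_length_cm ** 3 * 100


# Hypothetical fish: 250 g and 280 mm total length (invented values).
length_mm = 280.0
k = fulton_k(250.0, length_mm / 10.0)  # convert mm to cm before cubing
print(round(k, 3))
```

Lengths recorded in millimeters, as in Mountain Peak Map.csv, would need the conversion shown before applying the formula under this unit assumption.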
Sheet name: Porpoise and Chinese sturgeon
Records for the presence/abundance of key protected species, i.e., the Yangtze Finless Porpoise and Chinese sturgeon, in different years.
Sheet name: Functional Traits Comparison
It includes Name (fish species), Code (fish species abbreviation), and functional traits (i.e., Population, History strategy, Maximum body length, Diet, Migration type, and Lifespan).
Sheet name: Environmental factors
It includes Site ID, Time, and the following parameters.
Water quality parameters:
- WT or water temperature, °C
- TUR or turbidity, NTU
- Cond or conductivity, μS/cm
- DO or dissolved oxygen, mg/L
- pH, pH unit
- SD or Secchi disk visibility, cm
- Cl or chloride, mg/L
- TA or total hardness, mg/L
- TP or total phosphorus, mg/L,
- PO4 or phosphate, mg/L
- TN or total nitrogen, mg/L
- NH4 or ammonia, mg/L
- NO3 or nitrate, mg/L
- NO2 or nitrite, mg/L
- Chla or chlorophyll a, μg/L
- TSS or total suspended solids, mg/L
Land use parameters:
- Water, %
- Forest, %
- Wetland, %
- Cropland, %
- Urban, %
- Bareland, %
- Clouds, %
- Grassland, %
Hydrology parameters:
- WL or water level, m
- Flow, m³/s
- MR or monthly runoff, m³
Climate parameters:
- MACP or monthly average cumulative precipitation, cm
- MAP or monthly average precipitation, cm
- MMAT or monthly maximum air temperature, °C
- MMIT or monthly minimum air temperature, °C
- MAT or monthly average air temperature, °C
Human activity parameters:
- Navigation, number of shipping vessels
- Fishing, number of fishing boats
- Shoreline modification, %
Folder2: Analysis of diversity and body length
Description:
This folder contains the complete R project workspace, raw data, processed datasets, analysis scripts, and generated results for the analysis of fish community diversity and body size metrics (length/weight).
The structure includes both input data and the outputs of statistical tests and visualizations, allowing for the full replication of the analytical workflow from raw data to final figures and statistical summaries.
Data.csv
Name: Abundance data.csv
Abundance data.csv contains a processed measure of species density, standardized to individuals per 10 square meters (ind/10m²), rather than raw counts. This standardization allows direct comparison of species abundance across sampling locations and sampling efforts.
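As a rough illustration of this standardization, the Python sketch below converts raw counts to ind/10m² using the netting area. It assumes the netting area is expressed in square meters, which the README does not state explicitly, and it is not the authors' processing script. The species codes, counts, and area are invented.

```python
def density_per_10m2(count: float, netting_area_m2: float) -> float:
    """Convert a raw count to individuals per 10 square meters."""
    if netting_area_m2 <= 0:
        raise ValueError("netting area must be positive")
    return count / netting_area_m2 * 10.0


# Hypothetical haul: species codes and counts are made up for illustration.
raw_counts = {"SP1": 12, "SP2": 3}
netting_area_m2 = 40.0  # assumed unit: square meters

densities = {
    species: density_per_10m2(n, netting_area_m2)
    for species, n in raw_counts.items()
}
print(densities)
```

The same conversion applied to weights in grams would give biomass in g/10m².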
Name: Diversity.csv
- group1: Treatment or temporal state, indicating whether sampling occurred before ("Before") or after ("After") the implementation of an ecological intervention.
- group2: Survey year, indicating the calendar year in which the sampling was conducted (e.g., ranging from 2018 to 2023).
- group3: Seasonal or hydrological period (e.g., "Wet" for the wet season, "Dry" for the dry season).
- group4: Spatial zone within the study area, identifying the specific river reach (e.g., "Middle" for mid-reach, "Lower" for downstream reach).
- group5: Specific measure group under the before or after intervention conditions, representing finer sub-categorization based on the type or intensity of the applied measures.
- Richness: Species richness, representing the number of distinct fish species observed or estimated in the sample.
- Evenness: Species evenness index, quantifying the distribution of individual abundances among the species present in the community (e.g., Pielou's evenness).
- Total_fish_abundance: Total fish abundance, expressed as the number of individuals per 10 square meters (ind/10m²).
- Total_fish_biomass: Total fish biomass, representing the combined wet weight (g/10m²) of all fish collected in the sample.
- Large_fish_abundance: Abundance of large fish, expressed as the number of large individuals per 10 square meters (ind/10m²), based on a defined size threshold.
- Small_fish_abundance: Abundance of small fish, expressed as the number of small individuals per 10 square meters (ind/10m²), based on the same size threshold.
- Large_fish_biomass: Biomass of large fish, representing the total wet weight (g/10m²) contributed by all large fish in the sample.
- Small_fish_biomass: Biomass of small fish, representing the total wet weight (g/10m²) contributed by all small fish in the sample.
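For readers unfamiliar with the Richness and Evenness columns, the following Python sketch shows one common way to compute them from a vector of species abundances. It assumes Pielou's evenness (J' = H' / ln S, with H' the Shannon index), which the README mentions only as an example, so the authors' R scripts may differ in detail. The sample values are invented.

```python
import math


def richness(abundances):
    """Number of species with non-zero abundance."""
    return sum(1 for a in abundances if a > 0)


def pielou_evenness(abundances):
    """Pielou's J' = H' / ln(S), where H' is the Shannon index."""
    present = [a for a in abundances if a > 0]
    s = len(present)
    if s < 2:
        return float("nan")  # undefined for fewer than two species
    total = sum(present)
    shannon = -sum((a / total) * math.log(a / total) for a in present)
    return shannon / math.log(s)


# Hypothetical sample: three equally abundant species and one absent species.
sample = [5.0, 5.0, 5.0, 0.0]
print(richness(sample), round(pielou_evenness(sample), 6))
```

A perfectly even community gives J' = 1, and values closer to 0 indicate dominance by a few species.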
Name: group.csv
- group1: Survey year, indicating the calendar year in which the sampling was conducted (e.g., ranging from 2018 to 2023).
- group2: Seasonal or hydrological period (e.g., "Wet" for the wet season, "Dry" for the dry season).
Name: Mountain Peak Map.csv
- Time: Survey year, indicating the calendar year in which the sampling was conducted (e.g., ranging from 2018 to 2023).
- SP: Abbreviation or code representing the fish species
- Weight: Weight (g) of the recorded group or sample
- Total: Total length (mm)
R File (.R)
Name: Box plots
Diversity, abundance, and biomass index comparison before and after fishing ban
Name: Diversity box chart - Yearly changes
Comparison of diversity, abundance and biomass indices over the years
Name: Mountain Peak Map
Graph showing the body length of fish species
Name: Wilcoxon_cohen_results
Sensitivity analysis (tests whether the results differ significantly when data from different combinations of years are used)
Folder3: β diversity
Description:
This folder contains all data, scripts, and results specifically related to the analysis of beta diversity (β-diversity).
The workflow encapsulated here typically involves calculating dissimilarity indices (such as Bray-Curtis or Jaccard), visualizing compositional differences, and statistically testing whether compositional differences exist between predefined groups.
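The two dissimilarity indices named above can be sketched as follows. This Python example is only an illustration of the standard definitions, not the authors' R implementation, and the two abundance vectors are invented. Bray-Curtis uses abundances, while Jaccard here uses presence/absence only.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x_i - y_i| / sum(x_i + y_i)."""
    if len(x) != len(y):
        raise ValueError("vectors must list the same species")
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator if denominator else 0.0


def jaccard(x, y):
    """Jaccard dissimilarity on presence/absence: 1 - shared / union."""
    if len(x) != len(y):
        raise ValueError("vectors must list the same species")
    in_x = {i for i, v in enumerate(x) if v > 0}
    in_y = {i for i, v in enumerate(y) if v > 0}
    union = in_x | in_y
    return 1 - len(in_x & in_y) / len(union) if union else 0.0


# Hypothetical densities (ind/10m²) for three species in two samples.
sample_a = [10.0, 0.0, 5.0]
sample_b = [6.0, 4.0, 5.0]
print(round(bray_curtis(sample_a, sample_b), 4))
print(round(jaccard(sample_a, sample_b), 4))
```

Both indices range from 0 (identical composition) to 1 (no shared species).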
Data.csv
Name: Abundance data.csv
Abundance data.csv contains a processed measure of species density, standardized to individuals per 10 square meters (ind/10m²), rather than raw counts. This standardization allows direct comparison of species abundance across sampling locations and sampling efforts.
The full species names can be found in the Raw data workbook (sheets "Abundance (ind)" and "Biomass (g)").
Name: Diversity.csv
- group1: Treatment or temporal state, indicating whether sampling occurred before ("Before") or after ("After") the implementation of an ecological intervention.
- group2: Survey year, indicating the calendar year in which the sampling was conducted (e.g., ranging from 2018 to 2023).
- group3: Seasonal or hydrological period (e.g., "Wet" for the wet season, "Dry" for the dry season).
- group4: Spatial zone within the study area, identifying the specific river reach (e.g., "Middle" for mid-reach, "Lower" for downstream reach).
- group5: Specific measure group under the before or after intervention conditions, representing finer sub-categorization based on the type or intensity of the applied measures.
- Richness: Species richness, representing the number of distinct fish species observed or estimated in the sample.
- Evenness: Species evenness index, quantifying the distribution of individual abundances among the species present in the community.
- Total_fish_abundance: Total fish abundance, expressed as the number of individuals per 10 square meters (ind/10m²).
- Total_fish_biomass: Total fish biomass, representing the combined wet weight (g/10m²) of all fish collected in the sample.
- Large_fish_abundance: Abundance of large fish, expressed as the number of large individuals per 10 square meters (ind/10m²), based on a defined size threshold.
- Small_fish_abundance: Abundance of small fish, expressed as the number of small individuals per 10 square meters (ind/10m²), based on the same size threshold.
- Large_fish_biomass: Biomass of large fish, representing the total wet weight (g/10m²) contributed by all large fish in the sample.
- Small_fish_biomass: Biomass of small fish, representing the total wet weight (g/10m²) contributed by all small fish in the sample.
R File (.R)
Name: β-diversity calculation
Calculates the changes in β diversity before and after the fishing ban and plots them
Name: β-diversity comparison Wilcoxon
Sensitivity test of the β-diversity index (under different year combinations)
Folder4: Analysis of fish fat content
Description:
Contains R workspace files (.RData, .RDataTmp, .Rhistory) and data files for analyzing fish condition factor (S Condition_factor) and fish body fat content (F Fish fat content).
This folder supports the study of fish physiological status and energy reserves.
Data.csv
Name: Condition_factor
The annual variations of Fulton's condition factor for fish species
R File (.R)
Name: Fish fat content
A box plot showing the annual changes of Fulton's condition factor for fish species
Folder5: Analysis of rare species
Description:
Contains R workspace files and analysis scripts focused on rare and endangered species in the Yangtze River, particularly the Yangtze finless porpoise and Chinese sturgeon.
The folder includes data and scripts for analyzing annual population changes and the conservation status of these species.
Data.csv
Name: Rare_species
- The number of rare species
- group1: Treatment or temporal state, indicating whether sampling occurred before ("Before") or after ("After") the implementation of an ecological intervention.
- group2: Survey year, indicating the calendar year in which the sampling was conducted (e.g., ranging from 2018 to 2023).
- Species include Acipenser dabryanus, Myxocyprinus asiaticus, and Ochetobius elongatus.
Name: Yangtze_finless_porpoise_and_Chinese_sturgeon
The number of Yangtze finless porpoises and Chinese sturgeons
Sheet1: Yangtze finless porpoises (ind)
Sheet2: Chinese sturgeon harvested (ind)
R File (.R)
Name: Annual Changes of Rare Species
Plots the annual changes in the numbers of rare species
Folder6: Dominant species of a community
Description:
Contains R workspace files and scripts for analyzing and visualizing dominant species in the fish community, using the Index of Relative Importance (IRI).
The folder includes scripts for calculating IRI values and generating heatmaps to display community dominance patterns.
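The README does not give the IRI formula used in the scripts. A widely used formulation is IRI = (%N + %W) × %F, where %N and %W are a species' percentage share of total number and total weight and %F is its percentage frequency of occurrence across samples. The Python sketch below assumes that formulation, so it may not match the authors' R code. The hauls, species codes, and the function name `iri_table` are hypothetical.

```python
def iri_table(hauls):
    """Compute IRI = (%N + %W) * %F per species.

    `hauls` is a list of dicts mapping species code -> (count, weight_g).
    The formula is the commonly used one and is an assumption here.
    """
    total_n = sum(count for haul in hauls for count, _ in haul.values())
    total_w = sum(weight for haul in hauls for _, weight in haul.values())
    species = {code for haul in hauls for code in haul}
    result = {}
    for code in species:
        n = sum(haul[code][0] for haul in hauls if code in haul)
        w = sum(haul[code][1] for haul in hauls if code in haul)
        occurrences = sum(1 for haul in hauls if code in haul and haul[code][0] > 0)
        n_pct = 100.0 * n / total_n
        w_pct = 100.0 * w / total_w
        f_pct = 100.0 * occurrences / len(hauls)
        result[code] = (n_pct + w_pct) * f_pct
    return result


# Two hypothetical hauls with invented counts and weights.
hauls = [
    {"SP1": (8, 400.0), "SP2": (2, 600.0)},
    {"SP1": (10, 500.0)},
]
iri = iri_table(hauls)
print(sorted(iri.items()))
```

In this example SP1 dominates because it is both numerous and present in every haul, while SP2 contributes weight but occurs in only half of the hauls.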
Data.csv
Name: Heat map of community dominance
Year-to-year changes in the main dominant species of the community
R File (.R)
Name: Calculate and draw a heatmap for the dominant species of the community (IRI)
Calculates IRI values and draws a heatmap of the community's dominant species
Folder7: Environmental factor box plot
Description:
Contains R workspace files and scripts for analyzing environmental factors associated with the sampling data.
The folder includes the dataset of environmental variables and scripts to generate box plots for visualizing the distribution of these factors across different groups or conditions.
Data.csv
Name: Environmental factor
Includes data grouping, diversity, abundance, and biomass data, and environmental data (such as temperature, total nitrogen, total phosphorus, etc.; see Table S1)
R File (.R)
Name: Environmental box plot
Draw the annual change charts for each environmental factor
Folder8: River and sea migration species map
Description:
Contains R workspace files and scripts for creating spatial distribution maps of migratory fish species in the Yangtze River.
The folder includes geographical coordinate data and scripts to visualize species abundance in relation to longitude and latitude.
Data.csv
Name: Sampling longitude and latitude of the Yangtze River
It includes groups based on time period and river reach, the longitude and latitude of each site, and the biomass and abundance of four migratory species.
Name: The longitude and latitude of the Yangtze River
The latitude and longitude coordinates of the Yangtze River
R File (.R)
Name: Project the species abundance on the latitude and longitude coordinates
Display the locations where different species were collected on the map
Folder9: Shape stacked column chart (abundance and weight)
Description:
Contains R workspace files and scripts for creating stacked column charts to visualize the composition of fish functional traits.
This analysis compares trait distributions based on two different metrics: species abundance (number of individuals) and weight (biomass), highlighting how the functional structure of the community may differ when measured by count versus mass.
Data.csv
Name: Abundance-based traits
Functional trait composition based on abundance data
Columns: Code (species abbreviation), Population resilience, History strategy, Maximum body length, Diet, Migration type, Lifespan, and the different years (2018 - 2023)
Name: Weight-based traits
Functional trait composition based on weight data
Columns: Code (species abbreviation), Population resilience, History strategy, Maximum body length, Diet, Migration type, Lifespan, and the different years (2018 - 2023)
R File (.R)
Name: Stacked column chart
Draw stacked column charts of functional traits based on abundance and biomass data for different years
Folder10: STAMP species difference analysis
Description:
Contains R workspace files and data for performing STAMP analysis.
This method is used to identify statistically significant differences in species (or genus) composition between two or more groups of samples (e.g., before vs. after fishing ban, different river sections).
The analysis is conducted separately using relative abundance data (based on individual counts) and relative weight/biomass data.
Data.csv
Name: abundance
- Abundance-based species quantity table
- Sheet1 = Species abundance data
- Sheet2 = Data grouping
Name: weight
- Weight-based species quantity table
- Sheet1 = Species weight data
- Sheet2 = Data grouping
R File (.R)
Name: STAMP Analysis
Identifies the species whose abundance or biomass changed significantly between the periods before and after the fishing ban.
Folder11: Structural equation
Description:
This folder contains the complete workflow for constructing piecewise Structural Equation Models (piecewiseSEM) to investigate the causal drivers of fish community changes following the fishing ban. The analysis operates at two levels:
1. Overall community level: Modeling pathways from environmental/human drivers to total abundance/biomass and diversity indices (Richness, Evenness, β).
2. Size-based functional group level: Separately modeling the responses of large fish and small fish (in terms of abundance and biomass) to explore size-dependent mechanisms.
The models use Generalized Least Squares (GLS) regressions to account for potential spatial autocorrelation or heteroscedasticity in the data.
Name: subfolder1: Structural Equation Model
Data.csv
Name: Structural equation
Data grouping information / Diversity, abundance, biomass index / Numerous environmental factors
R File (.R)
Name: Evenness GLS: Species evenness
Name: Richness GLS: Species richness
Name: Total fish abundance GLS: Total fish count (ind/10m²)
Name: Total fish biomass GLS: Total fish biomass (g/10m²)
Name: Large fish abundance GLS: Abundance of large fish (ind/10m²)
Name: Large fish biomass GLS: Biomass of large fish (g/10m²)
Name: Small fish abundance GLS: Abundance of small fish (ind/10m²)
Name: Small fish biomass GLS: Biomass of small fish (g/10m²)
The GLS approach is specifically used to account for potential violations of independence in the data (e.g., spatial autocorrelation or temporal pseudoreplication) by specifying appropriate correlation and variance structures.
The script Structural equation is the core piecewise SEM script. It integrates the outputs from all the individual GLS models specified above to test the overall causal network, evaluate direct and indirect effects among variables, and assess the model's goodness-of-fit.
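To make the idea of direct and indirect effects concrete, the Python sketch below applies the standard path-analysis rule: an indirect effect is the product of the standardized coefficients along a path, and the total effect is the direct effect plus all indirect effects. The variable names and coefficient values are invented for illustration. They are not results from the study, and the sketch does not reproduce the piecewiseSEM workflow itself.

```python
from math import prod


def indirect_effect(path_coefficients):
    """Product of standardized coefficients along one causal path."""
    return prod(path_coefficients)


def total_effect(direct, indirect_paths):
    """Direct effect plus the sum of all indirect path effects."""
    return direct + sum(indirect_effect(path) for path in indirect_paths)


# Hypothetical standardized coefficients (not values from the study):
fishing_to_richness = -0.30          # direct path
fishing_to_biomass = -0.40           # first leg of the indirect path
biomass_to_richness = 0.50           # second leg of the indirect path

indirect = indirect_effect([fishing_to_biomass, biomass_to_richness])
total = total_effect(fishing_to_richness,
                     [[fishing_to_biomass, biomass_to_richness]])
print(round(indirect, 2), round(total, 2))
```

With several mediating variables, each additional path is added as another list of coefficients in `indirect_paths`.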
Name: subfolder2: β diversity
Data.csv
Name: Abundance data
Data grouping and species abundance data
Name: Structural equation
Data grouping information / Diversity, abundance, biomass index / Numerous environmental factors.
R File (.R)
Name: β GLS
The code for drawing the structural equation model is the same as that for other diversity indices.
Code/software
- The data files (XLSX/XLS) can be opened in Microsoft Excel.
- The code files (.R) can be viewed and run in R (version 4.5.1).
