Data from: Seasonal differences in fish community structure in the upper Yangtze river based on eDNA metabarcoding: A multi-dimensional analysis
Data files
Oct 13, 2025 version files 535.49 KB
- Calculation_Data(FD).xlsx (57.22 KB)
- Calculation_Data(FGD).xlsx (14.18 KB)
- Calculation_Data(SPD).xlsx (46.12 KB)
- Calculation_Data(TD___PD).xlsx (46.99 KB)
- Data_Figure_2.xlsx (14.73 KB)
- Data_Figure_3.xlsx (10.03 KB)
- Data_Figure_4.xlsx (19.27 KB)
- Data_Figure_5_a.xlsx (9.83 KB)
- Data_Figure_5_b.xlsx (10.02 KB)
- Data_Figure_6.xlsx (17.07 KB)
- Data_Figure_7_a.xlsx (44.92 KB)
- Data_Figure_7_b.xlsx (12.86 KB)
- Data_Figure_7_group.xlsx (9.98 KB)
- Data_Figure_8.xlsx (20.10 KB)
- Data_Figure_9_a_environment.xlsx (11.94 KB)
- Data_Figure_9_a.xlsx (45.02 KB)
- Data_Figure_9_b.xlsx (13.13 KB)
- DCA_Figure_9_a.xlsx (9.24 KB)
- DCA_Figure_9_b.xlsx (9.29 KB)
- DCA.R (376 B)
- Function_Diversity_Calculation.R (653 B)
- OTU.xlsx (48.74 KB)
- R_Code_Figure_2.R (961 B)
- R_Code_Figure_3.R (553 B)
- R_Code_Figure_4.R (1.36 KB)
- R_Code_Figure_5_a.R (557 B)
- R_Code_Figure_5_b.R (560 B)
- R_Code_Figure_6.R (1.27 KB)
- R_Code_Figure_7.R (3.13 KB)
- R_Code_Figure_8.R (4.54 KB)
- R_Code_Figure_9_a.R (4.43 KB)
- R_Code_Figure_9_b.R (4.38 KB)
- RDA_Figure_9_a.xlsx (10.69 KB)
- RDA_Figure_9_b.xlsx (10.76 KB)
- README.md (20.01 KB)
- α_Diversity_Calculation.R (570 B)
Abstract
This study used environmental DNA (eDNA) metabarcoding to monitor fish communities at 14 sites along the upper Yangtze River across four seasons, aiming to understand seasonal variations in community structure. It assesses the current status of fish community structure in the region by analyzing the composition of fish species and functional groups, applying 16 multidimensional diversity indices, and examining the relationships between fish communities and environmental variables. A total of 120 fish species were detected. The communities predominantly consisted of species that produce sticky eggs, are sedentary, and have omnivorous diets. However, the relative abundance of species that produce drifting eggs, are migratory, or are carnivorous was higher in autumn and winter than in spring and summer. Except for the functional evenness index (FEve), all functional diversity indices, as well as the α diversity, taxonomic diversity, and phylogenetic diversity indices, were higher in spring and summer than in autumn and winter. This suggests that fish diversity is greater in spring and summer; however, ecological niche overlap is more pronounced during these seasons. The diversity indices were highly correlated with one another. This study provides practical experience for fish monitoring based on eDNA metabarcoding technology.
https://doi.org/10.5061/dryad.rn8pk0pnw
Description of the data and file structure
Dataset description
This dataset includes fish environmental DNA data collected in the upper Yangtze River from 2021 to 2022 across all four seasons, as well as intermediate data and some analysis code used in the article.
File descriptions
File: OTU.xlsx
Format: .xlsx
Description: This file contains the fundamental data for this study, supporting all analyses presented in the article.
- Order: The taxonomic order of the species.
- Family: The taxonomic family of the species.
- Genus: The taxonomic genus of the species.
- Species: The full scientific name (binomial) of the species.
- Sample name columns (e.g., XZ1, JA1, ..., ZG4): The remaining columns in this table are named after individual environmental DNA (eDNA) samples. The value in each species row typically represents the number of sequence reads assigned to that species in the corresponding sample.
File: Data_Figure_2.xlsx
Format: .xlsx
Description: Data used to create Figure 2: Venn diagram for the four seasons.
- Species: Scientific name of the species unit.
- Spring: Total sequence read abundance for spring.
- Summer: Total sequence read abundance for summer.
- Autumn: Total sequence read abundance for autumn.
- Winter: Total sequence read abundance for winter.
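A table of per-season read totals like this one can be turned into the sets a Venn diagram compares. The following is a minimal Python sketch, not the authors' R script; the species names and read counts are hypothetical placeholders, and treating any count above zero as a detection is an assumption.

```python
# Hypothetical per-season read totals keyed by species (not values from the dataset).
season_reads = {
    "Spring": {"Species A": 120, "Species B": 0, "Species C": 35},
    "Summer": {"Species A": 80, "Species B": 12, "Species C": 0},
    "Autumn": {"Species A": 15, "Species B": 0, "Species C": 0},
    "Winter": {"Species A": 9, "Species B": 3, "Species C": 0},
}


def detected_species(reads):
    """Return the set of species with at least one read (assumed detection rule)."""
    return {name for name, count in reads.items() if count > 0}


# One set of detected species per season.
season_sets = {season: detected_species(reads) for season, reads in season_reads.items()}

# Species detected in every season (the central region of a four-set Venn diagram).
shared_by_all = set.intersection(*season_sets.values())

# Species detected in spring but in no other season.
other_seasons = set.union(*(s for name, s in season_sets.items() if name != "Spring"))
spring_only = season_sets["Spring"] - other_seasons
```

The same set operations extend to any other region of the diagram, such as species shared by exactly two seasons.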
File: Data_Figure_3.xlsx
Format: .xlsx
Description: Data used to create Figure 3: Stacked bar chart showing the relative sequence abundance of the top 10 species across four seasons.
- Species: The scientific name of the species unit.
- Group: The seasonal group to which the abundance value belongs.
- Abundance: The sequence read abundance value for the corresponding species in the specified season.
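As an illustration of how read counts might be converted to the relative abundances shown in a top-10 stacked bar chart, here is a short Python sketch. It is not the authors' R code. The function name `top_n_relative` and the example counts are hypothetical, and dividing by the season's total reads across all species (rather than only the top species) is an assumption.

```python
def top_n_relative(reads, n=10):
    """Return the n most abundant species with their share of all reads.

    reads: dict mapping species name to read count for one season.
    """
    total = sum(reads.values())
    if total == 0:
        return {}
    # Rank species by read count, highest first, and keep the first n.
    ranked = sorted(reads.items(), key=lambda item: item[1], reverse=True)[:n]
    return {name: count / total for name, count in ranked}


# Hypothetical counts for one season.
example_reads = {"Species A": 50, "Species B": 30, "Species C": 20}
top_two = top_n_relative(example_reads, n=2)
```

Under this convention the shares of the retained species sum to less than 1 whenever species outside the top n have reads.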
File: Data_Figure_4.xlsx
Format: .xlsx
Description: Data used to create Figure 4: Clustering tree of fish functional groups. The first worksheet "traits" contains the functional traits of each fish species, and the second worksheet contains the grouping of functional groups with 80% similarity.
- Species: The scientific name of the species unit.
- Egg type: The reproductive strategy based on egg type and properties.
- Migration type: The migratory behavior of the species.
- Diet: The primary feeding habit and trophic level.
- Vertical distribution: The primary water column stratum inhabited by the species.
- Flow velocity preference: The preferred hydraulic habitat in terms of flow speed.
- Body shape: The morphological body form of the species.
- Mouth position: The orientation and position of the mouth.
- Minimum age of sexual maturity(♂): The earliest age at which males of the species typically reach sexual maturity.
- Minimum age of sexual maturity(♀): The earliest age at which females of the species typically reach sexual maturity.
- Group: The functional group (e.g., G1, G2, G3, ... G10)
File: Data_Figure_5_a.xlsx
Format: .xlsx
Description: Data used to create Figure 5a: Stacked bar chart of the composition of the top 10 functional groups across four seasons based on species count.
- Function.Group: The identifier for the functional group (G1 through G10).
- Group: The seasonal period during which the species count was recorded.
- Number: The number of species belonging to the corresponding functional group (Function.Group) in the specified season (Group).
File: Data_Figure_5_b.xlsx
Format: .xlsx
Description: Data used to create Figure 5b: Stacked bar chart of the composition of the top 10 functional groups across four seasons based on sequence abundance.
- Function.Group: The identifier for the functional group (G1 through G10).
- Group: The seasonal period during which the sequence abundance was recorded.
- Number: The cumulative sequence read abundance (e.g., the sum of reads for all species within that group) for the corresponding functional group (Function.Group) in the specified season (Group).
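The two "Number" columns (species count in Figure 5a, cumulative reads in Figure 5b) can both be derived from a species-level table plus a species-to-group mapping. The Python sketch below shows one way to do this under stated assumptions: the mapping and counts are hypothetical, and a species is counted for a season only if it has at least one read. It is not the authors' R code.

```python
from collections import defaultdict

# Hypothetical species-to-functional-group mapping and one season's read counts.
species_group = {"Species A": "G1", "Species B": "G1", "Species C": "G2"}
season_reads = {"Species A": 100, "Species B": 0, "Species C": 40}


def summarize_groups(reads, groups):
    """Return (species count per group, summed reads per group) for one season."""
    species_count = defaultdict(int)
    read_sum = defaultdict(int)
    for species, count in reads.items():
        if count > 0:  # assumed detection rule: at least one read
            group = groups[species]
            species_count[group] += 1
            read_sum[group] += count
    return dict(species_count), dict(read_sum)


counts_by_group, reads_by_group = summarize_groups(season_reads, species_group)
```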
File: Data_Figure_6.xlsx
Format: .xlsx
Description: Data used to create Figure 6: Boxplot showing seasonal differences in diversity indices.
- Group: The seasonal group to which the sample belongs.
- SP_Margalef: Margalef's Richness Index. A measure of species richness that is adjusted for sample size. Higher values indicate a greater number of species in the sample.
- SP_Shannon: Shannon-Wiener Diversity Index. A measure of species diversity that incorporates both species richness and the evenness of their abundances. Higher values indicate greater, more balanced diversity.
- SP_Simpson: Simpson's Diversity Index. A measure of diversity that represents the probability that two randomly selected individuals belong to different species. Higher values indicate higher diversity.
- SP_Pielou: Pielou's Evenness Index. A measure of how evenly individuals are distributed among the different species present. It ranges from 0 to 1, where 1 indicates a completely even distribution.
- Delta: Average Taxonomic Distinctness (Δ+). A measure of taxonomic diversity that reflects the average taxonomic path length between any two randomly chosen species in the community. It represents the average taxonomic distance among species.
- Lambda: Variation in Taxonomic Distinctness (Λ+). A measure of taxonomic diversity that reflects the evenness of the taxonomic tree structure. It indicates whether the taxonomic relationships among species are clustered or evenly distributed across different taxonomic levels.
- FRic: Functional Richness. The volume of the functional space occupied by the community, representing the amount of niche space filled. Values range from 0 to 1.
- FDiv: Functional Divergence. The degree to which species abundances are distributed toward the extremes of the functional space. High FDiv indicates that abundant species have unique functional traits.
- FEve: Functional Evenness. The regularity of species abundances and their distribution in the functional space. It measures how evenly the species fill the functional space. Values range from 0 to 1.
- FDis: Functional Dispersion. The mean distance of individual species to the centroid of all species in the functional trait space, representing multivariate trait dispersion.
- RaoQ: Rao's Quadratic Entropy. A functional diversity index that incorporates both species' relative abundances and their pairwise functional differences.
- PD: Phylogenetic Diversity. The sum of the lengths of all phylogenetic branches connecting the set of species in a community, representing their shared evolutionary history.
- FDG_Margalef: Functional Group Margalef's Index. Margalef's richness index applied at the functional group (FG) level, measuring the richness of functional groups.
- FDG_Pielou: Functional Group Pielou's Evenness Index. Pielou's evenness index applied at the functional group level, measuring the evenness of distribution across functional groups.
- FDG_Shannon: Functional Group Shannon-Wiener Index. The Shannon-Wiener diversity index applied at the functional group level, measuring functional group diversity considering both richness and evenness.
- FDG_Simpson: Functional Group Simpson's Index. Simpson's diversity index applied at the functional group level, representing the probability that two randomly selected individuals belong to different functional groups.
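The four α-diversity indices described above have standard textbook forms that can be computed directly from one sample's read counts. The Python sketch below uses those common definitions: Margalef as (S - 1) / ln N, Shannon-Wiener as -Σ p ln p, Simpson as 1 - Σ p², and Pielou as H' / ln S. Whether the authors' R script uses exactly these variants (for example, the natural logarithm) is an assumption, and the function name `alpha_diversity` is hypothetical.

```python
import math


def alpha_diversity(counts):
    """Compute four alpha-diversity indices from one sample's read counts.

    counts: iterable of non-negative read counts, one per species (or functional group).
    """
    present = [c for c in counts if c > 0]
    n_total = sum(present)   # N: total reads in the sample
    richness = len(present)  # S: number of taxa with at least one read
    if richness == 0:
        return {"margalef": 0.0, "shannon": 0.0, "simpson": 0.0, "pielou": 0.0}
    props = [c / n_total for c in present]
    shannon = -sum(p * math.log(p) for p in props)
    simpson = 1.0 - sum(p * p for p in props)
    # Margalef is undefined when N <= 1; Pielou is undefined when S <= 1.
    margalef = (richness - 1) / math.log(n_total) if n_total > 1 else 0.0
    pielou = shannon / math.log(richness) if richness > 1 else 0.0
    return {"margalef": margalef, "shannon": shannon, "simpson": simpson, "pielou": pielou}


# Hypothetical sample: four equally abundant species and one absent species.
even_sample = alpha_diversity([10, 10, 10, 10, 0])
```

For a perfectly even sample such as this one, Pielou's index equals 1 and the Shannon-Wiener index equals ln S. Applying the same function to functional-group totals instead of species counts gives the FDG_ variants.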
File: Data_Figure_7_a.xlsx
Format: .xlsx
Description: Data used to create Figure 7a: NMDS analysis results of fish species composition across seasons.
- Sample name columns (e.g., XZ1, JA1, ..., ZG4): The remaining columns in this table are named after individual environmental DNA (eDNA) samples. The value in each species row typically represents the number of sequence reads assigned to that species in the corresponding sample.
File: Data_Figure_7_b.xlsx
Format: .xlsx
Description: Data used to create Figure 7b: NMDS analysis results of functional group composition across seasons.
- Sample name columns (e.g., XZ1, JA1, ..., ZG4): The remaining columns in this table are named after individual environmental DNA (eDNA) samples. The value in each functional group row typically represents the number of sequence reads assigned to that functional group in the corresponding sample.
File: Data_Figure_7_group.xlsx
Format: .xlsx
Description: Data used to group samples by season.
- sample: The unique sample name.
- group: Seasonal grouping of samples.
File: Data_Figure_8.xlsx
Format: .xlsx
Description: Data used to create Figure 8: Pearson correlation matrix of various diversity indices and environmental factors.
- SP_Margalef: Margalef's Richness Index. A measure of species richness that is adjusted for sample size. Higher values indicate a greater number of species in the sample.
- SP_Shannon: Shannon-Wiener Diversity Index. A measure of species diversity that incorporates both species richness and the evenness of their abundances. Higher values indicate greater, more balanced diversity.
- SP_Simpson: Simpson's Diversity Index. A measure of diversity that represents the probability that two randomly selected individuals belong to different species. Higher values indicate higher diversity.
- SP_Pielou: Pielou's Evenness Index. A measure of how evenly individuals are distributed among the different species present. It ranges from 0 to 1, where 1 indicates a completely even distribution.
- Delta: Average Taxonomic Distinctness (Δ+). A measure of taxonomic diversity that reflects the average taxonomic path length between any two randomly chosen species in the community. It represents the average taxonomic distance among species.
- Lambda: Variation in Taxonomic Distinctness (Λ+). A measure of taxonomic diversity that reflects the evenness of the taxonomic tree structure. It indicates whether the taxonomic relationships among species are clustered or evenly distributed across different taxonomic levels.
- FRic: Functional Richness. The volume of the functional space occupied by the community, representing the amount of niche space filled. Values range from 0 to 1.
- FDiv: Functional Divergence. The degree to which species abundances are distributed toward the extremes of the functional space. High FDiv indicates that abundant species have unique functional traits.
- FEve: Functional Evenness. The regularity of species abundances and their distribution in the functional space. It measures how evenly the species fill the functional space. Values range from 0 to 1.
- FDis: Functional Dispersion. The mean distance of individual species to the centroid of all species in the functional trait space, representing multivariate trait dispersion.
- RaoQ: Rao's Quadratic Entropy. A functional diversity index that incorporates both species' relative abundances and their pairwise functional differences.
- PD: Phylogenetic Diversity. The sum of the lengths of all phylogenetic branches connecting the set of species in a community, representing their shared evolutionary history.
- FDG_Margalef: Functional Group Margalef's Index. Margalef's richness index applied at the functional group (FG) level, measuring the richness of functional groups.
- FDG_Pielou: Functional Group Pielou's Evenness Index. Pielou's evenness index applied at the functional group level, measuring the evenness of distribution across functional groups.
- FDG_Shannon: Functional Group Shannon-Wiener Index. The Shannon-Wiener diversity index applied at the functional group level, measuring functional group diversity considering both richness and evenness.
- FDG_Simpson: Functional Group Simpson's Index. Simpson's diversity index applied at the functional group level, representing the probability that two randomly selected individuals belong to different functional groups.
- Dep: Depth (m)
- FV: Flow Velocity (m/s)
- WT: Water Temperature (°C)
- pH: pH
- Con: Conductivity (μS/cm)
- DO: Dissolved Oxygen (mg/L)
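Each cell of a Pearson correlation matrix is the correlation coefficient between two columns of a table like this one. The Python sketch below computes a single coefficient from its definition; it is not the authors' R code, the function name `pearson_r` is hypothetical, and the paired values are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the centered values.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired values, e.g. water temperature (WT) and a diversity index.
wt_values = [10.0, 15.0, 20.0, 25.0]
index_values = [1.0, 1.5, 2.0, 2.5]
r_positive = pearson_r(wt_values, index_values)
r_negative = pearson_r(wt_values, list(reversed(index_values)))
```

Repeating the calculation for every pair of columns fills the full matrix; the matrix is symmetric with ones on the diagonal.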
File: Data_Figure_9_a.xlsx
Format: .xlsx
Description: Data used to create Figure 9a: Fish composition for RDA analysis.
- Sample name columns (e.g., XZ1, JA1, ..., ZG4): The remaining columns in this table are named after individual environmental DNA (eDNA) samples. The value in each species row typically represents the number of sequence reads assigned to that species in the corresponding sample.
File: Data_Figure_9_b.xlsx
Format: .xlsx
Description: Data used to create Figure 9b: Functional group composition for RDA analysis.
- Sample name columns (e.g., XZ1, JA1, ..., ZG4): The remaining columns in this table are named after individual environmental DNA (eDNA) samples. The value in each functional group row typically represents the number of sequence reads assigned to that functional group in the corresponding sample.
File: Data_Figure_9_a_environment.xlsx
Format: .xlsx
Description: Environmental factor data used for RDA analysis.
- Dep: Depth (m)
- FV: Flow Velocity (m/s)
- WT: Water Temperature (°C)
- pH: pH
- Con: Conductivity (μS/cm)
- DO: Dissolved Oxygen (mg/L)
File: DCA_Figure_9_a.xlsx
Format: .xlsx
Description: DCA results for fish composition data (Data_Figure_9_a.xlsx).
File: DCA_Figure_9_b.xlsx
Format: .xlsx
Description: DCA results for functional group composition data (Data_Figure_9_b.xlsx).
File: RDA_Figure_9_a.xlsx
Format: .xlsx
Description: RDA results for fish composition data (Data_Figure_9_a.xlsx) and environmental factor data (Data_Figure_9_a_environment.xlsx).
File: RDA_Figure_9_b.xlsx
Format: .xlsx
Description: RDA results for functional group composition data (Data_Figure_9_b.xlsx) and environmental factor data (Data_Figure_9_a_environment.xlsx).
File: Calculation_Data(FD).xlsx
Format: .xlsx
Description: Data used to calculate functional diversity.
Worksheet "abun": The abundance of each fish species in every collected sample.
Worksheet "trait": The functional characteristics of each fish species. All columns except the last two contain binary (presence/absence) data, where "1" represents "YES" (the species has the trait) and "0" represents "NO" (the species does not have the trait).
- Sticky egg: An egg type category.
- Drifting egg: An egg type category.
- Floating egg: An egg type category.
- demersal egg: An egg type category.
- Other: An egg type category.
- Migration: A migration type category.
- settle: A migration type category.
- Carnivore: A diet category.
- Omnivorous: A diet category.
- Filter feeding ability: A diet category.
- phytophagous: A diet category.
- Demersal: A vertical distribution category.
- Upper middle: A vertical distribution category.
- Lower Middles: A vertical distribution category.
- Quiet and slow flowing water: A flow velocity preference category.
- flowing water: A flow velocity preference category.
- eurytropy: A flow velocity preference category.
- Spindle shaped: A body shape category.
- depressiform: A body shape category.
- compressiform: A body shape category.
- cylinder: A body shape category.
- oval: A body shape category.
- Upper port: A mouth position category.
- Sub upper port: A mouth position category.
- Terminal port: A mouth position category.
- Subinferior orifice: A mouth position category.
- Inferior orifice: A mouth position category.
- Minimum age of sexual maturity(male): A continuous variable indicating the minimum age (in years) at which male individuals of the species reach sexual maturity.
- Minimum age of sexual maturity(female): A continuous variable indicating the minimum age (in years) at which female individuals of the species reach sexual maturity.
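The binary columns of the trait worksheet follow the pattern commonly called one-hot (dummy) encoding, in which each category of a categorical trait gets its own 0/1 column. The Python sketch below illustrates the idea for the egg type columns listed above. The species names and their egg types are hypothetical, the helper `one_hot` is not part of the dataset, and it assumes each species has exactly one category per trait, which the README does not state.

```python
# Egg type column names as they appear in the README list.
EGG_TYPE_COLUMNS = ["Sticky egg", "Drifting egg", "Floating egg", "demersal egg", "Other"]

# Hypothetical species and egg types (placeholders, not rows from the dataset).
egg_type_by_species = {"Species A": "Sticky egg", "Species B": "Drifting egg"}


def one_hot(values_by_species, columns):
    """Return {species: {column: 0 or 1}}, with a 1 only in the matching column."""
    encoded = {}
    for species, value in values_by_species.items():
        if value not in columns:
            raise ValueError(f"unknown category: {value!r}")
        encoded[species] = {column: int(column == value) for column in columns}
    return encoded


trait_matrix = one_hot(egg_type_by_species, EGG_TYPE_COLUMNS)
```

The two continuous columns (minimum age of sexual maturity for males and females) would be kept as numbers rather than encoded this way.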
File: Calculation_Data(FGD).xlsx
Format: .xlsx
Description: Data used to calculate functional group α-diversity.
- Sample name columns (e.g., XZ1, JA1, ..., ZG4): The remaining columns in this table are named after individual environmental DNA (eDNA) samples. The value in each functional group row typically represents the number of sequence reads assigned to that functional group in the corresponding sample.
File: Calculation_Data(SPD).xlsx
Format: .xlsx
Description: Data used to calculate species α-diversity.
- Sample name columns (e.g., XZ1, JA1, ..., ZG4): The remaining columns in this table are named after individual environmental DNA (eDNA) samples. The value in each species row typically represents the number of sequence reads assigned to that species in the corresponding sample.
File: Calculation_Data(TD___PD).xlsx
Format: .xlsx
Description: Data used to calculate taxonomic diversity and phylogenetic diversity.
Worksheet "station":
- Species: The species name.
- Sample name columns (e.g., XZ1, JA1, ..., ZG4): The remaining columns in this table are named after individual environmental DNA (eDNA) samples. The value in each species row typically represents the number of sequence reads assigned to that species in the corresponding sample.
Worksheet "taxon":
- Species: The full scientific name (binomial) of the species.
- Genus: The taxonomic genus of the species.
- Family: The taxonomic family of the species.
- Order: The taxonomic order of the species.
Worksheet "weight":
- taxon: The taxonomic levels included.
- Branch: The branch number.
- weight: The weight value assigned to each taxonomic level.
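A taxon table and a set of level weights of this kind are the inputs to average taxonomic distinctness (Δ+), which averages a taxonomic distance over all species pairs. The authors list PRIMER 7 for this calculation; the Python sketch below only illustrates the general idea and is not a reimplementation of PRIMER. The weights, taxon names, and function names are hypothetical, and the pairwise distance is assumed to be the weight of the lowest level at which two species share a taxon.

```python
from itertools import combinations

# Levels checked from lowest to highest; the weights are hypothetical, not from the dataset.
LEVELS = ["Genus", "Family", "Order"]
HYPOTHETICAL_WEIGHTS = {"Genus": 25.0, "Family": 50.0, "Order": 75.0, "Different order": 100.0}

# Placeholder taxonomy for three species.
taxa = [
    {"Species": "sp1", "Genus": "Genus1", "Family": "Family1", "Order": "Order1"},
    {"Species": "sp2", "Genus": "Genus1", "Family": "Family1", "Order": "Order1"},
    {"Species": "sp3", "Genus": "Genus2", "Family": "Family2", "Order": "Order1"},
]


def pair_distance(a, b, weights):
    """Weight of the lowest taxonomic level shared by species a and b."""
    for level in LEVELS:
        if a[level] == b[level]:
            return weights[level]
    return weights["Different order"]


def average_taxonomic_distinctness(species_rows, weights):
    """Mean pairwise taxonomic distance over all distinct species pairs."""
    pairs = list(combinations(species_rows, 2))
    if not pairs:
        raise ValueError("at least two species are required")
    return sum(pair_distance(a, b, weights) for a, b in pairs) / len(pairs)


delta_plus = average_taxonomic_distinctness(taxa, HYPOTHETICAL_WEIGHTS)
```

In this example one pair shares a genus and two pairs share only an order, so the result is the mean of 25, 75, and 75.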
File: R_Code_Figure_2.R
Format: .R
Description: R code used to create Figure 2: Venn diagram for the four seasons.
File: R_Code_Figure_3.R
Format: .R
Description: R code used to create Figure 3: Stacked bar chart showing the relative sequence abundance of the top 10 species across four seasons.
File: R_Code_Figure_4.R
Format: .R
Description: R code used to create Figure 4: Fish functional group clustering tree.
File: R_Code_Figure_5_a.R
Format: .R
Description: R code used to create Figure 5a: Stacked bar chart of the composition of the top 10 functional groups based on species count across four seasons.
File: R_Code_Figure_5_b.R
Format: .R
Description: R code used to create Figure 5b: Stacked bar chart of the composition of the top 10 functional groups based on sequence abundance across four seasons.
File: R_Code_Figure_6.R
Format: .R
Description: R code used to create Figure 6: Boxplot showing seasonal differences in diversity indices.
File: R_Code_Figure_7.R
Format: .R
Description: R code used to create Figure 7a: NMDS analysis results for fish species composition across seasons.
File: R_Code_Figure_8.R
Format: .R
Description: R code used to create Figure 8: Pearson correlation matrix.
File: R_Code_Figure_9_a.R
Format: .R
Description: R code used to create Figure 9a: Fish composition RDA analysis.
File: R_Code_Figure_9_b.R
Format: .R
Description: R code used to create Figure 9b: Functional group composition RDA analysis.
File: α_Diversity_Calculation.R
Format: .R
Description: R code used to calculate α-diversity indices.
File: DCA.R
Format: .R
Description: R code used to perform DCA analysis.
File: Function_Diversity_Calculation.R
Format: .R
Description: R code used to calculate functional group diversity.
Data collection and methodology
The data were collected using standard environmental DNA (eDNA) survey methods for fish species. All data were collected by trained researchers at sampling sites in the upper Yangtze River.
Required software and tools
- R 4.3.1 or higher, for data analysis and visualization.
- Excel 2016 or higher, for processing data files.
- PRIMER 7 for calculating taxonomic and phylogenetic diversity.
