The impact of urbanization on avian communities during the breeding season in the Huanghuai Plain of China
Data files
Oct 03, 2025 version files (36.47 KB)
- original_data.csv (26.04 KB)
- R_Code.Rmd (3.53 KB)
- README.md (6.90 KB)
Abstract
This dataset was collected during the 2022–2023 breeding seasons in the Huanghuaihai Plain, China, to assess the impact of urbanization on bird communities. It includes species occurrence and abundance records across multiple line transects, as well as calculated diversity indices including species richness, Shannon-Wiener index, Simpson index, and Pielou’s evenness. Environmental variables were recorded for each transect, including building index (BI), environmental noise (EN), human disturbance index (DI), and distance to county center (DCC), which were combined to form an urbanization synthetic index (USI). The dataset also incorporates five species-level ecological and life-history traits: body mass (g), diet category, clutch size, nest site, and provincial distribution. Analyses of these data indicate significant differences in species diversity and functional traits among urban, suburban, and rural habitats. USI was negatively correlated with species richness and Shannon-Wiener diversity, but showed no significant relationship with functional traits. Environmental noise, distance to county center, and the proportion of buildings within a 250-m radius were identified as key factors influencing species diversity, while environmental noise and distance to county center were the strongest predictors of functional traits. These data can be reused to examine the effects of urbanization on avian community composition and diversity, to compare biodiversity patterns across regions, or to inform conservation and urban planning strategies.
Date of data collection: from June to August in 2022 and 2023
Location: 150 line transects across the Huanghuai Plain, with 50 line transects allocated to each habitat (urban, suburban, rural).
All the data files were created on December 30, 2024.
original_data.csv: all data employed in this experiment, including Shannon-Wiener diversity index, Simpson diversity, species richness, Pielou evenness index, building index (four scales), environmental noise, disturbance index, the distance to county center, urbanization synthetic index, and site.
R_Code.Rmd: R code for "The impact of urbanization on avian communities during the breeding season in the Huanghuai Plain of China", covering the data analyses in this dataset. Where the same analysis is repeated, it is presented only once as a reference example.
See below for definitions of variables.
Description of methods for data collection and data processing:
Bird surveys were conducted by two experienced researchers during the 4 h after dawn and the 3 h before sunset in good weather (e.g., no wind or rain) from June to August 2022. The observers walked along each line transect at a constant speed of approximately 1.0-2.0 km/h and identified birds by direct observation with binoculars. They also used a camera to document birds within a 50-m radius that could not be identified on sight; birds flying overhead were not included. From June to August 2023, we repeated the bird surveys using the same methods. The land-use composition of each line transect did not change during our surveys.
Bird identification and classification were based on A Checklist on the Classification and Distribution of Birds of China and A Field Guide to the Birds of China (Mackinnon et al., 2000). Endangerment levels and conservation status are based on The List of National Key Protected Wildlife, The IUCN Red List of Threatened Species (IUCN; https://www.iucnredlist.org/), and The Red List of Biodiversity in China: Vertebrates.
We selected five ecological and life-history traits (i.e., body mass, diet, clutch size, nest site, and distributed provinces) for our study. See below for definitions of variables.
1. Body mass
Definition: The average body weight of the bird species.
Unit: grams (g).
2. Diet
Definition: The primary feeding guild or dietary category of the species. Indicates the main food resources consumed by the species.
Unit: categorical (i.e., insectivorous, carnivorous, omnivorous, insectivorous&carnivorous).
3. Clutch size
Definition: The typical number of eggs laid per breeding attempt. Reflects the reproductive capacity of the species.
Unit: number of eggs (count).
4. Nest site
Definition: The typical nesting location used by the species. Describes where birds usually build their nests.
Unit: categorical (e.g., ground, shrubbery, crown, water, rock-wall).
5. Distributed provinces
Definition: The provinces or administrative regions where the species is recorded within China. Provides the known geographic distribution at the provincial level.
Unit: text list of provinces.
We imported all line transects into ArcGIS 10.8 and established four buffer zones around each line transect, with radii of 250 m, 500 m, 1000 m, and 2000 m, to estimate the proportion of buildings surrounding each transect. We calculated the building index (BI) as the weighted sum of the building proportions: BI = (building area % within 250 m) × 1 + (building area % within 500 m) × 0.5 + (building area % within 1000 m) × 0.25 + (building area % within 2000 m) × 0.125.
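The weighted sum can be illustrated with a minimal Python sketch. It is not taken from R_Code.Rmd. It assumes the building proportions are expressed as fractions between 0 and 1 (an inference from the stated BI range of 0 to 2), and the function name and the example values are hypothetical.

```python
# Weights for the four buffer radii (in metres), as listed in the text.
BUFFER_WEIGHTS = {250: 1.0, 500: 0.5, 1000: 0.25, 2000: 0.125}


def building_index(proportions):
    """Return the weighted sum of building proportions.

    `proportions` maps buffer radius (m) to the building-area proportion
    within that buffer. ASSUMPTION: proportions are fractions in [0, 1].
    """
    missing = set(BUFFER_WEIGHTS) - set(proportions)
    if missing:
        raise ValueError(f"missing buffer radii: {sorted(missing)}")
    for radius, value in proportions.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"proportion for {radius} m must be in [0, 1]")
    return sum(BUFFER_WEIGHTS[r] * proportions[r] for r in BUFFER_WEIGHTS)


# Hypothetical transect: more buildings close to the line than far away.
example = {250: 0.40, 500: 0.30, 1000: 0.20, 2000: 0.10}
print(building_index(example))
```

Under this assumption the maximum possible value is 1.875 (all four buffers fully built up), which is consistent with a BI range of roughly 0 to 2.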
While conducting field surveys along each line transect, we measured environmental noise (EN) with a decibel meter. Noise was measured once in each of three time periods (morning, midday, and evening), with each measurement lasting 10 minutes. The average of the noise values obtained for each line transect was taken as its environmental noise. We collected the disturbance index (DI) at the same time: human traffic was recorded at each line transect, with each observation lasting 10 minutes, and the average value was used. Human disturbance was divided into five levels: Level 1, no people; Level 2, 1-2 people per minute; Level 3, 3-7 people per minute; Level 4, 8-17 people per minute; and Level 5, 18 or more people per minute. We divided the study area into county-level units, with the government hall of each county serving as the central point. The distance to county center (DCC) was measured in Google Maps as the straight-line distance (km) from each line transect to the government hall of its county.
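The five disturbance levels can be expressed as a small lookup, sketched below in Python for illustration only. The function names are hypothetical. The text gives integer ranges, so the handling of fractional rates (for example 2.5 people per minute falling into Level 3) is an assumption made here, as is the conversion from a 10-minute count to a per-minute rate.

```python
def disturbance_level(people_per_minute):
    """Map human traffic (people per minute) to a disturbance level 1-5.

    Thresholds follow the ranges in the text. ASSUMPTION: a fractional
    rate between two listed ranges is assigned to the higher level.
    """
    if people_per_minute < 0:
        raise ValueError("traffic rate cannot be negative")
    if people_per_minute == 0:
        return 1  # no people
    if people_per_minute <= 2:
        return 2  # 1-2 people per minute
    if people_per_minute <= 7:
        return 3  # 3-7 people per minute
    if people_per_minute <= 17:
        return 4  # 8-17 people per minute
    return 5      # 18 or more people per minute


def level_from_count(people_counted, minutes=10):
    """Convert a raw count over an observation window to a level.

    ASSUMPTION: the rate is the count divided by the window length.
    """
    return disturbance_level(people_counted / minutes)


# Hypothetical observation: 50 people in 10 minutes is 5 per minute.
print(level_from_count(50))
```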
Following other studies, we calculated the urbanization synthetic index (USI) as: USI = BI × 100 / 2 + EN + DI × 20 + 100 / DCC. We rescaled each parameter to the range 0-100, with higher values representing a higher level of urbanization. In detail, BI ranged from 0 to 2, so we standardized it by multiplying by 100 and dividing by 2; EN ranged from 0 to 100, so it was left unchanged; DI ranged from 1 to 5, so it was multiplied by 20; DCC ranged from 0 to 60, so we took its reciprocal and multiplied it by 100.
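As a worked illustration of the USI formula, the following Python sketch combines the four components exactly as written in the text. It is not the deposited R code. The function name, argument names, and the example values are hypothetical, and rejecting a DCC of zero is an assumption made here to avoid division by zero.

```python
def urbanization_synthetic_index(bi, en, di, dcc_km):
    """Combine the four components into the USI as described in the text.

    bi     : building index
    en     : environmental noise, used unchanged
    di     : disturbance index (level 1-5)
    dcc_km : distance to county center in km
    ASSUMPTION: dcc_km must be strictly positive.
    """
    if dcc_km <= 0:
        raise ValueError("dcc_km must be greater than zero")
    bi_term = bi * 100 / 2   # BI multiplied by 100, then divided by 2
    di_term = di * 20        # DI multiplied by 20
    dcc_term = 100 / dcc_km  # reciprocal of DCC multiplied by 100
    return bi_term + en + di_term + dcc_term


# Hypothetical transect values, for illustration only.
usi = urbanization_synthetic_index(bi=0.6125, en=55.0, di=3, dcc_km=5.0)
print(round(usi, 3))
```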
For each line transect, we calculated the species richness, abundance of every species, Shannon-Wiener diversity index, Pielou evenness index, and Simpson diversity index.
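The text does not state the formulas used for these indices, so the Python sketch below relies on common textbook definitions and should be read as an assumption rather than as the authors' implementation: Shannon-Wiener H' = -Σ p_i ln p_i, Simpson as 1 - Σ p_i², and Pielou J = H' / ln S. The function name and the example counts are hypothetical.

```python
import math


def diversity_indices(counts):
    """Compute per-transect diversity metrics from species counts.

    `counts` is a list of individuals recorded per species.
    ASSUMPTIONS: natural logarithm for Shannon-Wiener, Simpson reported
    as 1 - sum(p_i^2), and Pielou evenness undefined (NaN) when only
    one species is present.
    """
    counts = [c for c in counts if c > 0]  # ignore absent species
    total = sum(counts)
    if total == 0:
        raise ValueError("no individuals recorded")
    richness = len(counts)
    props = [c / total for c in counts]
    shannon = -sum(p * math.log(p) for p in props)
    simpson = 1.0 - sum(p * p for p in props)
    pielou = shannon / math.log(richness) if richness > 1 else float("nan")
    return {
        "richness": richness,
        "abundance": total,
        "shannon": shannon,
        "simpson": simpson,
        "pielou": pielou,
    }


# Hypothetical transect with four equally abundant species.
print(diversity_indices([10, 10, 10, 10]))
```

With perfectly even counts, evenness is at its maximum of 1 and the Shannon-Wiener index equals ln S, which gives a quick sanity check on any implementation.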
We then fitted linear mixed models (LMMs) to explore the relationship between the urbanization synthetic index and bird species diversity, with USI as the fixed effect and research site as a random factor.
We also used linear mixed models to determine which components of the USI (i.e., BI, EN, DI, DCC) influenced species diversity and their relative importance, with study site as the random factor. We used the Akaike Information Criterion (AIC) to determine which of the building proportions within the four radii had the greatest impact, and we included the scale with the lowest AIC in the linear mixed models for further analyses. The analysis yielded a series of candidate models. We ranked these models by their difference in corrected AIC from the best model and treated those with ΔAICc ≤ 2 as equivalent. Model averaging was then conducted across the selected models to obtain the relative importance of each parameter, along with model-averaged estimates, standard errors, and 95% confidence intervals, in order to mitigate model-selection uncertainty.
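The ΔAICc ≤ 2 selection step can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure in R_Code.Rmd: it uses the standard small-sample correction AICc = AIC + 2k(k + 1) / (n - k - 1) and standard Akaike weights, neither of which is spelled out in the text. The function names, model labels, and AICc values are hypothetical.

```python
import math


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("n must exceed k + 1 for AICc to be defined")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def rank_models(aicc_by_model, threshold=2.0):
    """Rank candidate models by delta-AICc and keep those within `threshold`.

    Returns (table, top): both are lists of (name, delta, weight) tuples
    sorted by increasing delta. Weights are Akaike weights computed over
    all candidates (ASSUMPTION: standard definition).
    """
    best = min(aicc_by_model.values())
    deltas = {name: value - best for name, value in aicc_by_model.items()}
    rel_lik = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(rel_lik.values())
    table = [
        (name, deltas[name], rel_lik[name] / total)
        for name in sorted(deltas, key=deltas.get)
    ]
    top = [row for row in table if row[1] <= threshold]
    return table, top


# Hypothetical AICc values for three candidate models.
candidates = {"EN + DCC": 100.0, "EN": 101.5, "EN + DI": 104.0}
table, top = rank_models(candidates)
for name, delta, weight in table:
    print(f"{name:10s} dAICc={delta:.2f} weight={weight:.3f}")
print("retained:", [name for name, _, _ in top])
```

In this hypothetical set, the first two models fall within the threshold and would be carried forward to model averaging, while the third would be dropped.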
