Data from: Insects in the city: Determinants of a contained aquatic microecosystem across an urbanized landscape
Data files
Jun 22, 2023 version files 241.66 KB
- clean_data.csv
- compiled_raw_data.csv
- invert_biomass.csv
- inverts_clean.csv
- README.md
- species_composition.csv
Dec 22, 2023 version files 241.37 KB
- clean_data.csv
- compiled_raw_data.csv
- invert_biomass.csv
- inverts_clean.csv
- README.md
- species_composition.csv
Abstract
Cities can have profound impacts on ecosystems, yet our understanding of these impacts is currently limited. First, the effects of socioeconomic dimensions of human society are often overlooked. Second, correlative analyses are common, limiting our causal understanding of mechanisms. Third, most research has focused on terrestrial systems, ignoring aquatic systems that also provide important ecosystem services. Here we compare the effects of human population density and low-income prevalence on the macroinvertebrate communities and ecosystem processes within water-filled artificial tree holes. We hypothesized that these human demographic variables would affect tree holes in different ways via changes in temperature, water nutrients, and the local tree hole environment. We recruited community scientists across Greater Vancouver (Canada) to provide host trees and tend 50 tree holes over 14 weeks of colonization. We quantified tree hole ecosystems in terms of aquatic invertebrates, litter decomposition, and chlorophyll-a. We compiled potential explanatory variables from field measurements, satellite images, or census databases. Using structural equation models, we showed that invertebrate abundance was affected by low-income prevalence but not human population density. This was driven by cosmopolitan species of Ceratopogonidae (Diptera) with known associations to anthropogenic containers. Invertebrate diversity and abundance were also affected by environmental factors, such as temperature, elevation, water nutrients, litter quantity, and exposure. By contrast, invertebrate biomass, chlorophyll-a, and litter decomposition were not affected by any measured variables. In summary, this study shows that some urban ecosystems can be largely unaffected by human population density. Our study also demonstrates the potential of using artificial tree holes as a standardized, replicated habitat for studying urbanization. 
Finally, by combining community science and urban ecology, we were able to involve our local community in this pandemic research pivot.
This abstract is quoted from the original article "Insects in the city: Determinants of a contained aquatic microecosystem across an urbanized landscape" in Ecology (2023) by DS Srivastava et al.
README: Title of Dataset:
Data from: "Insects in the city: Determinants of a contained aquatic microecosystem across an urbanized landscape"
Description of the Data and file structure
This readme file describes the (1) R scripts and (2) data files included in this repository. All scripts use R version 4.2.0 (2022-04-22 ucrt), and detailed package information is found at the end of each script under #Session Info. Missing data in data files are indicated as NA. All data files are in .csv format, with a comma (,) as the field separator.
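As an illustration of the file conventions above (comma separator, NA for missing values), the following is a minimal sketch of reading one of the data files outside R. It is written in Python rather than the R used by the archived scripts; the helper name `read_dryad_csv` and the two sample rows are invented for this example and are not part of the repository.

```python
import csv
import io


def read_dryad_csv(handle):
    """Read a comma-separated file, mapping the literal string 'NA' to None.

    All other values are kept as strings, so identifier columns such as
    treehole_id stay character variables, as the README describes.
    """
    reader = csv.DictReader(handle)
    return [
        {key: (None if value == "NA" else value) for key, value in row.items()}
        for row in reader
    ]


# Hypothetical two-row excerpt; the values are invented for illustration only.
sample = io.StringIO(
    "treehole_id,pH,volume_ml\n"
    "01,6.8,540\n"
    "02,NA,NA\n"
)
rows = read_dryad_csv(sample)
```

For a real file, the same function could be called on `open("clean_data.csv", newline="")`. Numeric columns would still need an explicit conversion (for example with `float`) after the NA check.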
(1) Scripts
01_invert_biomass_script.R:
this script converts invertebrate abundances into biomasses using the length:mass relationships and average per capita mass relationships embedded in, or predicted by, the hellometry R package.
Input: inverts_clean.csv
Output: invert_biomass.csv
02_data_imputation_compilation.R:
this script imputes some missing numeric environmental data, and then combines the full environmental dataset with invertebrate summary metrics.
Input: compiled_raw_data.csv, inverts_clean.csv, invert_biomass.csv
Output: clean_data.csv, species_composition.csv
03_dag_piecewise.R:
this script conducts several piecewise structural equation models, and outputs the path coefficients as well as local and global tests of significance.
Input: clean_data.csv
Output: Summary_table_est.rtf, Summary_table_est.csv
04_plots_tables.R:
this script carries on from the 03 script, creating the PCA biplots and partial residual plots for the manuscript.
Input: 03_dag_piecewise.R objects
Output: PCA_local_variables.pdf, resfig.jpeg, resfig.pdf, Table_summary_SEM.pdf, Summary_table_est.csv, Table_fit_SEM.pdf, Summary_table_fit.csv
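The inputs and outputs listed above imply a run order for the four scripts. The sketch below, in Python rather than R, encodes the listed inputs and outputs and derives that order with a topological sort. The dictionary layout and function name are assumptions made for this example. Treating the in-memory "03_dag_piecewise.R objects" as an output of the 03 script is also an assumption, made so that the dependency of the 04 script can be expressed.

```python
from graphlib import TopologicalSorter

# Inputs and outputs as listed in this README.
SCRIPTS = {
    "01_invert_biomass_script.R": {
        "inputs": ["inverts_clean.csv"],
        "outputs": ["invert_biomass.csv"],
    },
    "02_data_imputation_compilation.R": {
        "inputs": ["compiled_raw_data.csv", "inverts_clean.csv", "invert_biomass.csv"],
        "outputs": ["clean_data.csv", "species_composition.csv"],
    },
    "03_dag_piecewise.R": {
        "inputs": ["clean_data.csv"],
        # Assumption: the R objects used by script 04 are modelled as an output here.
        "outputs": ["Summary_table_est.rtf", "Summary_table_est.csv",
                    "03_dag_piecewise.R objects"],
    },
    "04_plots_tables.R": {
        "inputs": ["03_dag_piecewise.R objects"],
        "outputs": ["PCA_local_variables.pdf", "resfig.jpeg", "resfig.pdf",
                    "Table_summary_SEM.pdf", "Summary_table_est.csv",
                    "Table_fit_SEM.pdf", "Summary_table_fit.csv"],
    },
}


def dependencies(scripts):
    """Map each script to the set of scripts that produce one of its inputs."""
    deps = {}
    for name, spec in scripts.items():
        deps[name] = {
            other
            for other, other_spec in scripts.items()
            if other != name and set(spec["inputs"]) & set(other_spec["outputs"])
        }
    return deps


run_order = list(TopologicalSorter(dependencies(SCRIPTS)).static_order())

# Files that no script produces, i.e. the raw inputs that must already exist.
all_inputs = {f for spec in SCRIPTS.values() for f in spec["inputs"]}
all_outputs = {f for spec in SCRIPTS.values() for f in spec["outputs"]}
raw_inputs = sorted(all_inputs - all_outputs)
```

Under these assumptions the scripts run in numerical order, and only compiled_raw_data.csv and inverts_clean.csv are needed to regenerate the other files.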
(2) Data files
Variables in invert_biomass.csv
Treehole.Number: a unique 2 digit code for each tree hole, synonymous with the variable treehole_id in other files (character variable)
Order: Taxonomic order, if known, of the invertebrate morphospecies
Family: Taxonomic family, if known, of the invertebrate morphospecies
Sub.family: Taxonomic subfamily, if known, of the invertebrate morphospecies
Genus: Taxonomic genus, if known, of the invertebrate morphospecies
species: Code for the invertebrate morphospecies
status: alive or dead
size: the body length class of the invertebrate, either as a category (small, medium, large, unknown) or numerical (in millimeters in all cases)
abundance: number of individual invertebrates corresponding to a particular combination of Treehole.Number, size, species, stage and status
drymass.mg: predicted dry mass of invertebrate individual, obtained from hellometry package based on species taxonomy, length and stage (see inverts_clean.csv)
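Because drymass.mg is a per-individual mass and abundance is a count, a per-tree-hole biomass can be obtained by summing abundance × drymass.mg. The Python sketch below shows one way to do this. It is not the authors' R code: restricting the sum to records with status "alive" and skipping records with a missing mass are both assumptions, and the sample records are invented.

```python
from collections import defaultdict


def total_biomass_mg(records, live_only=True):
    """Sum abundance * per-individual dry mass for each tree hole.

    Assumptions (not confirmed by the README): dead individuals are excluded
    when live_only is True, and records with a missing mass are skipped.
    """
    totals = defaultdict(float)
    for rec in records:
        if live_only and rec["status"] != "alive":
            continue
        if rec["drymass.mg"] is None:
            continue
        totals[rec["Treehole.Number"]] += rec["abundance"] * rec["drymass.mg"]
    return dict(totals)


# Invented records that mimic the column names of invert_biomass.csv.
records = [
    {"Treehole.Number": "01", "species": "sp_A", "status": "alive",
     "abundance": 4, "drymass.mg": 0.25},
    {"Treehole.Number": "01", "species": "sp_B", "status": "alive",
     "abundance": 2, "drymass.mg": 1.5},
    {"Treehole.Number": "01", "species": "sp_A", "status": "dead",
     "abundance": 3, "drymass.mg": 0.25},
    {"Treehole.Number": "02", "species": "sp_A", "status": "alive",
     "abundance": 1, "drymass.mg": None},
]
live_totals = total_biomass_mg(records)
all_totals = total_biomass_mg(records, live_only=False)
```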
Variables in inverts_clean.csv
Treehole.Number: a unique 2 digit code for each tree hole, synonymous with the variable treehole_id in other files (character variable)
Order: Taxonomic order, if known, of the invertebrate morphospecies
Family: Taxonomic family, if known, of the invertebrate morphospecies
Sub.family: Taxonomic subfamily, if known, of the invertebrate morphospecies
Genus: Taxonomic genus, if known, of the invertebrate morphospecies
species: Code for the invertebrate morphospecies
status: alive or dead
size: the body length class of the invertebrate, either as a category (small, medium, large, unknown) or numerical (in millimeters in all cases)
abundance: number of individual invertebrates corresponding to a particular combination of Treehole.Number, size, species, stage and status
Species: Latin binomial of the invertebrate morphospecies, if known from DNA barcoding (Ceratopogonidae sp. A = BOLD:ACR0360, Ceratopogonidae sp. C = BOLD:AAN5169, Clogmia albipunctata = BOLD:AAE5173). If identification to species level was not possible, a morphospecies code is assigned starting with the known family or order and ending in a capital letter (e.g. Cecidomyiidae sp. A).
Stage: stage that the aquatic invertebrate was sampled, either larva or pupa
Variables in species_composition.csv:
This file has 18 columns, 47 rows. Variables are treehole_id (a unique 2 digit code for each tree hole) and abundance of each of 17 morphospecies (these 17 column names are the 17 morphospecies listed in the species variable in inverts_clean.csv).
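The wide layout described above (one row per tree hole, one column per morphospecies) can be derived from the long-format records in inverts_clean.csv. The following Python sketch shows the reshaping step only. Whether the archived 02 script counts dead individuals is not stated here, so the `live_only` switch is an assumption, and the sample records and morphospecies codes are invented.

```python
from collections import defaultdict


def species_matrix(records, live_only=True):
    """Pivot long-format records into one row of abundances per tree hole."""
    counts = defaultdict(lambda: defaultdict(int))
    species = set()
    for rec in records:
        if live_only and rec["status"] != "alive":
            continue  # assumption: only living individuals are tallied
        counts[rec["Treehole.Number"]][rec["species"]] += rec["abundance"]
        species.add(rec["species"])
    columns = sorted(species)
    rows = [
        {"treehole_id": th, **{sp: counts[th].get(sp, 0) for sp in columns}}
        for th in sorted(counts)
    ]
    return columns, rows


# Invented records; size classes of the same morphospecies are summed.
records = [
    {"Treehole.Number": "01", "species": "sp_A", "status": "alive", "abundance": 3},
    {"Treehole.Number": "01", "species": "sp_A", "status": "alive", "abundance": 2},
    {"Treehole.Number": "01", "species": "sp_B", "status": "alive", "abundance": 1},
    {"Treehole.Number": "02", "species": "sp_B", "status": "alive", "abundance": 5},
    {"Treehole.Number": "02", "species": "sp_A", "status": "dead", "abundance": 4},
]
columns, rows = species_matrix(records)
```

Morphospecies absent from a tree hole are filled with 0 rather than NA in this sketch, which is the usual convention for a community matrix.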
Variables in compiled_raw_data.csv, clean_data.csv:
treehole_id: a unique 2 digit code for each tree hole (character variable)
water_body: presence of freshwater within 500m of tree hole, noted on Google Earth (binary: 0 = absence, 1 = presence)
standing_water: sources of persistent standing water (>400 ml water), within 30m of tree holes, (binary: 0 = absence, 1 = presence)
log_circumference_cm: circumference (units: cm, log transformed) of host tree at ca. 1.3m aboveground
pH: water pH of treehole at end of experiment, measured using a calibrated Oakton pH 450 pH meter
canopy_sqrt: canopy openness as percentage of sky not obscured, square root transformed
volume_ml: final water volume in treehole (ml)
log_water_volume_ml: final water volume in treehole (ml, log-transformed)
turbidity: tree hole water turbidity at end of experiment, measured with a portable fluorometer
log_litter_available_g: loose litter in the tree hole at end of experiment, dried (units = g, log-transformed)
percent_low_income: proportion of the 18-64 year old population in a low-income household, defined as a household in the lower quartile of adjusted (i.e. multiplied by square root of household size) household after-tax income, extracted at level of dissemination area from 2016 Canada Census
pop_dens_km2: human residents per square km, extracted at level of dissemination area from 2016 Canada Census
elevation_m: elevation (m a.s.l.) extracted from latitude-longitude location on Google Earth
green_gray_ratio_500m: ratio of the area of impermeable grey surfaces (e.g. roads, buildings) from permeable green surfaces (e.g. forests, fields, lawns, parks) in a 500m circle centered on tree hole, from Google Earth
green_gray_ratio_100m: ratio of the area of impermeable grey surfaces (e.g. roads, buildings) from permeable green surfaces (e.g. forests, fields, lawns, parks) in a 100m circle centered on tree hole, from Google Earth
circumference_cm: circumference (units: cm) of host tree at ca. 1.3m aboveground
latitude: latitude of tree hole, in decimal degree notation, North is positive
Longitude: longitude of tree hole, in decimal degree notation, East is positive
T_phmeter_C: temperature recorded by the pH meter at experiment midpoint
avg_min_temp_C: mean daily minimum temperature (unit = degrees C), measured every hour for first 85 days of experiment, mix of raw and kriged values as defined by kriged variable
avg_mean_temp_C: mean daily mean temperature (unit = degrees C), measured every hour for first 85 days of experiment, mix of raw and kriged values as defined by kriged variable
avg_max_temp_C: mean daily maximum temperature (unit = degrees C), measured every hour for first 85 days of experiment, mix of raw and kriged values as defined by kriged variable
kriged_min_var: variation in kriged value for avg_min_temp_C (in degrees C, 0 if data raw and not kriged)
kriged_mean_var: variation in kriged value for avg_mean_temp_C (in degrees C, 0 if data raw and not kriged)
kriged_max_var: variation in kriged value for avg_max_temp_C (in degrees C, 0 if data raw and not kriged)
kriged: temperature data in tree hole is either raw as collected by iButton™ temperature loggers (Maxim Integrated, San Jose, CA, USA; models DS1921G, DS1921Z, and DS1922L) (kriged = 0) or kriged based on latitude, longitude and elevation (kriged = 1).
mean_po4_umolL: at end of experiment, concentration in tree hole water of PO4 (-) in micromoles per liter
mean_nh4_umolL: at end of experiment, concentration in tree hole water of NH4 (+) in micromoles per liter
mean_no2_umolL: at end of experiment, concentration in tree hole water of NO2 (-) in micromoles per liter
mean_no3_umolL: at end of experiment, concentration in tree hole water of NO3 (-) in micromoles per liter
chlorophyll_ugL: at midpoint of experiment, concentration in tree hole water of chlorophyll-a in micrograms per liter, as determined with a Trilogy Laboratory Fluorometer following Wasmund et al. (2006). Water was first filtered through a glass microfiber filter (0.7 micrometer); the filters were frozen, and chlorophyll-a was then extracted on the filters with 90% acetone
fine_litter_g: remaining litter in fine 0.5 mm mesh litter bag at end of experiment, initial mass was 0.200 g
coarse_litter_g: remaining litter in coarse 5 mm mesh litter bag (10cm x 28cm) at end of experiment, initial mass was 0.500 g
loose_coarse_litter_g: coarse litter in the tree hole (collected with a 850 micron mesh) that was not in a litter bag, either from litter added at the start of experiment (2g), or from a coarse mesh litter bag that opened (see coarse_bag_open variable), or natural leaf fall
loose_fine_litter_g: the same origin as loose_coarse_litter_g, but composed of fragments that passed through the 850 micron mesh but were captured on Fisherbrand Fluted Qualitative Circled Filter Paper
coarse_bag_open: binary variable indicating if coarse litter bag opened (1) or not (0) during the experiment; this was a problem specific to coarse bags due to weak glue
fine_bag_open: binary variable indicating if fine litter bag opened (1) or not (0) during the experiment; rare as fine mesh bags were heat sealed
bags_out: binary variable indicating if litter bags were pulled out (1) or not (0) during the experiment by wildlife
litter_available_for_insects_g: this variable combines the litter loose in the tree hole (loose_coarse_litter_g + loose_fine_litter_g) as well as any litter that was still in coarse mesh litter bags (coarse_litter_g) as insects can pass through the coarse mesh
perc_dec_coarse: percent of initial 0.500g litter lost from intact coarse mesh bags over the experiment; NA otherwise
perc_dec_fine: percent of initial 0.200g litter lost from intact fine mesh bags over the experiment; NA otherwise
treehole_dry: a binary variable (0 = with water, 1 = without water) indicating if the tree hole was with or without water when collected
notes: information on tree hole status at end of experiment
all variables starting with string origin: indicates whether the original raw value of that variable for that treehole was used (raw), or whether the value was missing and estimated through kriging (kriged) or multivariate/univariate methods (impute).
Simpson: Simpson's diversity index for invertebrates
Shannon: Shannon diversity index for invertebrates (results similar to Simpson's so not presented in manuscript)
Abundance: total number of living invertebrates
invert.drymass.mg: total biomass of invertebrates
Dead: total number of dead invertebrates
no2_bin: bins of NO2 from Wang et al. 2013 Atmospheric Environment 64:312-319, manually extracted.
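Several variables in the list above are derived from others. The Python sketch below illustrates two of them: litter_available_for_insects_g as the sum of its three stated components, and the two diversity indices. The README does not say which form of Simpson's index was used, so the Gini-Simpson form (1 minus the sum of squared proportions) is an assumption, as are the natural-log base for Shannon, the function names, and the choice to return None when a component is missing.

```python
import math


def litter_available_g(loose_coarse_g, loose_fine_g, coarse_litter_g):
    """Sum loose litter and litter remaining in the coarse mesh bag.

    Assumption: the result is missing (None) if any component is missing.
    """
    parts = (loose_coarse_g, loose_fine_g, coarse_litter_g)
    if any(part is None for part in parts):
        return None
    return sum(parts)


def simpson(counts):
    """Gini-Simpson index, 1 - sum(p_i^2); the exact form used is an assumption."""
    total = sum(counts)
    if total == 0:
        return None
    return 1.0 - sum((n / total) ** 2 for n in counts)


def shannon(counts):
    """Shannon index, -sum(p_i * ln p_i), skipping species with zero count."""
    total = sum(counts)
    if total == 0:
        return None
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)
```

As a check on the definitions, two equally abundant morphospecies give a Gini-Simpson value of 0.5 and a Shannon value of ln 2, while a single morphospecies gives 0 for both.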
Sharing/access Information
Links to other publicly accessible locations of the data:
Was data derived from another source? Yes (population density and percent low income)
If yes, list source(s): https://www97.statcan.gc.ca/CSGE_EGSC/csge-main/index-en.html
Methods
These methods are quoted in abbreviated form from the original article [please also see README.md file for details on each script and data file, including description of every variable]:
We installed 73 artificial tree holes (hereafter tree holes) throughout Greater Vancouver, specifically the cities of Vancouver, Abbotsford, Burnaby, Chilliwack, Delta, Maple Ridge, New Westminster, North Vancouver, Port Moody, Richmond, Surrey, and West Vancouver. We constructed artificial tree holes from black plastic buckets (950 ml, height: 12.2 cm, diameter: 11.5 cm). Near the rim, we drilled 1-cm holes for water overflow and covered these with 1 mm mesh to prevent loss of insects and litter (Figure 2a). We attached each tree hole to a deciduous tree with a cable tie, about 1.3 m above ground, before adding leaf litter and bottled spring water. The leaf litter consisted of dried (60°C for two days) and pre-weighed Acer macrophyllum (Sapindaceae) leaves collected in November 2020, both loose (2.50 g) and in a 0.5 mm mesh leaf bag (0.200 g). We filled each tree hole with ~750 ml spring water (Western Family™). Community scientists were instructed to monitor water level in the tree holes during the experiment, topping up tree holes when they became half-empty with extra bottles of water (same brand) that we provided. We also added an iButton™ temperature logger (Maxim Integrated, San Jose, CA, USA; models DS1921G, DS1921Z, and DS1922L) wrapped in Parafilm™ (Bemis Company, Neenah, WI, USA) and programmed it to collect data every hour for 85 days. We added a small stick to assist ovipositing insects to perch or pupating insects to emerge.
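The hourly logger readings described above are the basis of the temperature summaries in the data files (avg_min_temp_C, avg_mean_temp_C, avg_max_temp_C), each defined as the mean over days of a daily statistic. The Python sketch below shows that two-step summary for the raw, non-kriged case. It is not the authors' code: the function name and the two days of readings are invented, and grouping by calendar date is an assumption.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def temperature_summaries(readings):
    """Average the daily minimum, mean and maximum over all logged days.

    readings: iterable of (datetime, temperature in degrees C) pairs.
    Assumption: a "day" is a calendar date in the logger's own clock.
    """
    by_day = defaultdict(list)
    for timestamp, temp_c in readings:
        by_day[timestamp.date()].append(temp_c)
    daily = list(by_day.values())
    return {
        "avg_min_temp_C": mean(min(day) for day in daily),
        "avg_mean_temp_C": mean(mean(day) for day in daily),
        "avg_max_temp_C": mean(max(day) for day in daily),
    }


# Two invented days of hourly data: a cool half and a warm half on each day.
start = datetime(2021, 3, 28)
readings = []
for hour in range(48):
    day, hour_of_day = divmod(hour, 24)
    low, high = (10.0, 14.0) if day == 0 else (12.0, 20.0)
    readings.append((start + timedelta(hours=hour), low if hour_of_day < 12 else high))

summary = temperature_summaries(readings)
```

In the archived data, tree holes without usable logger data carry kriged values instead, which this sketch does not attempt to reproduce.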
We installed all tree holes 21–28 March 2021. We visited all tree holes 17–30 May 2021, to collect data on water chemistry (pH, chlorophyll-a concentration), light availability (canopy cover), potential oviposition cues (host tree diameter, nearby standing water), and potential source populations (distance to water bodies). We measured water pH directly using a calibrated Oakton® pH 450 pH meter. To estimate chlorophyll-a concentration, we extracted 25 mL of water, filtered it through a glass microfiber (0.7 μm) filter, and froze the filters. In the lab, we extracted chlorophyll-a on filters with 90% acetone. We used a Trilogy Laboratory Fluorometer (Turner Designs, San Jose, CA, USA) to determine chlorophyll-a concentration following Wasmund et al. (2006). To measure canopy cover, we took a photograph directly up by placing a smartphone flat on the tree hole and then used ImageJ™ to differentiate open sky from any obstructing cover. We searched within 30 m of tree holes for sources of persistent standing water, such as buckets, birdbaths, and tires holding >400 mL water.
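The canopy measurement above ends in the canopy_sqrt variable, defined in the README as the square root of the percentage of sky not obscured. The Python sketch below covers only that final arithmetic. It assumes the sky/cover classification has already been done (the authors used ImageJ) and is available as a grid of booleans; the function name and the 4 x 4 mask are invented.

```python
import math


def canopy_sqrt(sky_mask):
    """Square root of the percentage of pixels classified as open sky.

    sky_mask: rows of booleans, True where the pixel is open sky
    (assumed to come from a prior thresholding step, e.g. in ImageJ).
    """
    pixels = [pixel for row in sky_mask for pixel in row]
    percent_open = 100.0 * sum(pixels) / len(pixels)
    return math.sqrt(percent_open)


# Invented mask: 4 of 16 pixels are open sky, i.e. 25% openness.
mask = [
    [True, True, False, False],
    [True, True, False, False],
    [False, False, False, False],
    [False, False, False, False],
]
value = canopy_sqrt(mask)
```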
We retrieved tree holes 1–10 July 2021, in the same order as installation, standardizing the experimental duration to 14 weeks (± 4 days). Once in the lab, we measured water pH as before, turbidity with a portable Turner Aquaflor® fluorometer, and froze a 5 mL volume of water for later nutrient analysis. To analyse nutrients NO₂⁻, NO₃⁻, NH₄⁺, and PO₄³⁻, we loaded water samples onto 96-well plates with standards corresponding to the nutrient of interest. We then added the relevant reagents to all wells and compared the absorbance of the samples to standards using a SpectraMax M2e spectrophotometer (Molecular Devices, San Jose, CA, USA). We averaged two measurements per sample. As NO₂⁻, NO₃⁻, and NH₄⁺ represent three steps of the dissolved inorganic nitrogen (DIN) cycle, we summed their concentrations in a single measure of DIN.
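The two arithmetic steps in the paragraph above, averaging duplicate measurements and summing the three nitrogen species into DIN, can be sketched as follows in Python. The function names are invented, the inputs correspond to the mean_no2_umolL, mean_no3_umolL and mean_nh4_umolL columns, and returning None when any value is missing is an assumption rather than the documented behaviour of the archived R scripts.

```python
def mean_of_duplicates(first, second):
    """Average the two plate measurements taken for one sample."""
    return (first + second) / 2.0


def dissolved_inorganic_nitrogen(no2_umol_l, no3_umol_l, nh4_umol_l):
    """Sum NO2, NO3 and NH4 concentrations (micromoles per liter) into DIN.

    Assumption: DIN is treated as missing if any of the three is missing.
    """
    parts = (no2_umol_l, no3_umol_l, nh4_umol_l)
    if any(part is None for part in parts):
        return None
    return sum(parts)


# Invented concentrations, in micromoles per liter.
no2 = mean_of_duplicates(0.4, 0.6)
din = dissolved_inorganic_nitrogen(no2, 2.0, 1.5)
```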
We retrieved the remaining leaf fragments in litter bags with tweezers, washing biofilm from them before drying (two days at 60°C) and determining their combined mass. Decomposition was quantified as the percent dry mass lost. We also collected all loose debris in tree holes manually and by filtering through a pre-weighed Fisherbrand™ Fluted Qualitative Circled Filter Paper before drying and determining dry mass. We recorded the total volume of water present in each tree hole. Finally, we searched tree hole contents for macroscopic (>1 mm) invertebrates in small aliquots in white trays. We sorted invertebrates into morphospecies, preserved them in 70% ethanol, and later identified them to family or genus level with identification keys. We used allometric equations to estimate dry body mass of invertebrates from body length (mm), either at the individual (species > 10 mm) or species level (hellometry R package, P. Rogy). We preserved a few voucher specimens in 95% ethanol for DNA barcoding to unambiguously assign species identities. For DNA extraction, we used QIAGEN® DNeasy Blood & Tissue Kit. We amplified the barcoding region of the mitochondrial Cytochrome Oxidase I (COI) gene with the universal primers LCO 1490 and HCO 2198 (Folmer et al. 1994). PCR products were sequenced by Psomagen, Inc. The chromatograms were assembled with Geneious Prime® v. 2022.2.2, and the resulting sequences were compared with GenBank and BOLD databases.
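Decomposition as percent dry mass lost, as described above, corresponds to the perc_dec_fine and perc_dec_coarse variables, which the README defines only for intact litter bags. A minimal Python sketch of that calculation follows. The function name and the `intact` flag are invented, and the flag stands in for the bag-status variables (for example fine_bag_open), whose exact handling in the archived scripts is not specified here.

```python
def percent_mass_lost(initial_g, final_g, intact=True):
    """Percent of the initial dry litter mass lost over the experiment.

    Returns None (NA) when the bag was not intact or the final mass is
    missing, mirroring the "NA otherwise" wording of the README.
    """
    if not intact or final_g is None:
        return None
    return 100.0 * (initial_g - final_g) / initial_g


# Invented final masses; initial masses follow the README (0.200 g fine, 0.500 g coarse).
fine_loss = percent_mass_lost(0.200, 0.150)
coarse_loss = percent_mass_lost(0.500, 0.250)
opened_bag = percent_mass_lost(0.500, 0.100, intact=False)
```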
Usage notes
R version 4.2.0 (2022-04-22 ucrt)
Versions of R packages used are indicated at the end of archived R scripts under the comment #Session Info.