Untangling the web: How spider traits link with management practices in agroecosystems
Data files
Jun 26, 2025 version, files total 110.40 KB
- L_spiders.xlsx (71.16 KB)
- Q_traits.xlsx (9.05 KB)
- R_filters.xlsx (6.48 KB)
- R_veg.xlsx (10.66 KB)
- README.md (13.05 KB)
Abstract
This dataset accompanies the manuscript "Untangling the Web: How Spider Traits Link with Management Practices in Agroecosystems". It includes the raw input matrices used to explore the functional composition of spider communities across agroecological and conventional cereal fields in the Swiss lowlands.
The dataset consists of four tables:
(1) L_spiders: Relative abundances of 84 spider species across 22 field sites, derived from standardized multi-method sampling (pitfall traps, sweep netting, vacuum suctioning).
(2) Q_traits: Functional traits of the observed spider species, including hunting guild, dispersal ability, vegetation stratum, preferred habitat, and body size.
(3) R_filters: Site-level environmental and management variables such as pesticide input (TFI), nitrogen application, and landscape metrics.
(4) R_veg: Local vegetation characteristics based on percentage cover of plant species per site. From this, community-weighted means (CWMs) for plant height, specific leaf area, and leaf nitrogen content, as well as Rao’s functional diversity (RaoQ), were computed.
These data were prepared for RLQ and fourth-corner analyses to assess how spider traits respond to environmental filtering through direct and vegetation-mediated pathways.
Dataset DOI: 10.5061/dryad.59zw3r2m4
Description of the data and file structure
This dataset was collected as part of a trait-based ecological study examining how spider communities respond to agricultural diversification in the Swiss lowlands. The study involved a paired field design with 22 cereal fields (wheat, barley, oilseed rape), each managed under either conventional or agroecological practices. Agroecological fields avoided synthetic pesticides and incorporated diversification strategies such as wildflower strips and mechanical weeding.
Spiders were sampled using three complementary methods—pitfall trapping, sweep netting, and vacuum suctioning—across multiple vegetation strata. Local vegetation was recorded by estimating plant cover within quadrats, and management/environmental data were collected through field records and spatial analysis. Functional traits of spiders and plants were compiled to explore how environmental filters structure trait distributions in spider communities using RLQ and fourth-corner analyses.
Files and variables
File: L_spiders.xlsx
Description: Raw spider sampling data
This file contains the unaggregated field-level data used to derive species abundances and trait associations. Each row represents an individual observation of a spider species at a given site and sampling round. There are no missing values; a zero means that no individuals were recorded. This file was used to generate the site-by-species matrix (L) and supports more detailed ecological or taxonomic inquiries (e.g., by sex, method, or date).
Variables
- unit_ID: Unique identifier for the field site and year (e.g., 79_1_2023)
- Date: Sampling date (YYYY-MM-DD)
- Round: Sampling round
- Crop: Crop type at the field (e.g., wheat, barley, OSR)
- Year: Sampling year (2023)
- Farm: Farm ID number
- Management: Management type as part of field identifier where 1 = conventional, 2 = agroecological
- Treatment: Management group label, where 1 = conventional, 2 = agroecological
- Transect: Transect code, where c denotes the transect located 20 m from the field margin
- Order, Family, Genus, species: Taxonomic classification of the spider
- Method: Sampling method; pitfall trapping (PFT), vacuum suctioning (vac), sweep netting (SN)
- adult_tot: Total number of adults recorded
- females: Number of adult females
- males: Number of adult males
- Genus.species: Concatenated taxonomic name
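The aggregation from raw records to a site-by-species table can be sketched as follows. This is a minimal example on invented toy records, assuming that adult_tot is summed per unit_ID over rounds, methods, and transects; the object names obs and L_counts are illustrative, and the aggregation actually used for the manuscript may differ.

```r
# Toy records using column names from L_spiders.xlsx (all values invented)
obs <- data.frame(
  unit_ID       = c("79_1_2023", "79_1_2023", "79_2_2023", "79_2_2023"),
  Method        = c("PFT", "SN", "PFT", "vac"),
  Genus.species = c("Pardosa.agrestis", "Pardosa.agrestis",
                    "Pardosa.agrestis", "Oedothorax.apicatus"),
  adult_tot     = c(3, 1, 2, 5)
)

# Sum adult counts per site and species
# (assumption: counts are pooled over methods, rounds, and transects)
L_counts <- as.data.frame.matrix(
  xtabs(adult_tot ~ unit_ID + Genus.species, data = obs)
)

print(L_counts)   # rows = sites, columns = species
```

Species that were never recorded at a site receive a zero, which matches the convention stated above for this file.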
File: Q_traits.xlsx
Description: Spider functional traits
This file provides trait information for spider species observed in the study. Each row corresponds to one species and includes multiple ecological and functional trait descriptors used in RLQ and fourth-corner analyses. No missing values are present. This file has been used to constitute the trait-by-spider (Q) matrix.
Variables
- Genus.species: Concatenated genus and species name
- hunting_guild: Categorical variable indicating foraging strategy (e.g., WebBuilding_Sheet, WebBuilding_Orb, ActiveHunters)
- strt_cluster: Vegetation stratum preference, clustered from the original categories as described in the manuscript and in the notes below* (e.g., ground, herb, herb/tree)
- eco_cluster: Habitat affinity cluster, clustered from the original categories as described in the manuscript and in the notes below* (e.g., forest/grass, agri/grass, agri/open, forest/wet)
- Dispersal: Categorical variable for dispersal ability (low, medium, high), reflecting ballooning or vagility potential
- avg_body_len: Mean adult body length in millimeters (mm)
*Notes on the clustering:
Original values for vegetation stratum and habitat:
| Trait | Original coding |
|---|---|
| Vegetation stratum | Vegetation stratum occupied; S = soil / under stones, G = ground-living, H = herb / litter layer, T = trees and bushes / forest |
| Habitat | Preferred habitat (ecological distribution); G = grassland (incl. meadow), F = forest (incl. trees and bushes), A = agriculture, S = bare ground, sand, pebble, rocky, litter, R = ruderal, herbs/shrubs, W = wetlands, O = open areas |
Each spider species may have been recorded in one or several vegetation strata and habitats, which results in a very large number of distinct category combinations. To reduce the number of categories, we dummy-coded the strata and the habitats in two separate matrices (1/0 if attributed to a stratum or habitat, or not), calculated the Jaccard dissimilarity on each matrix, and performed a hierarchical clustering on the result. The number of clusters was validated by checking ecological meaningfulness and by plotting the silhouette widths to find the optimal number of clusters (Kaufman et al., 1990).
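The clustering procedure described above can be sketched as follows. This is a minimal example on an invented dummy-coded stratum matrix; the average linkage, the range of cluster numbers, and all object names are assumptions and not necessarily the settings used for the manuscript.

```r
library(cluster)  # silhouette(); 'cluster' ships with standard R installations

# Toy dummy-coded stratum matrix: 1 = species attributed to that stratum (invented)
strata <- matrix(
  c(1, 1, 0, 0,
    1, 1, 0, 0,
    0, 1, 1, 0,
    0, 0, 1, 1,
    0, 0, 1, 1,
    0, 1, 1, 0),
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("sp", 1:6), c("S", "G", "H", "T"))
)

# For 0/1 data, dist(method = "binary") is the Jaccard dissimilarity
d_jac <- dist(strata, method = "binary")

# Hierarchical clustering (the linkage method is an assumption)
hc <- hclust(d_jac, method = "average")

# Mean silhouette width for a range of candidate cluster numbers
ks <- 2:4
avg_sil <- sapply(ks, function(k) {
  mean(silhouette(cutree(hc, k = k), d_jac)[, "sil_width"])
})
names(avg_sil) <- ks
print(round(avg_sil, 2))

# Cluster membership for the number of clusters with the widest silhouettes
best_k <- ks[which.max(avg_sil)]
print(cutree(hc, k = best_k))
```

The silhouette criterion only suggests a number of clusters; as noted above, the final choice was also checked for ecological meaningfulness.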
File: R_filters.xlsx
Description: Environmental Filters (Renv)
This file contains site-level environmental variables used to characterize management intensity and landscape context for each sampled cereal field. All values are untransformed. For the analysis, all variables of the Renv matrix were z-transformed. Management and crop type themselves have not been considered as filtering variables, but as grouping variables (partial RLQ, see methods). No missing values.
Variables
- unit_ID: Unique identifier for each field site and year (e.g., 79_1_2023)
- Crop: Crop at the site (wheat, barley, OSR)
- TFI_herb_ins: Treatment Frequency Index (TFI), quantifying the amount of herbicide and insecticide applied relative to the amount allowed under federal regulations (scaled index, aggregated across the growing season). Referred to as "TFI" in the manuscript.
- Ndisp: Nitrogen applied through fertilizers (kg/ha), without differentiating between mineral and organic (manure) sources. Referred to as "Nitro" in the manuscript.
- Shannon_veg: Shannon diversity index of plant species recorded per site. Referred to as "VegeDiv" in the manuscript.
- mecha: Number of mechanical interventions conducted on the soil between the harvest of the previous crop and the harvest of the sampled crop (integer count), including rolling, stubble cultivation, false seedbed preparation, and weeding. Ploughing was not counted because every field was ploughed once.
- n_patches: Landscape patchiness, i.e., the number of land-use patches within a 500 m radius.
- expSh: Landscape diversity, i.e., the exponential of the Shannon landscape diversity index (unitless). Referred to as "LandDiv" in the manuscript.
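Because the file stores untransformed values, users who want to reproduce the z-transformation need to apply it themselves. The following is a minimal sketch on invented values, assuming base R scale(), which divides by the sample standard deviation (n - 1 denominator); the object names env and env_z are illustrative.

```r
# Toy environmental table using column names from R_filters.xlsx (values invented)
env <- data.frame(
  TFI_herb_ins = c(0, 1.5, 3.0, 0.5),
  Ndisp        = c(60, 140, 160, 80),
  mecha        = c(4, 2, 1, 3),
  row.names    = c("site_a", "site_b", "site_c", "site_d")
)

# z-transform: subtract the column mean, divide by the column standard deviation
env_z <- as.data.frame(scale(env))

print(round(colMeans(env_z), 10))   # column means are 0 after centring
print(apply(env_z, 2, sd))          # column standard deviations are 1
```

Only numeric columns should be passed to scale(); grouping variables such as Crop would be kept aside.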
File: R_veg.xlsx
Description: Local vegetation characteristics
This file summarizes plant community composition and plant trait characteristics at each field site. It includes community-weighted means (CWMs) of plant functional traits and the relative abundance of dominant plant families. Trait values were extracted from publicly available data in selected European datasets (see Appendix 1 of the manuscript) of the TRY database (Kattge et al., 2020). No missing values are present.
Variables
- unit_ID: Unique identifier for each field site and year (e.g., 79_1_2023)
- SLA: Community-weighted mean of specific leaf area (mm²/mg)
- leafN: Community-weighted mean of leaf nitrogen content (% dry mass)
- height: Community-weighted mean plant height (m)
- [Various plant family columns]: Number of individual records (or % cover converted to an ordinal scale according to Van der Maarel, 1979) per plant family (e.g., Apiaceae, Asteraceae, Fabaceae, Poaceae, etc.), representing relative abundance across the site.
- RaoQ: Rao's index of plant functional diversity (unitless)
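For orientation, the two kinds of summary stored in this file can be illustrated in base R. This is a sketch on invented cover and trait values and is not the dbFD() implementation used for the dataset; in particular, the form of Rao's index shown here (sum over all species pairs of d_ij * p_i * p_j on standardized traits) is one of several conventions, so absolute values may differ from the RaoQ column.

```r
# Toy percentage cover (sites x plant species) and trait table (values invented)
cover <- matrix(c(60, 30, 10,
                  20, 20, 60),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("site_a", "site_b"), c("pl1", "pl2", "pl3")))
traits <- matrix(c(0.8, 20,
                   0.4, 25,
                   0.2, 30),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("pl1", "pl2", "pl3"), c("height", "SLA")))

# Relative abundance of each plant species within a site
p <- prop.table(cover, margin = 1)

# CWM: abundance-weighted mean trait value per site
cwm <- p %*% traits
print(cwm)

# Rao's quadratic entropy as sum_i sum_j d_ij * p_i * p_j, with Euclidean
# distances between species on standardized traits (assumed convention)
d <- as.matrix(dist(scale(traits)))
rao_q <- rowSums((p %*% d) * p)
print(rao_q)
```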
Code/software
The data analyses were conducted using R (version 4.2.2), an open-source statistical computing environment. The following R packages were used to process, analyze, and visualize the data:
- FD: for functional diversity metrics and community-weighted means (dbFD())
- ade4: for correspondence analysis (dudi.coa()), PCA with mixed data (dudi.hillsmith()), and RLQ and fourth-corner analyses
- adegraphics: for multivariate visualization using ADEgS()
- factoextra: to assist with interpreting PCA outputs
- vegan: for ordination tools and diversity metrics
Workflow Summary and Useful Code Snippets
Step 1: Detect sampling effects
Correspondence Analysis (CA) was applied to the species table L to detect potential sampling artefacts (e.g., crop identity effects):
CA_L <- dudi.coa(dat$spe, scannf = FALSE)
s.class(CA_L$li, dat$fac.crop, ellipseSize = 0, chullSize = 1)
Step 2: Partial out crop effects from the environmental table
Environmental data (R) was first analyzed using a Hill-Smith PCA, then adjusted via within-class analysis:
env_pca <- dudi.hillsmith(dat$env[, -1], row.w = CA_L$lw, scannf = FALSE)
part_pca_crop <- wca(env_pca, dat$fac.crop2, nf = 5, scannf = FALSE)
Step 3: RLQ analysis
RLQ was run using the crop-adjusted R, the CA of L, and the Hill-Smith PCA of Q:
PCA_Q <- dudi.hillsmith(dat$traits, row.w = CA_L$cw, scannf = FALSE)
pca_wit <- dudi.pca(part_pca_crop$tab, row.w = CA_L$lw, scannf = FALSE, center = FALSE, scale = FALSE)
partial_RLQ <- rlq(pca_wit, CA_L, PCA_Q, scannf = FALSE)
summary(partial_RLQ)
Step 4: Significance testing
Global test of RLQ structure:
test_rlq <- randtest(partial_RLQ, 4999)
Step 5: Fourth-corner analysis
Links between traits and environmental variables were assessed with permutation tests:
four.comb.env.adj <- fourthcorner(dat$env, dat$spe, dat$traits, modeltype = 6, p.adjust.method.G = "fdr", p.adjust.method.D = "fdr", nrepet = 999)
A second analysis was performed using vegetation filters:
four_C_veg_adj <- fourthcorner(veg_filters, dat$spe, dat$traits, modeltype = 6, p.adjust.method.G = "none", p.adjust.method.D = "fdr", nrepet = 49999)
Access information
Spider trait data were derived from the following sources: most information stems from an unpublished, private dataset by Gilles Blandenier (expert communication, "pers. comm."), with data on dispersal abilities from Bell et al. (2005) and Blandenier (2009). We integrated data from araneae.nmbe.ch (Nentwig et al., 2024) and open-source data from the World Spider Catalogue (Naturhistorisches Museum Bern, 2024).
To calculate community-weighted mean plant characteristics per site, we used data extracted from 24 European and global datasets, with some additions from FloraVeg.EU (Chytrý et al., 2024). Specifically, we used data from TRY Data Request 30973 (only public data were requested):
Datasets: Leaf and Whole Plant Traits Database, The LEDA Traitbase, Categorical Plant Traits Database, BiolFlor Database, Reich-Oleksyn Global Leaf N, P Database, BIOPOP: Functional Traits for Nature Conservation, European Mountain Meadows Plant Traits Database, PLANTSdata USDA, GLOPNET - Global Plant Trait Network Database, Global Respiration Database, Italian Alps Plant Traits Database, Flora d’ Italia Functional Traits Hoard (FIFTH), Xylem Functional Traits (XFT) Database, The Xylem/Phloem Database, Functional traits explaining variation in plant life history strategies, French Alps Trait Data, Leaf Traits and Seed Mass of Cover Crops, The Global Leaf Traits, FRED - Fine Root Ecology Database, TRY Categorical Traits Dataset (update 2018), BROT 2.0, SwissNationalPark_Engadine, Competition, and Jena Experiment Traits.
Trait List (ID, original trait name): Leaf nitrogen (N) content per leaf dry mass [14], Leaf compoundness [17], Plant woodiness [38], Leaf type [43], Fruit type [99], Shoot branching type; shoot branching architecture [140], Flower color [207], Plant height vegetative [3106], Leaf area per leaf dry mass (specific leaf area, SLA or 1/LMA): undefined if petiole is in- or excluded [3117]
Species List: 909, 1593, 1869, 1905, 2065, 2172, 2496, 2869, 2872, 3104, 3724, 3944, 4036, 4043, 4185, 4341, 4489, 4614, 5134, 6342, 7173, 7314, 7725, 8255, 8440, 8453, 9866, 10226, 10295, 10304, 10666, 11567, 11583, 11781, 11783, 12159, 13016, 14244, 14512, 216684, 15171, 15174, 16700, 17041, 19822, 20359, 20893, 20927, 21627, 23444, 23874, 23920, 24604, 25473, 25483, 25487, 25526, 26052, 26069, 26084, 26336, 28128, 28295, 29305, 29545, 30052, 31479, 31809, 32132, 32245, 32357, 32499, 32728, 34016, 34017, 35262, 35272, 35274, 35671, 37495, 38977, 39564, 40033, 40195, 40307, 40533, 41191, 41193, 41305, 41399, 41549, 42016, 42541, 42544, 42805, 42870, 42893, 61760, 43221, 43716, 44304, 44335, 45587, 45737, 45748, 45914, 46070, 46948, 47451, 47498, 48156, 50239, 50257, 50304, 50348, 50913, 50914, 50929, 51627, 51634, 53018, 99439, 54268, 54918, 54945, 54956, 54957, 54991, 54998, 54999, 55128, 55224, 55598, 55801, 55804, 55994, 56010, 56236, 56252, 56268, 56277, 56356, 56395, 56396, 56477, 56491, 574, 48, 24558, 10391, 48135
Spiders were sampled in 22 cereal fields (wheat, barley, oilseed rape) in the Swiss lowlands as part of the PestiRed project. Each field pair consisted of one conventionally and one agroecologically managed site, matched by crop type. Sampling took place during the 2023 growing season using three complementary methods—pitfall trapping, sweep netting, and vacuum suctioning—to cover different vegetation strata. Only adult spiders were retained and identified to species level. A total of 55 species occurring in at least two sites were included in the analysis.
For the analysis, the spider species abundances across sites have been Hellinger-transformed (L matrix). The Q matrix includes five functional traits: hunting guild, dispersal ability, vegetation stratum, habitat type, and body size. Trait data were compiled from expert sources and online databases, then clustered to reduce dimensionality.
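The species filter and the Hellinger transformation mentioned above can be sketched in base R. This is a minimal example on invented counts; the object names and the order of the two steps (filtering before transforming) are assumptions.

```r
# Toy site-by-species counts (values invented)
L_counts <- matrix(c(4, 0, 1,
                     2, 5, 0,
                     0, 3, 0),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(c("site_a", "site_b", "site_c"),
                                   c("sp1", "sp2", "sp3")))

# Keep species that occur in at least two sites
keep <- colSums(L_counts > 0) >= 2
L_kept <- L_counts[, keep, drop = FALSE]

# Hellinger transformation: square root of the row-wise relative abundance
L_hel <- sqrt(L_kept / rowSums(L_kept))

print(L_hel)
print(rowSums(L_hel^2))   # squared values sum to 1 within each site
```

The same transformation is available as decostand(x, method = "hellinger") in the vegan package listed under Code/software.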
The Renv matrix consists of seven z-transformed environmental variables reflecting field-level management (e.g., pesticide use, mechanical interventions), nitrogen input, vegetation diversity, and landscape structure (e.g., diversity and patchiness of land use within 500 m). The Rveg matrix represents local vegetation characteristics, including community-weighted means of specific leaf area, plant height, and leaf nitrogen content, dominant plant family per site, and functional vegetation diversity (RaoQ), all z-transformed.
These matrices were used in RLQ and fourth-corner analyses to explore trait–environment relationships.
