Data from: Uniqueness of tree stand composition and soil microbial communities are related across urban spruce-dominated forests
Data files
Dec 14, 2023 version files 3.41 MB
Korhonen_et_al_2023_OTU_metadata_Bacteria.csv (222.12 KB)
Korhonen_et_al_2023_OTU_metadata_Fungi.csv (2.24 MB)
Korhonen_et_al_2023_OTU_table_Bacteria.csv (116.11 KB)
Korhonen_et_al_2023_OTU_table_Fungi.csv (789.27 KB)
Korhonen_et_al_2023_Site_metadata.txt (29.93 KB)
README.md (10.34 KB)
Jul 25, 2024 version files 3.41 MB
Abstract
The dataset contains data obtained from urban spruce-dominated forests in southern Finland where we have measured tree stand composition, forest management history, soil chemical properties, and soil microbial communities. Data files include (1) microbial OTU tables describing microbial community composition (sequence read counts of Operational Taxonomic Units) across the study plots, (2) taxonomic assignments and other metadata related to OTUs, and (3) measured and calculated variables describing the characteristics of sites and their microbial assemblages (site metadata).
README: Data from: Uniqueness of tree stand composition and soil microbial communities are related across urban spruce-dominated forests
https://doi.org/10.5061/dryad.h44j0zps5
Data originate from urban spruce-dominated forests in southern Finland where we have measured tree stand composition, forest management history, soil chemical properties and soil microbial communities. Data files include (1) microbial OTU tables describing microbial community composition (sequence read counts of Operational Taxonomic Units) across the study plots, (2) taxonomic assignments and other metadata related to OTUs, and (3) measured and calculated variables describing the characteristics of sites and their microbial assemblages (site metadata).
Description of the data and file structure
DATA & FILE OVERVIEW
DATA-SPECIFIC INFORMATION FOR: 'Korhonen et al 2023 OTU_table_Bacteria.csv'
1. Number of variables/columns: 491 data columns + 1 header column (site names)
2. Number of cases/rows: 73 data rows + 1 header row (OTU names)
3. Column separator: ;
DATA-SPECIFIC INFORMATION FOR: 'Korhonen et al 2023 OTU_table_Fungi.csv'
1. Number of variables/columns: 4157 data columns + 1 header column (site names)
2. Number of cases/rows: 73 data rows + 1 header row (OTU names)
3. Column separator: ;
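The layout described above (sites as rows, OTUs as columns, `;` as separator) can be read with the Python standard library alone. The following is a minimal sketch, not part of the original dataset documentation: the function name `read_otu_table`, the miniature table, and the assumption that every data cell is an integer read count are illustrative.

```python
import csv
import io


def read_otu_table(handle):
    """Parse a ';'-separated OTU table into {site: {otu: read_count}}.

    Assumes the first column holds site names, the header row holds
    OTU names, and every other cell is an integer read count.
    """
    reader = csv.reader(handle, delimiter=";")
    header = next(reader)
    otu_names = header[1:]  # first header cell labels the site column
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        site, counts = row[0], row[1:]
        table[site] = {otu: int(n) for otu, n in zip(otu_names, counts)}
    return table


# Hypothetical miniature table; real site and OTU names will differ.
sample = io.StringIO(
    "Site;OTU_1;OTU_2;OTU_3\n"
    "SiteA;120;0;35\n"
    "SiteB;8;410;0\n"
)
otu_table = read_otu_table(sample)
print(otu_table["SiteA"]["OTU_3"])  # 35
```

For the real files, pass an open file object instead of the `io.StringIO` sample, for example `open(path, newline="")`.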
DATA-SPECIFIC INFORMATION FOR: 'Korhonen et al 2023 OTU_metadata_Bacteria.csv'
1. Number of variables/columns: 17
2. Number of cases/rows: 491 data rows + 1 header row
3. Column separator: ;
4. Column explanations
Col 1: OTU_ID Unique identifiers of bacterial OTUs
Col 2: OTU_ID_original Unique identifiers of bacterial OTUs generated by the bioinformatic pipeline
Col 3: RepSeq Representative sequences of bacterial OTUs
Col 4: D Taxonomic assignment to Domain
Col 5: D% Support for taxonomic assignment to Domain (values 80-100, NA when <80)
Col 6: P Taxonomic assignment to Phylum
Col 7: P% Support for taxonomic assignment to Phylum (values 80-100, NA when <80)
Col 8: C Taxonomic assignment to Class
Col 9: C% Support for taxonomic assignment to Class (values 80-100, NA when <80)
Col 10: O Taxonomic assignment to Order
Col 11: O% Support for taxonomic assignment to Order (values 80-100, NA when <80)
Col 12: F Taxonomic assignment to Family
Col 13: F% Support for taxonomic assignment to Family (values 80-100, NA when <80)
Col 14: G Taxonomic assignment to Genus
Col 15: G% Support for taxonomic assignment to Genus (values 80-100, NA when <80)
Col 16: S Taxonomic assignment to Species
Col 17: S% Support for taxonomic assignment to Species (values 80-100, NA when <80)
DATA-SPECIFIC INFORMATION FOR: 'Korhonen et al 2023 OTU_metadata_Fungi.csv'
1. Number of variables/columns: 20
2. Number of cases/rows: 4157 data rows + 1 header row
3. Column separator: ;
4. Column explanations
Col 1: OTU_ID Unique identifiers of fungal OTUs
Col 2: OTU_ID_original Unique identifiers of fungal OTUs generated by the bioinformatic pipeline
Col 3: RepSeq Representative sequences of fungal OTUs
Col 4: k Taxonomic assignment to Kingdom
Col 5: k% Support for taxonomic assignment to Kingdom (values 80-100, NA when <80)
Col 6: p Taxonomic assignment to Phylum
Col 7: p% Support for taxonomic assignment to Phylum (values 80-100, NA when <80)
Col 8: c Taxonomic assignment to Class
Col 9: c% Support for taxonomic assignment to Class (values 80-100, NA when <80)
Col 10: o Taxonomic assignment to Order
Col 11: o% Support for taxonomic assignment to Order (values 80-100, NA when <80)
Col 12: f Taxonomic assignment to Family
Col 13: f% Support for taxonomic assignment to Family (values 80-100, NA when <80)
Col 14: g Taxonomic assignment to Genus
Col 15: g% Support for taxonomic assignment to Genus (values 80-100, NA when <80)
Col 16: s Taxonomic assignment to Species
Col 17: s% Support for taxonomic assignment to Species (values 80-100, NA when <80)
Col 18: Comment Lead author's personal comments on taxonomic assignment
Col 19: LifeStyle Primary lifestyle based on genus-level taxonomic assignment from FungalTraits database
Col 20: FinnishRedList2019 Species' Red List category (NT, DD, NE) according to the 2019 Red List of Finnish species. Value is NA if category is LC or undefined.
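Because each rank has a paired support column that is NA when support falls below 80, a common first step is to find the deepest rank that was confidently assigned for each OTU. The sketch below illustrates this for the fungal column names (`k` ... `s`); for the bacterial file the rank list would be `D, P, C, O, F, G, S`. The function name, the example row, and the rule that the search stops at the first unsupported rank are assumptions made for illustration.

```python
FUNGAL_RANKS = ["k", "p", "c", "o", "f", "g", "s"]


def deepest_supported_rank(row, ranks=FUNGAL_RANKS):
    """Return (rank, taxon) for the lowest rank whose support is not NA.

    `row` maps column names to strings, as produced by csv.DictReader.
    Assumption: once one rank is unsupported, deeper ranks are too.
    """
    best = None
    for rank in ranks:
        support = row.get(rank + "%", "NA")
        if support in ("NA", ""):
            break
        best = (rank, row[rank])
    return best


# Hypothetical metadata row; values are illustrative only.
example_row = {
    "OTU_ID": "example_OTU",
    "k": "Fungi", "k%": "100",
    "p": "Basidiomycota", "p%": "100",
    "c": "Agaricomycetes", "c%": "98",
    "o": "Russulales", "o%": "95",
    "f": "Russulaceae", "f%": "91",
    "g": "Russula", "g%": "85",
    "s": "NA", "s%": "NA",
}
print(deepest_supported_rank(example_row))  # ('g', 'Russula')
```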
DATA-SPECIFIC INFORMATION FOR: 'Korhonen et al 2023 Site_metadata.txt'
1. Number of variables/columns: 63
2. Number of cases/rows: 73 data rows + 1 header row
3. Column separator: ;
4. Decimal separator for numeric variables: ,
5. Column explanations
Col 1: Site Unique identifier of the sampling site, same as site names in OTU tables
Col 2: City City of the sampling site (Helsinki, Espoo, Lahti, Tampere or Vantaa)
Col 3: Region Region of the sampling site (Helsinki region, Lahti or Tampere)
Col 4: Latitude Y coordinate (WGS 84)
Col 5: Longitude X coordinate (WGS 84)
Col 6: Local_dbMEM1 Distance-based Moran’s eigenvector calculated from geographic coordinates of study plots accounting for spatial autocorrelations within regions (< 50 km distances)
Col 7: Date Day when soil sample was collected
Col 8: BasA_CutStumps_DecayClass_1 Basal area of decay class 1 cut stumps with diameter >= 10 cm within the study plot (m²/ha)
Col 9: BasA_CutStumps_DecayClass_2 Basal area of decay class 2 cut stumps with diameter >= 10 cm within the study plot (m²/ha)
Col 10: BasA_CutStumps_DecayClass_3 Basal area of decay class 3 cut stumps with diameter >= 10 cm within the study plot (m²/ha)
Col 11: BasA_CutStumps_DecayClass_4 Basal area of decay class 4 cut stumps with diameter >= 10 cm within the study plot (m²/ha)
Col 12: BasA_CutStumps_DecayClass_5 Basal area of decay class 5 cut stumps with diameter >= 10 cm within the study plot (m²/ha)
Col 13: Logging_recent Principal component corresponding to recent logging intensity (least decayed cut stumps)
Col 14: Logging_old Principal component corresponding to old logging intensity (most decayed cut stumps)
Col 15: Stem_density Number of tree stems with DBH >= 5 cm within the study plot (number/ha)
Col 16: Vascular_plant_cover Estimated proportion of the sample plot field layer that was covered by dwarf shrubs or herbaceous vascular plants
Col 17: BasA_Picea Basal area of Picea trees with DBH >= 5 cm within the study plot (m²/ha)
Col 18: BasA_Pinus Basal area of Pinus trees with DBH >= 5 cm within the study plot (m²/ha)
Col 19: BasA_Betula Basal area of Betula trees with DBH >= 5 cm within the study plot (m²/ha)
Col 20: BasA_Populus Basal area of Populus trees with DBH >= 5 cm within the study plot (m²/ha)
Col 21: BasA_Sorbus Basal area of Sorbus trees with DBH >= 5 cm within the study plot (m²/ha)
Col 22: BasA_Alnus Basal area of Alnus trees with DBH >= 5 cm within the study plot (m²/ha)
Col 23: BasA_Salix Basal area of Salix trees with DBH >= 5 cm within the study plot (m²/ha)
Col 24: BasA_Quercus Basal area of Quercus trees with DBH >= 5 cm within the study plot (m²/ha)
Col 25: BasA_Acer Basal area of Acer trees with DBH >= 5 cm within the study plot (m²/ha)
Col 26: BasA_Corylus Basal area of Corylus stems with DBH >= 5 cm within the study plot (m²/ha)
Col 27: BasA_Tilia Basal area of Tilia trees with DBH >= 5 cm within the study plot (m²/ha)
Col 28: BasA_Prunus Basal area of Prunus trees with DBH >= 5 cm within the study plot (m²/ha)
Col 29: BasA_Abies Basal area of Abies trees with DBH >= 5 cm within the study plot (m²/ha)
Col 30: ECM_tree_richness Number of ectomycorrhizal tree genera present in the sample plot (Abies, Alnus, Betula, Corylus, Picea, Pinus, Populus, Quercus, Salix, Tilia)
Col 31: TreeDiv Shannon diversity of tree species calculated from relative abundances in terms of basal area
Col 32: LCBD_Trees Relative uniqueness of a site in terms of tree species composition
Col 33: BuiltLand_200m Proportion of built land surface within 200 m radius around sample plot
Col 34: CN_ratio Mass ratio of carbon and nitrogen in soil sample
Col 35: pH Soil acidity
Col 36: Al Aluminum content in soil samples (mg/kg dry weight)
Col 37: B Boron content in soil samples (mg/kg dry weight)
Col 38: C Carbon content in soil samples (proportion [0-1] of dry weight)
Col 39: Ca Calcium content in soil samples (mg/kg dry weight)
Col 40: Cr Chromium content in soil samples (mg/kg dry weight)
Col 41: Cu Copper content in soil samples (mg/kg dry weight)
Col 42: Fe Iron content in soil samples (mg/kg dry weight)
Col 43: K Potassium content in soil samples (mg/kg dry weight)
Col 44: Mg Magnesium content in soil samples (mg/kg dry weight)
Col 45: Mn Manganese content in soil samples (mg/kg dry weight)
Col 46: Na Sodium content in soil samples (mg/kg dry weight)
Col 47: Ni Nickel content in soil samples (mg/kg dry weight)
Col 48: P Phosphorus content in soil samples (mg/kg dry weight)
Col 49: Pb Lead content in soil samples (mg/kg dry weight)
Col 50: S Sulphur content in soil samples (mg/kg dry weight)
Col 51: Zn Zinc content in soil samples (mg/kg dry weight)
Col 52: Ergosterol Indicator of fungal biomass in soil samples (µg / g organic content)
Col 53: Proteobacteria:Acidobacteria Ratio of relative read abundance of bacterial phyla Proteobacteria (i.e., Pseudomonadota) and Acidobacteriota
Col 54: ECM_biomass Relative read abundance of ectomycorrhizal fungi * ergosterol (µg / g organic content)
Col 55: SAP_biomass Relative read abundance of saprotrophic fungi * ergosterol (µg / g organic content)
Col 56: Bacterial_diversity Rarefied estimate of bacterial Shannon Diversity
Col 57: ECM_diversity Rarefied estimate of ectomycorrhizal fungal Shannon Diversity
Col 58: SAP_diversity Rarefied estimate of saprotrophic fungal Shannon Diversity
Col 59: LCBDrepl_BAC Local Contribution to the replacement component of Beta Diversity for bacteria, i.e., a measure of a site’s relative importance in bacterial OTU turnover across sites
Col 60: LCBDrepl_ECM Local Contribution to the replacement component of Beta Diversity for ectomycorrhizal fungi, i.e., a measure of a site’s relative importance in ectomycorrhizal fungal species turnover across sites
Col 61: LCBDrepl_SAP Local Contribution to the replacement component of Beta Diversity for saprotrophic fungi, i.e., a measure of a site’s relative importance in saprotrophic fungal species turnover across sites
Col 62: RareECM Number of rare (present in less than 10% of sites with at least 1‰ relative abundance) ectomycorrhizal fungal species
Col 63: RareSAP Number of rare (present in less than 10% of sites with at least 1‰ relative abundance) saprotrophic fungal species
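The site metadata file uses `;` between columns and `,` as the decimal separator, so numeric cells cannot be passed to `float()` directly. The sketch below shows one way to handle this with the standard library. The helper names, the sample values, and the treatment of `NA` or empty cells as missing are assumptions; the README does not state how missing values are coded in this file.

```python
import csv
import io


def parse_decimal_comma(text):
    """Convert a decimal-comma string such as '4,25' to a float.

    Returns None for empty cells or 'NA' (an assumed missing-value code).
    """
    text = text.strip()
    if text in ("", "NA"):
        return None
    return float(text.replace(",", "."))


def read_site_metadata(handle, numeric_columns):
    """Read ';'-separated rows and convert the listed columns to floats."""
    rows = []
    for row in csv.DictReader(handle, delimiter=";"):
        for column in numeric_columns:
            row[column] = parse_decimal_comma(row[column])
        rows.append(row)
    return rows


# Hypothetical excerpt with three of the 63 columns; values are made up.
sample = io.StringIO(
    "Site;City;pH;CN_ratio\n"
    "SiteA;Lahti;4,25;27,8\n"
    "SiteB;Espoo;NA;31,0\n"
)
sites = read_site_metadata(sample, ["pH", "CN_ratio"])
print(sites[0]["pH"], sites[1]["pH"])  # 4.25 None
```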
Methods
The method descriptions below are summarized from the original publication.
Study area and sample plots
We collected data from 73 forest plots distributed in three urban centers in southern Finland: Helsinki region (45 sites in cities of Helsinki, Espoo and Vantaa), Lahti (8 sites), and Tampere (20 sites). Sample plots consisted of four interconnected 20 m × 20 m squares that were placed inside forest stands.
Measurement of forest stand characteristics
All living trees with a diameter ≥ 5 cm at breast height (1.3 m) and all cut tree stumps with a diameter ≥ 10 cm at the cut surface were measured within the study plots. We assessed the degree of urbanization around sample plots by calculating the proportion of built land surface within a 200 m radius around the sample plot center based on the Corine Land Cover 2018 GIS dataset.
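The basal area columns in the site metadata are expressed per hectare, while the measurements were taken on plots of four 20 m × 20 m squares (0.16 ha). The sketch below shows the standard conversion from stem diameters to basal area per hectare. It is an illustration of that arithmetic, assuming circular stem cross-sections; the function name and the example diameters are hypothetical and the authors' exact procedure is not given in this summary.

```python
import math

# Four 20 m x 20 m squares, converted from square metres to hectares.
PLOT_AREA_HA = 4 * 20 * 20 / 10_000  # 0.16 ha


def basal_area_per_ha(diameters_cm, min_diameter_cm=5.0,
                      plot_area_ha=PLOT_AREA_HA):
    """Sum stem cross-sectional areas (m2) and scale to one hectare.

    Stems below `min_diameter_cm` are ignored. A diameter in cm divided
    by 200 gives the radius in metres.
    """
    total_m2 = sum(
        math.pi * (d / 200.0) ** 2
        for d in diameters_cm
        if d >= min_diameter_cm
    )
    return total_m2 / plot_area_ha


# Hypothetical diameters (cm) for one genus in one plot.
print(round(basal_area_per_ha([20.0, 4.0]), 4))  # 0.1963
```

The 4 cm stem falls below the 5 cm threshold and does not contribute. For cut stumps, the same function applies with `min_diameter_cm=10.0`.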
Soil sampling
Soil sampling was done between May 19th and July 19th 2022. Samples were collected from 16 regularly spaced grid points across each 0.16 ha study plot. The minimum distance between two sampling points was 10 m. At each point, we collected ca. 0.5 dl of material from the humus layer between the undecomposed litter layer and mineral soil. After removing the undecomposed surface litter and roots, material was extracted with a DNA-sterilized hand shovel down to the boundary between the humus layer and mineral soil, to a maximum depth of 15 cm. Samples were sieved through a 2 mm mesh and stored at -20 °C until further processing.
Soil chemical analyses
We measured the carbon (C) and nitrogen (N) content of the soil samples from combustion products with an elemental analyzer. We measured ergosterol as a biomarker of fungal biomass. Ergosterol was extracted from 0.25 g soil samples and quantified with liquid chromatography.
DNA extraction and sequencing
DNA was extracted from two ca. 200 mg portions of material from each sample. For bacterial metabarcoding, the V4 region of the 16S ribosomal RNA gene was amplified with primers 515F (GTGCCAGCMGCCGCGGTAA) and 806R (GGACTACHVGGGTWTCTAAT). For fungal metabarcoding, the Internal Transcribed Spacer 2 (ITS2) of the nuclear ribosomal RNA coding region was amplified with primers ITS3-2024F (GCATCGATGAAGAACGCAGC) and ITS4-2409R (TCCTCCGCTTATTGATATGC). Indexed amplicons were sequenced on an Illumina NovaSeq 6000 (paired-end 250 bp).
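The bacterial primers contain IUPAC degenerate bases (M, H, V, W), so each primer stands for several concrete sequences. The sketch below expands a primer into a regular expression and strips it from the start of a read. It only illustrates how the degenerate codes work; the function names and the example read are hypothetical, and the tool actually used for primer removal is not named in this summary.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "[AC]", "R": "[AG]", "W": "[AT]", "S": "[CG]",
    "Y": "[CT]", "K": "[GT]", "V": "[ACG]", "H": "[ACT]",
    "D": "[AGT]", "B": "[CGT]", "N": "[ACGT]",
}

PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"
PRIMER_806R = "GGACTACHVGGGTWTCTAAT"


def primer_to_regex(primer):
    """Compile a primer with degenerate bases into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer))


def strip_forward_primer(read, primer):
    """Return the read without its leading primer, or None if absent."""
    match = primer_to_regex(primer).match(read)
    return read[match.end():] if match else None


# Hypothetical read: 515F with M resolved to A, followed by four bases.
print(strip_forward_primer("GTGCCAGCAGCCGCGGTAATACG", PRIMER_515F))  # TACG
```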
Bioinformatics
Sequence reads were demultiplexed, and index and primer sequences were removed from paired-end reads (data available under BioProject ID PRJNA1012880 in the NCBI Sequence Read Archive). Individual R1 and R2 reads were first quality filtered and then assembled. Assembled reads were chimera filtered de novo. Remaining reads were clustered into OTUs with a 98% similarity threshold. OTUs were taxonomically assigned with an 80% confidence cutoff using a naïve Bayesian classifier trained with the bacterial SILVA (v138) and fungal UNITE (v9, dynamic, all eukaryotes) databases in mothur v.1.36.1. Fungal OTUs were further assigned to functional groups according to the FungalTraits database based on genus-level taxonomic assignments. OTU tables were finally filtered by discarding bacterial OTUs that had less than 1‰ relative abundance and fungal OTUs that had less than 0.1‰ relative abundance in all samples, as well as OTUs that had their maximum read counts in negative controls.
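The final filtering step can be sketched as follows: an OTU is kept only if it reaches the relative-abundance threshold in at least one sample and its highest read count is found in a real sample, not in a negative control. This is an interpretation of the text above, not the authors' code. The function name, the data layout, the toy counts, and the handling of ties (an OTU whose control maximum equals its sample maximum is discarded) are assumptions.

```python
def filter_otus(samples, negative_controls, min_rel_abundance):
    """Return the set of OTU names that pass both filters.

    samples, negative_controls: {name: {otu: read_count}}
    min_rel_abundance: e.g. 0.001 for 1 per mille, 0.0001 for 0.1 per mille
    """
    totals = {name: sum(counts.values()) for name, counts in samples.items()}
    otus = set()
    for counts in samples.values():
        otus.update(counts)

    kept = set()
    for otu in otus:
        max_rel = 0.0
        max_sample_reads = 0
        for name, counts in samples.items():
            reads = counts.get(otu, 0)
            if totals[name] > 0:
                max_rel = max(max_rel, reads / totals[name])
            max_sample_reads = max(max_sample_reads, reads)
        max_control_reads = max(
            (c.get(otu, 0) for c in negative_controls.values()), default=0
        )
        # Keep only OTUs above threshold somewhere and not peaking in controls.
        if max_rel >= min_rel_abundance and max_sample_reads > max_control_reads:
            kept.add(otu)
    return kept


# Toy data: OTU_2 peaks in a blank, OTU_3 never reaches 1 per mille.
samples = {
    "SiteA": {"OTU_1": 9000, "OTU_2": 995, "OTU_3": 5},
    "SiteB": {"OTU_1": 4000, "OTU_2": 5996, "OTU_3": 4},
}
negative_controls = {"Blank1": {"OTU_2": 7000}}
kept = filter_otus(samples, negative_controls, min_rel_abundance=0.001)
print(sorted(kept))  # ['OTU_1']
```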
Data Preparation
We calculated diversity indices for all bacteria, ectomycorrhizal (ECM) fungi and saprotrophic (SAP) fungi. We considered SAP fungi to include soil, litter, wood and unspecified saprotrophs. We estimated Shannon diversity at a sample read depth of 30,000 for bacteria, 60,000 for ECM fungi, and 55,000 for SAP fungi using the estimateD function in the R package iNEXT (v3.0.9).
To quantify the relative contributions of sites to taxonomic turnover of bacteria, ECM fungi and SAP fungi, we calculated local contributions to species replacement (LCBDrepl) based on distance matrices representing the replacement component of beta diversity using the beta.div.comp and LCBD.comp functions in the R package adespatial (v0.3-21). We calculated LCBDrepl for bacteria based on the quantitative indices of replacement (Sørensen, Podani family) and LCBDrepl for fungi based on presence-absence (Jaccard, Podani family). We counted OTUs with relative abundance >0.1‰ as present. To avoid conflating fungal taxa with (potentially artefactual) intraspecific sequence variants and other sequence reads with uncertain taxonomic affinity, we restricted the analysis to fungal OTUs that were identified to species level and pooled together OTUs that were assigned to the same species before determining presence-absence.
To quantify management intensity in forest sites, we reduced the logging history data (basal areas of cut stumps in five different decay classes) into two main gradients by running a principal component analysis on square-root-transformed basal areas of cut stumps and extracting site scores for the first two principal components.
To quantify forest tree diversity at each site, we calculated the number of ECM tree species (ECM tree richness), Shannon diversity of tree species (TreeDiv) based on relative basal areas, and an index of relative uniqueness of tree stand composition (LCBD_Trees).
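As an illustration of the tree diversity index, the sketch below computes Shannon diversity from the basal areas of the tree taxa at one site. It assumes the natural logarithm and the plain (non-rarefied) index, neither of which is stated explicitly in this summary; it does not reproduce the rarefied iNEXT estimates used for the microbial groups. The function name and the example values are hypothetical.

```python
import math


def shannon_diversity(abundances):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero abundance."""
    total = sum(abundances)
    if total <= 0:
        return 0.0
    h = 0.0
    for a in abundances:
        if a > 0:
            p = a / total  # relative abundance, here relative basal area
            h -= p * math.log(p)
    return h


# Hypothetical basal areas (m2/ha) for Picea, Pinus and Betula at one site.
print(round(shannon_diversity([20.0, 5.0, 5.0]), 4))  # 0.8676
```

A site with a single tree taxon gets a value of 0, and a site with two equally abundant taxa gets ln 2, about 0.693.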
LCBD_Trees was calculated from a distance matrix (Sørensen, Podani family) derived from tree composition data (square-root-transformed basal areas of tree species).
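The calculation described above can be sketched in two steps: build a Sørensen-type (percentage difference) dissimilarity matrix from square-root-transformed basal areas, then derive each site's local contribution to beta diversity (LCBD) as its share of the total sum of squares after Gower centring. This is a pure-Python illustration, not the adespatial implementation. The function names and the toy data are hypothetical, and the use of squared dissimilarities in the centring step is an assumption, since the summary does not say whether the distances were square-rooted first.

```python
import math


def percentage_difference(x, y):
    """Sørensen-type dissimilarity for abundances: (B + C) / (2A + B + C)."""
    a = sum(min(xi, yi) for xi, yi in zip(x, y))  # shared abundance
    b = sum(x) - a  # abundance unique to the first site
    c = sum(y) - a  # abundance unique to the second site
    denom = 2 * a + b + c
    return (b + c) / denom if denom > 0 else 0.0


def lcbd_from_distances(dist):
    """LCBD per site: diagonal of the Gower-centred matrix over its trace."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_means = [sum(row) / n for row in a]
    grand_mean = sum(row_means) / n
    # For a symmetric matrix, column means equal row means.
    diag = [a[i][i] - 2 * row_means[i] + grand_mean for i in range(n)]
    total = sum(diag)
    return [d / total for d in diag]


# Toy basal areas (m2/ha) for two tree taxa at three sites.
basal_areas = [[25.0, 0.0], [25.0, 0.0], [0.0, 16.0]]
transformed = [[math.sqrt(v) for v in site] for site in basal_areas]
dist = [[percentage_difference(x, y) for y in transformed] for x in transformed]
lcbd = lcbd_from_distances(dist)
print([round(v, 3) for v in lcbd])  # [0.167, 0.167, 0.667]
```

The third site, the only one with a different composition, receives the largest share of the total, which matches the interpretation of LCBD as relative uniqueness.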