Data from: Co-Mast: Harmonized seed production data for woody plants across U.S. long term research sites
Data files
Sep 13, 2024 version files (24.14 MB total)
- attribute-citations.csv (40.33 KB)
- individual_seed_production.csv (23.32 MB)
- phylogeny.tr (4.24 KB)
- plot_locations.csv (5.55 KB)
- plot_summarized_seed_data.csv (678.59 KB)
- README.md (70.65 KB)
- species_attributes.csv (15.48 KB)
Abstract
Plants display a range of temporal patterns of inter-annual reproduction, from relatively constant seed production to ‘mast seeding’, the synchronized and highly variable interannual seed production of plants within a population. Previous efforts have compiled global records of seed production in long-lived plants to gain insight into seed production, forest and animal population dynamics, and the effects of global change on masting. Existing datasets focus on seed production dynamics at the population scale, but are limited in their ability to examine community-level mast seeding dynamics across different plant species at the continental scale. We harmonized decades of plant reproduction data for 141 woody plant species across nine Long-Term Ecological Research (LTER) or long-term ecological monitoring sites from a wide range of habitats across the United States. Plant reproduction data are reported annually between 1957 and 2021 and based on either seed-traps or seed and/or cone counts on individual trees. A wide range of woody plant species including trees, shrubs, and lianas are represented within sites allowing for direct community-level comparisons among species. We share code for filtering of data that enables the comparison of plot and individual tree data across sites. For each species, we compiled relevant life history attributes (e.g., seed mass, dispersal syndrome, seed longevity, sexual system) that may serve as important predictors of mast seeding in future analyses. To aid in phylogenetically-informed analyses, we also share a phylogeny and phylogenetic distance matrix for all species in the dataset. These data can be used to investigate continent-scale ecological properties of seed production, including individual and population variability, synchrony within and across species, and how these properties of seed production vary in relation to plant species traits and environmental conditions. 
In addition, these data can be used to assess how annual variability in seed production is associated with climate conditions and how that varies across populations, species, and regions.
README: Data from: Co-Mast: Harmonized seed production data for woody plants across U.S. long term research sites
https://doi.org/10.5061/dryad.69p8cz98q
Here we provide a dataset aggregating plant reproduction data on 141 species from 1957 to 2021 across nine environmentally disparate Long-Term Ecological Research (LTER) or long-term ecological monitoring sites in the United States. Our aim was to create a dataset that harmonizes different sampling methods used to estimate plant reproduction, enabling cross-species comparisons of reproduction across multiple species at sites and incorporating attributes of species such as reproductive cycle length, leaf longevity, and mode of dispersal. The data were filtered as described below to eliminate dubious seed production time series, including data from rare species, species whose identification was questionable, and seeds identified only to genus rather than species. These long-term records provide information on seed production that can be used to assess environmental drivers of mast seeding and community level synchrony. Further, when combined with the species phylogenetic and trait data provided here, these data can collectively be used to assess how these factors drive mast seeding across environmental gradients.
Description of the data and file structure
Description of data files:
- individual_seed_production.csv: This file contains measurements of annual seed production for woody plant species from all long-term research sites listed in Table 1. Measurements were taken at multiple individual sampling units (a seed trap or an observation of an individual tree) within a site. Data includes only observations that could be reliably attributed to a particular plant species. Each row contains an observed measurement of seed production for a particular year for a particular species within a sampling unit. We standardized plant taxonomy using the USDA PLANTS database (USDA NRCS, 2024). Seed production was measured in a variety of ways for data of different origins, and the methods of assessing seed production are described for each originating dataset separately under ‘Data Descriptors’. Note that counts for data collected via seed traps represent the reproductive structure count per trap while those collected via counts on individual plants represent the reproductive structure count per plant. Seed trap data were not linked to individual trees. In addition, not all sites collected individual count data with the same sampling effort. Therefore, comparisons of actual counts per individual should not be made across sites with differing methodologies. Methods are explained in the “methods_notes” column of this dataset.
- plot_summarized_seed_data.csv: This file contains seed production data summarized to the plot level. It was derived from the individual seed production data described above using the “filter_for_data_paper.R” R script. Individual seed production data were filtered to ensure that subsequent analyses would not be biased by low sampling effort. To be included, a time series had to meet the following criteria: (1) for seed trap data, the species was observed in ≥ 5% of seed traps; (2) seed production of the species was observed in a minimum of 10 years; (3) fewer than 80% of the years in each time series had 0 seeds/cones/acorns observed; (4) there were at least 4 years with non-zero data in the time series; and (5) for seed/cone count data, data were collected on at least 10 individual plants. These criteria excluded 37 of the 141 species in the original dataset (individual_seed_production.csv), leaving 104 species in the filtered dataset.
- plot_locations.csv: This file contains the latitude and longitude for each plot in the dataset. Twenty-one of the plots from Andrews Forest did not have digitally available coordinates, and therefore were estimated based on old hand plotted maps, which is noted in the file.
- species_attributes.csv: This file contains aggregated information about species attributes of the 104 plant species present in our filtered dataset. Each row contains observations of species-level attributes. Traits included were: (i) leaf longevity (deciduous, evergreen), (ii) dispersal syndrome (abiotic, endozoochory, synzoochory), (iii) fleshy fruit (yes, no), (iv) growth form (tree, shrub, liana), (v) mycorrhizal association (arbuscular, ectomycorrhizal, ericoid, none), (vi) pollinator (animal, wind), (vii) seed bank (yes, no), (viii) seed development time (from bud differentiation to seed maturity: 1 year, 2 years, 3 years), (ix) sexual system (dioecious, hermaphrodite, monoecious, polygamo-dioecious), (x) shade tolerance (intolerant, intermediate, tolerant), (xi) seed mass (mg; continuous variable), and (xii) leaf type (broadleaf, needleleaf). These data were obtained predominantly from the US woody seed manual (USDA, 2008), the Silvics of North America (Burns and Honkala, 2008), the TRY database (Kattge et al. 2011; 2020), and the USDA Plant Database (USDA NRCS, 2024). Full citations can be found in “attribute-citations.csv”. Rows in plot_summarized_seed_data.csv and species_attributes.csv can be matched by joining on the ‘species_name’ column.
- attribute-citations.csv: This file contains all citations used in compiling the species attribute table.
- phylogeny.tr: This file contains a phylogenetic tree for the 104 plant species present in our filtered dataset. The phylogenetic tree was based on Zanne et al. (2014), and 94 of our species matched their phylogeny exactly. Of the remaining 10 species, eight matched a genus on the Zanne et al. (2014) phylogeny and are placed as polytomies at the genus level. The remaining two species did not match at the genus level and were placed as polytomies at the family level. The R script “phylogenetic_tree.r” produced the phylogeny. An alternate method for constructing a phylogenetic tree based on Smith & Brown (2018) is also included at the end of the script.
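The five inclusion criteria listed for plot_summarized_seed_data.csv can be illustrated with a minimal Python sketch. This is not the “filter_for_data_paper.R” script. The function and argument names are hypothetical, criterion (2) is read here as "at least 10 years of observations", and the unit at which each check is applied (plot, site, or species) is an assumption.

```python
def passes_sampling_filters(general_method, traps_with_species=0,
                            total_traps=0, n_plants=0):
    """Criteria (1) and (5): minimum sampling effort for one species.

    Argument names are hypothetical; only "TRAP" is treated as trap data.
    """
    if general_method == "TRAP":
        # (1) species observed in at least 5% of seed traps
        return total_traps > 0 and traps_with_species / total_traps >= 0.05
    # (5) seed/cone counts collected on at least 10 individual plants
    return n_plants >= 10


def passes_time_series_filters(annual_counts):
    """Criteria (2)-(4) for one time series given as {year: count}."""
    n_years = len(annual_counts)
    n_zero = sum(1 for count in annual_counts.values() if count == 0)
    if n_years < 10:                # (2) at least 10 years (assumed reading)
        return False
    if n_zero / n_years >= 0.80:    # (3) fewer than 80% of years are zero
        return False
    return n_years - n_zero >= 4    # (4) at least 4 non-zero years
```

A species-by-plot time series would be retained only if both functions return True.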
Variable information for data files:
Table 1: Definitions and units of columns in the individual seed production data (individual_seed_production.csv), which is the unfiltered data.
Column name | Definition | Units |
---|---|---|
site_name | The name of the LTER/long-term ecological monitoring site | |
megaplot | This variable applies only to data from the sites Andrews Forest (AND) and Cedar Creek (CDR) and is used to group plots into “megaplots”. Megaplots are designated based on their proximity to each other in order to allow comparisons of cross-species synchrony. | |
plot | The name of the unique plot at each site that the data came from. Plots are designated by expert staff as individual forest stands that share environmental characteristics in a given location. This is the same as the megaplot name at all sites except AND and CDR. | |
trap | A unique identifier for the trap from which seeds were counted (only applicable to sites where seed traps were used). | |
plant_ID | A unique identifier for the plant on which seeds/cones were counted (only applicable to sites where seeds or cones on individual plants were counted). | |
species_name | The scientific name of the species. Names are written as Genus.species. | |
year | The year in which seeds/cones matured. See individual site descriptions for how this was determined at each site. | |
count | Count of reproductive structures (cones, seeds, acorns, etc.) in corresponding seed trap or tree/shrub canopy. | |
stem_diameter_cm | Diameter at breast height (DBH) of corresponding plant in corresponding year. Only applicable to AND. | cm |
trap_area_m2 | Area of seed trap where seeds were collected (only applicable to sites where seed traps were used). | m2 |
height_diameter_taken | The height at which stem diameter was measured. Typically this is breast height (1.4 m), listed as "Breast Height"; measurements taken at the root collar (for species such as pinyon or juniper) are listed as "Root Collar". Only applicable for sites that measured individual plant data. | |
burned | Only applicable to Cedar Creek (CDR) and denotes whether the plot had burned recently in prescribed burn treatments. We note whether the plot burned in the prior spring (burned=1), two years previously (burned=2), or has not burned within the last two years (burned=0). | |
general_method | Whether the data was collected with seed traps (“TRAP”) or with cone counts (“CONECOUNT”) or seed counts (“SEEDCOUNT”) on trees. For cone and seed counts that were only done in part of a tree canopy, then "PARTIALCONECOUNT" or "PARTIALSEEDCOUNT" is instead noted. If partial counts were done in tree canopies that were scaled to whole tree by a multiplier then we note that the cone or seed count is estimated, by stating "ESTIMATEDCONECOUNT" or, in the case of estimated seed counts, "ESTIMATEDSEEDCOUNT". If the method was done by counting seeds or cones in a set amount of time, then "TIMEDCONECOUNT" or "TIMEDSEEDCOUNT" is noted. | |
methods_notes | Any relevant notes on the method, such as viewing area for counting cones/seeds, type of seed trap used, time in which cones/seeds were counted if timed counts were used, etc. | |
Table 2. Metadata for plot_summarized_seed_data.csv, which is plant reproduction data that met our filtering criteria, summarized at the plot scale. Sites either report data as cone counts or seed trap data, as indicated in Table 1.
Column Name | Description | Range/Levels |
---|---|---|
site_name | The name of the LTER/long-term ecological monitoring site | 9 sites, see Table 1 for site characteristics |
plot | Within each site, plots are designated by expert staff as individual forest stands that share environmental characteristics in a given location. | 96 plots, ranging from 1 - 53 at each site. |
megaplot | This variable applies to data from the sites Andrews Forest (AND) and Cedar Creek (CDR) and is used to group plots into “megaplots”. Megaplots are designated based on their proximity to each other in order to allow comparisons of cross-species synchrony. | 61 megaplots, ranging from 1 - 24 at each site. |
species_name | The scientific name of species. Names are written as Genus.Species. Matches the “species_name" column in all other data files. | 104 woody plant species, full list of species is given in species_attributes.csv |
year | Year in which seeds/cones matured. See individual site descriptions for how this was determined at each site. | 1957-2021 |
total_trap_area_m2 | Total trap area (summed) in a given year at the plot scale, expressed in units of m2. Each trap has an area that it is collecting data from. | 0.1133-60.00 m2/plot |
seeds_per_m2 | Total number of seeds divided by total trap area (m2) per plot in a given year. | 0.00-67,320.49 seeds/m2 |
total_seeds | Total number of seeds per plot in a given year | 0.00-1,120,213.00 seeds/plot |
total_traps | Total number of traps per plot. | 1-120 traps/plot |
seeds_or_cones_per_tree | Total number of seeds or cones per plot divided by the total number of trees. Only applies to sites that use individual seed/cone count data (see Table 1). | 0.00-1,588.89 seeds or cones/tree |
total_seeds_or_cones | Total number of seeds or cones per plot in a given year. Only applies to sites that use individual seed/cone count data (see Table 1). | 0.00-28,600.00 seeds or cones/plot |
total_trees | Total number of trees per plot that were included in seed or cone counts. Only applies to sites that use individual seed/cone count data (see Table 1). | 0-219 trees/plot |
collections_per_yr | The number of times per year that seed traps or individual seed production were counted. This varies by site. | 1-26 collections/year |
plot_lat | Latitude of plot in units of decimal degrees | 18.33° - 65.15° |
plot_long | Longitude of plot in units of decimal degrees | −148.36° - −65.82° |
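The relationship among total_seeds, total_trap_area_m2, total_traps, and seeds_per_m2 in Table 2 can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the authors' script: it assumes one input row per trap per species per year (zero counts included), and the function name and the example plot label "P1" are hypothetical.

```python
from collections import defaultdict


def summarize_trap_rows(rows):
    """Sum trap-level rows to plot level.

    rows: iterable of dicts with keys plot, species_name, year, count,
    trap_area_m2 (column names from Table 1; one row per trap is assumed).
    Returns {(plot, species_name, year): summary dict}.
    """
    totals = defaultdict(lambda: {"total_seeds": 0.0,
                                  "total_trap_area_m2": 0.0,
                                  "total_traps": 0})
    for row in rows:
        key = (row["plot"], row["species_name"], row["year"])
        summary = totals[key]
        summary["total_seeds"] += row["count"]
        summary["total_trap_area_m2"] += row["trap_area_m2"]
        summary["total_traps"] += 1
    for summary in totals.values():
        # seeds_per_m2 = total seeds / total (summed) trap area
        summary["seeds_per_m2"] = (summary["total_seeds"]
                                   / summary["total_trap_area_m2"])
    return dict(totals)


# Hypothetical example: two 0.5 m2 traps in one plot and year.
example = summarize_trap_rows([
    {"plot": "P1", "species_name": "Acer.rubrum", "year": 2001,
     "count": 10, "trap_area_m2": 0.5},
    {"plot": "P1", "species_name": "Acer.rubrum", "year": 2001,
     "count": 30, "trap_area_m2": 0.5},
])
```

Dividing by the summed trap area, rather than averaging per-trap densities, keeps plots with traps of different sizes comparable.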
Table 3. Metadata for “species_attributes.csv”, which lists all species included in the plot-level data along with some of their primary attributes or traits. Data predominantly came from the US woody seed manual (USDA, 2008), the Silvics of North America (Burns and Honkala, 2008), the TRY database (Kattge et al 2011; 2020), and the USDA Plant Database (USDA NRCS, 2022). Additional data sources that were used and each data source that populated the TRY database are provided in the CSV file: “attribute-citations.csv”. For all attributes we used expert knowledge from members of this team to ensure that data reported is consistent with observed species attributes in the field.
Name | Description | Range / Levels |
---|---|---|
species_name | The scientific name of each species, matches the "species_name" column in all other data files | |
family | The family of each species | |
genus | The genus of each species | |
epithet | The epithet of each species | |
seed_development_years | Time (in years) for seed development, from seed initiation to mature seed development in years (1 = 12 months or less; 2 = 13-24 months; 3 = 25 - 36 months). | 1-3 years |
pollinator_code | Primary pollination vector (animal or wind). Animal pollinators included insects and birds. | animal, wind |
mycorrhiza_type | The dominant type of plant-fungal mycorrhizal symbiosis for each species. | EM, AM, Ericoid, none |
needleleaf_broadleaf | Whether the leaf form is needleleaf or broadleaf | Needleleaf, Broadleaf |
deciduous_evergreen | Whether the species is deciduous or evergreen | Deciduous, Evergreen |
seed_maturation_timing | Time period of seed maturation. These were broken into dominant season(s), based on the month(s) of reported seed maturation. Seasons were defined as: Summer (June, July, August), Fall (September, October, November), Winter (December, January, February), and Spring (March, April, May), with Late Summer included as an additional category since many species reproduced during the August-September months. Some species reproduce in multiple seasons (e.g. Summer - Fall) and others can reproduce at multiple distinct times of the year (e.g. Spring and Fall). If a species reproduced across seasons, then only the season where most reproduction occurred is noted. | Fall, |
seed_mass_mg | Average seed mass (in mg per seed). | 0.019 - 6044 mg |
sexual_system | The production of pollen and ovules, when produced on separate individuals (Dioecious), or the same individual but in separate structures (Monoecious), or within the same structure (Hermaphrodite). Polygamodioecious refers to species that can have both single-sex and bisexual flowers on the same individual. | Monoecious, Dioecious, Hermaphrodite, polygamo-Dioecious |
shade_tolerance | A species' shade tolerance level, where Tolerant = shade tolerant; Intermediate = intermediate shade tolerance; and Intolerant = shade intolerant | Tolerant, Intermediate, Intolerant |
growth_form | The species' dominant growth form | Tree, Shrub, Liana |
seed_bank | If seeds can remain viable for over a year, either in the canopy or in the soil, then the species is considered to form a seed bank and is given a “yes”. | yes, no |
fleshy_fruit | Ovary wall succulent, not hard and dry | yes, no |
dispersal_syndrome | The mechanisms by which seed dispersal occurs. Abiotic includes wind, water, and gravity dispersal mechanisms whereas animal dispersal mechanisms are those broken down by endozoochory and synzoochory. | abiotic, endozoochory, synzoochory |
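Because species_name is shared across the data files, plot-level records can be combined with species attributes by a simple key lookup. The sketch below is a hedged Python illustration using only the standard library; the function name is hypothetical, the example rows are made up for illustration, and real use would read the rows from the CSV files with `csv.DictReader`.

```python
def join_on_species(plot_rows, attribute_rows):
    """Inner join of plot-level rows with species attributes.

    Both inputs are lists of dicts that contain a 'species_name' key,
    as csv.DictReader would produce from plot_summarized_seed_data.csv
    and species_attributes.csv.
    """
    attributes = {row["species_name"]: row for row in attribute_rows}
    joined = []
    for row in plot_rows:
        match = attributes.get(row["species_name"])
        if match is None:
            continue  # species without attributes are skipped in this sketch
        merged = dict(match)
        merged.update(row)  # plot-level values win on any shared column
        joined.append(merged)
    return joined


# Made-up rows, for illustration only.
plots = [{"species_name": "Acer.rubrum", "year": "2001", "seeds_per_m2": "12.5"},
         {"species_name": "Unknown.species", "year": "2001", "seeds_per_m2": "1.0"}]
traits = [{"species_name": "Acer.rubrum", "growth_form": "Tree"}]
joined = join_on_species(plots, traits)
```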
Table 4. Metadata for plot_locations.csv, which provides the latitude and longitude of each plot in the ‘individual_seed_production.csv’ file.
Column Name | Description | Range/Levels |
---|---|---|
site_name | The name of the LTER/long-term ecological monitoring site | 9 sites, see Table 1 for site characteristics |
plot | Within each site, plots are designated by expert staff as individual forest stands that share environmental characteristics in a given location. | 105 plots, ranging from 1 - 61 at each site. |
megaplot | This variable applies to data from the sites Andrews Forest (AND) and Cedar Creek (CDR) and is used to group plots into “megaplots”. Megaplots are designated based on their proximity to each other in order to allow comparisons of cross-species synchrony. | 63 megaplots |
Latitude_dd | Latitude of plot in units of decimal degrees | 18.33° - 65.15° |
Longitude_dd | Longitude of plot in units of decimal degrees | −148.36° - −65.82° |
notes | Notes about how coordinates were acquired, if not originally present in the data | NA or “coordinates estimated from hand plotted map” |
Description of LTER or long-term monitoring sites from which data were gathered:
Adirondack Ecological Center - AEC
Stacy A. McNulty (smcnulty@esf.edu), Raymond D. Masters
Adirondack Ecological Center is a long term ecological monitoring site (https://www.esf.edu/aec/research/altemp.php). Seed traps (18.9 L plastic buckets, surface area of 0.0729 m2) were collected biannually in the spring and fall at Huntington Wildlife Forest at The State University of New York College of Environmental Science and Forestry. Fifty buckets were placed 30 m apart from one another and 0.5 m off the ground using metal stakes in a 350-year-old unmanaged forest with two forest types, deciduous and mixed conifer/deciduous (25 buckets in each forest type). Material from buckets was sorted, seeds were identified down to species, and yearly totals were recorded for each bucket in each season. For our data compilation, we tallied seeds in both spring and fall counts in the same calendar year for species that disperse their seeds in the spring, and tallied the fall bucket of the current year and spring bucket of the following year for species that disperse their seeds in the fall. The raw data can be found in McNulty and Masters (2019).
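The seed-year tally described above can be sketched as follows. This is an illustrative Python sketch, not the code used for the compilation; the function names and the "spring"/"fall" labels are assumptions.

```python
from collections import Counter


def assign_seed_year(collection_year, collection_season, dispersal_season):
    """Map one bucket collection to the seed production year.

    Fall-dispersed seeds found in a spring bucket are assumed to belong
    to the previous calendar year's crop; everything else stays in the
    collection year.
    """
    if dispersal_season == "fall" and collection_season == "spring":
        return collection_year - 1
    return collection_year


def yearly_totals(records, dispersal_season):
    """records: iterable of (collection_year, collection_season, count)."""
    totals = Counter()
    for year, season, count in records:
        totals[assign_seed_year(year, season, dispersal_season)] += count
    return dict(totals)
```

For a fall-dispersing species, a fall 2004 count and a spring 2005 count are both credited to 2004; for a spring-dispersing species, each count stays in its own calendar year.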
Andrews Forest - AND (US LTER)
Mark D. Schulze (mark.schulze@oregonstate.edu), Jerry F. Franklin
Sites across the Cascade Mountains in Oregon (including Andrews Forest, AND) and Washington, plus a Coast Range site, were selected in the 1960s, with a few sites added in later years. Plots are single-species samples of 20-30 marked trees. In a few cases, “plots” overlap spatially, because two species in the same forest stand were selected but recorded as separate plots. Plots were selected to cover the range of the Cascade Mountains, and trees within plots were selected to be dominant or codominant individuals with good viewing angles. Trails, roads, natural vegetation breaks and clearcut edges were used to provide good viewing angles. Data are reported as the number of cones for each individual. Observation start and end years vary among plots for several reasons: plots were added to the study in several pulses, with most beginning 1962-1965, but with a subset in the 1970s and 1980s. Plots were dropped due to major wildfires, volcanic eruptions and, beginning in 2018, funding shortages. Occasional missed years for a given plot result from access issues due to active major wildfires in the area. Some plots have experienced significant mortality, and new trees have only been added sporadically as funding allows, meaning the number of trees observed is not constant over time. The raw data and metadata can be found in Franklin and Schulze (2023).
Bonanza Creek - BNZ (US LTER)
Jill Johnstone (jfjohnstone@alaska.edu), Keith Van Cleve, F. Stuart Chapin, Roger Ruess, Michelle C. Mack
Seed production data from Bonanza Creek (BNZ) capture annual variation in seed fall within forest stands. Seed traps (0.5m x 0.5m wooden frames with mesh liners) were placed to collect fallen seed under trees at a site, with traps deployed along transects within a 50m x 60m site. Samples from 1987 onward are based on 3 traps deployed along each of 2 transects (6 traps total per site). Seeds were collected in the spring, following snowmelt, and the larger seeds of woody trees and shrubs were counted. Data are reported as the count of seeds in individual traps for the seed production year prior to the collection year, since the monitored species disperse their seeds in fall and winter. Seeds of tree species Populus balsamifera and P. tremuloides, which may co-occur at sites, are not included in the counts as their seeds are very small and dispersed in spring rather than fall. Sites at BNZ were established to represent different successional stages along a hypothesized sequence of floodplain (primary) and post-fire (secondary) succession. The majority of sites have undergone successional changes in canopy structure over the course of several decades. The period of seed collection began in 1985 or later for most (8) sites. Seed collection for Picea glauca began in 1957 at site UP1A and 1969 at UP3A, with varying numbers of seed traps used over time up to 1987. Site UP1A burned in 1983 but seed collection was re-established at the site a few years later. Sites with historic seed collection of Picea glauca during the 1970s (FPSH and TS04) were not maintained after program reorganization in the mid-1980s. The raw data and metadata can be found in Van Cleve et al. (2022).
Cedar Creek Ecosystem Science Reserve - CDR (US LTER)
Walt Koenig (wdkoenig@berkeley.edu) and Johannes M. H. Knops (Johannes.Knops@xjtlu.edu.cn)
Data are 30-second visual counts of acorns on individual trees at Cedar Creek (CDR), following the protocol of Koenig et al. (1994). Plots were burned in the spring on a schedule that varied among plots, and burning has been shown to affect acorn production (Funk et al. 2016). The data include whether the plot was burned the prior spring (burned=1); 1 year previously (burned.m1yr=1); or 2 years previously (burned.m2yr=1). The raw data and metadata can be found in Knops (2018).
Coweeta - CWT
Jim Clark (jclark@duke.edu), Inés Ibáñez
Seed traps were located at regular intervals in seven forest stands at Coweeta (CWT). Traps were emptied between 1 and 6 times per year at each plot. For each species, seeds per year were summed by adding the number of seeds found in traps from the month of typical seed maturation onward in the current year to all seeds found in traps the following year prior to the month of typical seed maturation (e.g. for a species whose seeds mature in June, the seed count in 2001 would include seeds collected from June 2001 - May 2002). Some seeds were only identified to the genus level, and because of that, species-level data are not available for all species, including some of the dominant species. The raw data were provided by Inés Ibáñez.
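The month-based seed-year rule can be written as a one-line mapping. The Python sketch below is an illustration only; the function name is hypothetical, and it follows the worked example in the text, in which the maturation month itself counts toward the current seed year.

```python
def seed_year_from_collection(year, month, maturation_month):
    """Return the seed production year for seeds collected in (year, month).

    Collections from the maturation month onward belong to that calendar
    year's crop; earlier months belong to the previous year's crop.
    Months are integers 1-12.
    """
    return year if month >= maturation_month else year - 1
```

With a June maturation month (`maturation_month=6`), collections from June 2001 through May 2002 all map to the 2001 seed year.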
Harvard Forest - HFR (US LTER)
Elizabeth Crone (ecrone@ucdavis.edu), Joshua Rapp, Kristina Stinson
Reproduction is evaluated at both the fruiting and flowering stages for Acer saccharum at Harvard Forest (HFR). During flowering (April-early May) flowering effort was qualitatively evaluated by number of flowering buds (low: < 1,000, medium: 1,000-10,000, high: > 10,000). Trees were also recorded as having only male flowers or having both female and male flowers. Whole tree seed production was evaluated by visual timed counts of seeds across the canopy, with two observers counting the number of seeds observed in 15 seconds, and the summed count of these used as a metric for total reproduction for that individual. The raw data and metadata can be found in Rapp et al. (2023).
Hubbard Brook - HBR (US LTER)
Nat Cleavitt (nlc4@cornell.edu), Tim Fahey
Adjacent to and within the south-facing watershed area of Hubbard Brook (HBR), seeds are collected via basket-style seed traps (0.1 m2 area) elevated on fence posts 1 m above the ground. Seeds, leaves and other fine litter are collected, sorted, and counted three times per “year” (August, November and May of the following calendar year). There are a total of 10 plots with 10-12 baskets at each plot. Plots are distributed across three geographic areas and stratified by elevation zones (low, mid, upper and high). Two of the geographic areas are reference (or control) areas started in 1993, and the third is a calcium silicate addition treatment started in 1996 (Cleavitt and Fahey 2017). The raw data and metadata can be found in Fahey and Cleavitt (2021).
Luquillo - LUQ (US LTER)
Jess Zimmerman (jesskz@ites.upr.edu)
At the El Verde Field Station, 120 numbered baskets were placed along trails in the 16 ha Luquillo Forest Dynamics Plot (LUQ). Fern and angiosperm flowers and fruits are monitored biweekly. The data from LFDP began in April of 1992. Originally, traps measured 0.16 m2 but were replaced in 2006 with traps 0.5 m2. Each reproductive part collected is counted and identified to species using a six-letter code. Reproductive parts are identified with a number code. Counts are summed for each calendar year. The raw data and metadata can be found in Zimmerman (2022).
Sevilleta - SEV (US LTER)
Roman I. Zlotin (deceased), Diana S. Macias (dianamacias@berkeley.edu), Robert R. Parmenter (parmentr@unm.edu)
Annual mast fruit production is measured in August at five sites within the Sevilleta National Wildlife Refuge (SEV), beginning in 1997. Three different methods were developed to estimate annual production. For piñon pine (Pinus edulis), estimates are made by visually counting the third-year, ripened, mature cones per tree (n = 210 marked trees) with binoculars and multiplying the number of cones by the mean number of intact seeds per cone to estimate seeds per tree; for Sonoran scrub oak (Quercus turbinella), estimates use the number of acorns per 0.1 m2 of canopy surface area in 3-5 replicates, scaled up to the size of the entire individual (n = 194 marked trees); and for one-seed juniper (Juniperus monosperma), the percent of twigs with berries and the quantity of berries per twig are determined every year for all trees in each plot (n = 412 trees) (Parmenter et al. 2018). The raw data and metadata can be found in Zlotin (2016).
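The piñon and scrub oak estimates amount to simple scaling calculations, sketched below in Python. This is a hedged illustration, not the site protocol: the function names are hypothetical, and scaling the oak counts by the mean replicate density times the canopy surface area is an assumption about how "scaling up" is done.

```python
def pinyon_seeds_per_tree(cone_count, mean_seeds_per_cone):
    """Seeds per tree = counted mature cones x mean intact seeds per cone."""
    return cone_count * mean_seeds_per_cone


def oak_acorns_per_tree(replicate_counts, canopy_surface_area_m2,
                        replicate_area_m2=0.1):
    """Scale acorn counts from 0.1 m2 replicates to the whole canopy.

    Assumes the mean acorn density across replicates applies to the
    entire canopy surface area.
    """
    mean_count = sum(replicate_counts) / len(replicate_counts)
    density_per_m2 = mean_count / replicate_area_m2
    return density_per_m2 * canopy_surface_area_m2
```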
Data harmonization methods
1. Site selection
We used data from nine LTER and related long term study sites on the EDI data portal. The LTER network was established in 1980 to study ecological processes that are best captured by sustained observation across a broad range of ecosystem types. We selected LTER sites with datasets on woody plant reproduction, including records of direct visual counts of reproduction on trees and seed traps; we also focused on datasets with multiple years of seed production data. We used only sites with long-term data and observations that were standardized over time within each site.
2. Data exclusion criteria
* Adirondack Ecological Center
Since conifer seeds were not identified to species until 2006, there is a species in the original data called “sp_fi_he_total”. This is not a true species, but rather the count of all conifer seeds collected from the seed traps each year. We inspected the seed count data for each year and removed any conifer species that had many zeros in the years prior to 2006 (Abies balsamea and Thuja occidentalis). We also removed Picea rubens, as it has seeds that look similar to Abies balsamea. We retained Tsuga canadensis in the data as it had large numbers of seeds recorded outside of the mixed conifer category prior to 2006 and therefore was presumed to have a reliable seed count throughout the record. Though some seeds may still have been counted in the conifer category, enough were separated out to identify trends in masting.
* Andrews Forest
For analyses focused on community-level masting, plots at Andrews Forest can be grouped into “megaplots” based on their proximity to each other (designated with expert knowledge by Dr. Mark Schulze). A megaplot is considered an ecological population because its plots are close to each other (range 0-20 km, mean 7.7 km) and share similar climate and topographic characteristics. As such, we consider that these plots can be analyzed together to address questions about synchrony among species.
* Bonanza Creek
Plot “HR1A” was removed from the data due to the shortness of the time series at this plot and uncertainty in identification of seeds. Plots "BF79","BF81","BF84","BF86", and "FP5C" were also removed as they were very early successional and only intended to document dispersal of seed into newly disturbed sites. Entries labeled “alnus” were converted to Alnus incana in plots “FP1A”, “FP2A”, and “FP3A”, and to Alnus viridis in all other plots based on Dr. Jill Johnstone’s expert knowledge of which species were present at each plot. Entries labeled “betula” were identified as Betula neoalaskana. These changes were made based on site knowledge of Jill Johnstone. There were also some (n=4) duplicate entries (same plot, trap, year, and species, but different counts) in the dataframe for Bonanza Creek. In this case, counts were summed to derive one seed count.
* Cedar Creek
All data were used, and we noted where burning treatments occurred. Plots are close in proximity (see plot_locations.csv) and thus for some analyses users may want to lump plots together as one “megaplot”.
* Coweeta
The Coweeta data originally included counts of seeds, fruits, and flowers. We removed all flower counts from the data. We then converted fruit counts into seed counts using the average number of seeds per fruit for each species, based on the Silvics of North America handbook (Burns & Honkala, 1990) and expert site knowledge. Because of the high diversity of tree species at the Coweeta site, the lack of species-level identification for some seeds, and the sparsity of data for some species within seed traps, we filtered data based on the following criteria. Records for unidentified species (e.g. Quercus sp.) were removed from the dataset, unless the species identity could be inferred from other information. This criterion also excluded occasional species-level records for taxa that were typically identified only to genus. Records for very rare species were also removed from the dataset (e.g. Fraxinus caroliniana was removed because it was observed in only a single seed trap). Records for species typically shorter than seed traps (e.g. Hamamelis virginiana) were removed from the dataset. Records from species not present at the site, but abundant in adjacent properties (e.g. Pinus sp.) were removed from the dataset. This resulted in the inclusion of 20 species at this site.
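The flower removal and fruit-to-seed conversion described for Coweeta can be sketched as below. This Python sketch is illustrative only: the function name and the "flower"/"fruit"/"seed" labels are hypothetical, and the seeds-per-fruit value must be supplied by the user, since none is given here.

```python
def harmonize_count(count, structure, seeds_per_fruit=None):
    """Convert one raw record to a seed count.

    structure: "flower", "fruit", or "seed" (hypothetical labels).
    Returns None for flower records, which are dropped.
    """
    if structure == "flower":
        return None  # flower counts are removed from the data
    if structure == "fruit":
        if seeds_per_fruit is None:
            raise ValueError("seeds_per_fruit is required for fruit counts")
        # fruit counts are scaled by the species' average seeds per fruit
        return count * seeds_per_fruit
    return count  # seed counts are used as recorded
```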
Table 5. Changes made to species identities based on local knowledge from Inés Ibáñez (Coweeta). The names from the original data are in the “original” column, the names they got changed to are in the “new” column and the reason for doing so is in the “Reason” column.
Site | Original | New | Reason |
---|---|---|---|
CWT | Acer spp | Acer rubrum | Prior to 2010, Acer seeds were pooled at the genus level in the seed trap data. Because most of these seeds were Acer rubrum, we pooled them with Acer rubrum for analysis. For reference, the maximum seeds/trap after 2010 at these sites was 0.1 for Acer saccharum and 0.25 for Acer pensylvanicum vs. 30 for Acer rubrum. |
CWT | Acer pensylvanicum | Removed | Very low abundance and not identified to species on a regular basis. |
CWT | Acer saccharum | Removed | High-elevation plots with substantial Acer saccharum abundance were added in 2001. Acer saccharum seeds were rare in the low-elevation plots that were monitored prior to 2001 and were not consistently identified to species. |
CWT | Amelanchier spp | Amelanchier arborea | This was the only Amelanchier species at the site based on stand structure data. |
CWT | Betula spp, Betula alleghaniensis & Betula lenta | Removed | These species could not be resolved/separated based on stand structure data. |
CWT | Carya spp | Carya glabra in plots CWT_318 and CWT_427, otherwise removed | Identified to species in these plots based on stand structure data; these were the plots with only one species. |
CWT | Ilex spp | Ilex montana in plots CWT_427, CWT_527, CWT_LG & CWT_UG, otherwise removed | Identified to species in these plots based on stand structure data; these were the plots with only one species. |
CWT | Magnolia spp | Magnolia acuminata in plots CWT_318 and CWT_527, otherwise removed | Identified to species in these plots based on stand structure data; these were the plots with only one species. |
CWT | Morus spp | Removed | This genus had only a few individual trees. Our guess is that the seed trap data reflect only one tree (I. Ibanez, pers. comm.). |
CWT | Pinus spp & Pinus rigida | Removed | There is a nearby pine plantation where these seeds most likely came from. |
CWT | Prunus spp | Removed | This genus had only a few individual trees. Our guess is that the seed trap data reflect a small number of individuals (I. Ibanez, pers. comm.). |
CWT | All Quercus | Removed | These species could not be resolved/separated based on stand structure data. |
CWT | Ulmus spp. | Removed | This genus is rare/absent at Coweeta but could be confused with Betula. |
CWT | Vitis spp. | Removed | Not identified to species in seed traps, and not in stand structure data because it is a vine. |
CWT | Viburnum spp. | Removed | Not identified to species in seed traps, and not likely to be fully represented in stand structure data (often a small shrub). Possibly not consistently identified in seed trap data (I. Ibanez, pers. comm.) |
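
Plot-dependent rules like those in Table 5 can be encoded as a lookup table. The Python sketch below is one possible encoding under stated assumptions, not the authors' implementation. It covers only a subset of the table rows, and the `RULES` structure and `resolve` function are hypothetical names.

```python
# Each rule: original label -> (new name or None if removed,
#                               set of plots where it applies, or None for all plots).
# Only a subset of the Table 5 rows is encoded here.
RULES = {
    "Acer spp": ("Acer rubrum", None),
    "Amelanchier spp": ("Amelanchier arborea", None),
    "Carya spp": ("Carya glabra", {"CWT_318", "CWT_427"}),
    "Ilex spp": ("Ilex montana", {"CWT_427", "CWT_527", "CWT_LG", "CWT_UG"}),
    "Magnolia spp": ("Magnolia acuminata", {"CWT_318", "CWT_527"}),
    "Morus spp": (None, None),
}

def resolve(original, plot):
    """Return the harmonized species name, or None if the record is removed."""
    if original not in RULES:
        return original                  # labels without a rule pass through unchanged
    new_name, plots = RULES[original]
    if new_name is None:
        return None                      # removed in every plot
    if plots is None or plot in plots:
        return new_name
    return None                          # "otherwise removed"
```

For example, `resolve("Carya spp", "CWT_318")` gives "Carya glabra", while the same label in a plot outside the listed set is dropped.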
3. Data Filtering and Summarization at the Plot Level
Data from four sites (AND, CDR, SEV, HFR) were based on counts of reproductive structures on marked individuals, and the remaining sites (AEC, BNZ, CWT, HBR, LUQ) employed seed trap collections. Datasets were aggregated to the plot level within sites for each species and year (mean reproduction per tree or mean seeds per trap). In addition to the inclusion criteria above, in consultation with LTER data leads in the authorship group, we excluded seed trap datasets for species that were extremely rare at the site and/or were rarely captured in the seed traps, such that separation of seed production from sampling noise would not be possible. We screened tree count datasets for consistency of sample sizes over time, and plots where sample sizes declined below 10 individuals were not included in our plot-level summarized data. The R script that does this filtering and summarization is provided (filter_for_data_paper.R). In the summarized data provided here, a single time series is an ordered set of seed production data (count or seed trap data) across years, for one species at one plot.
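The plot-level aggregation and sample-size screen for tree count data can be sketched in Python as follows. This is an illustrative approximation, not the provided filter_for_data_paper.R. The row layout and example values are hypothetical, and dropping a whole plot-by-species series when any year falls below the threshold is an assumed reading of the screening rule.

```python
from collections import defaultdict
from statistics import mean

def summarize(rows, min_individuals=10):
    """Mean count per (plot, species, year), dropping undersampled series.

    rows: iterable of (plot, species, year, individual_id, count).
    """
    groups = defaultdict(list)
    for plot, species, year, _individual, count in rows:
        groups[(plot, species, year)].append(count)
    # Assumption: a series is dropped if any year has too few individuals.
    too_small = {
        (plot, species)
        for (plot, species, _year), counts in groups.items()
        if len(counts) < min_individuals
    }
    return {
        key: mean(counts)
        for key, counts in groups.items()
        if key[:2] not in too_small
    }

# Hypothetical data: plot "P1" keeps 10 trees, plot "P2" drops to 9 in 2001.
rows = [("P1", "Species A", year, tree, tree)
        for year in (2000, 2001) for tree in range(10)]
rows += [("P2", "Species A", 2000, tree, 1) for tree in range(10)]
rows += [("P2", "Species A", 2001, tree, 1) for tree in range(9)]
summary = summarize(rows)
```

For seed trap data the same grouping applies with traps in place of individuals, giving mean seeds per trap.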
Sharing/Access information
Data were derived from the following sources:
- Burns, R. M., and B. Honkala. 1990. Silvics of North America. Agriculture Handbook (Washington).
- Fahey, T. and N. Cleavitt. 2021. Tree Seed Data at the Hubbard Brook Experimental Forest, 1993 - present ver 2. Environmental Data Initiative. https://doi.org/10.6073/pasta/3d6b29aa80b150e5a9e28a839c05c211 (Accessed 2023-12-27).
- Franklin, J.F. and M.D. Schulze. 2023. Cone production of upper slope conifers in the Cascade Range of Oregon and Washington, 1959 to 2022 ver 16. Environmental Data Initiative. https://doi.org/10.6073/pasta/834405bb4b14582a8f444011f9158740 (Accessed 2023-03-28).
- Kattge, J., Bönisch, G., Díaz, S., Lavorel, S., Prentice, I.C., Leadley, P., et al. (2020). TRY plant trait database - enhanced coverage and open access. Glob. Change Biol., 26, 119–188.
- Kattge, J., Díaz, S., Lavorel, S., Prentice, I.C., Leadley, P., Bönisch, G., et al. (2011). TRY - A global database of plant traits. Glob. Change Biol., 17, 2905–2935.
- Knops, J. 2018. Acorn production: Acorn survey ver 8. Environmental Data Initiative. https://doi.org/10.6073/pasta/f856dc4ef3e1ea586bcfb841be7a4700 (Accessed 2024-01-19).
- McNulty, S.A. and R.D. Masters. 2019. Seed Production Survey, 1988-2009, Adirondack Long-Term Ecological Monitoring Program Project No. 26 by Adirondack Ecological Center of the State University of New York College of Environmental Science and Forestry, Newcomb, New York, USA ver 1. Environmental Data Initiative. https://doi.org/10.6073/pasta/f28fe27b04d069dd1f9b4de45488bd8e (Accessed 2024-01-19).
- Rapp, J., E. Crone, and K. Stinson. 2023. Maple Reproduction and Sap Flow at Harvard Forest since 2011 ver 6. Environmental Data Initiative. https://doi.org/10.6073/pasta/7c2ddd7b75680980d84478011c5fbba9 (Accessed 2024-01-19).
- USDA NRCS, 2024. The PLANTS Database (http://plants.usda.gov, 01/12/2024). National Plant Data Team, Greensboro, NC USA.
- USDA, 2008. The Woody Plant Seed Manual. United States Department of Agriculture, Forest Service, Agriculture Handbook 727.
- Van Cleve, K., F.S. Chapin, R. Ruess, M.C. Mack, and Bonanza Creek LTER. 2022. Bonanza Creek LTER: Yearly Seedfall Summary from 1957 to Present in the Bonanza Creek Experimental Forest near Fairbanks, Alaska ver 30. Environmental Data Initiative. https://doi.org/10.6073/pasta/373ca46c1df26dc4145bd21ab7e7bb88 (Accessed 2024-01-19).
- Zanne, Amy E. et al. (2014). Data from: Three keys to the radiation of angiosperms into freezing environments [Dataset]. Dryad. https://doi.org/10.5061/dryad.63q27
- Zimmerman, J. 2022. Phenologies of the Tabonuco Forest trees and shrubs ver 559507. Environmental Data Initiative. https://doi.org/10.6073/pasta/0fd0832f8619151ab22c8c212357c1c4 (Accessed 2024-01-19).
- Zlotin, R. 2016. Tree Mast Production in Pinyon-Juniper-Oak Forests at the Sevilleta National Wildlife Refuge, New Mexico (1997- present) ver 154836. Environmental Data Initiative. https://doi.org/10.6073/pasta/f6cb97e094966c0af30206e767b0b2c2 (Accessed 2024-01-19).
Code/Software
filter_for_data_paper.R - this is an R script that filters data according to five criteria:
- for seed trap data, the species was observed in ≥ 5% of seed traps
- seed production of the species was observed in a minimum of 10 years
- fewer than 80% of the years in each time series had 0 seeds/cones/acorns observed
- there were at least 4 years with non-zero data in the time series
- for seed/cone count data, data were collected on at least 10 individual plants.
These criteria resulted in 37 species being excluded from the 141 species in the original dataset (individual_seed_production.csv), leaving 104 species in the filtered dataset. Datasets were then aggregated to the plot level within sites for each species and year (mean reproduction per tree or mean seeds per trap). The summarized data are then written to "plot_summarized_seed_data.csv", which is also included in this dataset. This script requires the packages readr v2.1.5, dplyr v1.1.4, and ggplot2 v3.4.4 and was run on R version 4.3.1.
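For readers who want to see the five criteria as explicit checks, here is a minimal Python sketch applied to a single time series. It is not a translation of filter_for_data_paper.R. The function name and arguments are hypothetical, and reading "observed in a minimum of 10 years" as "the series has at least 10 years of data" is an assumption.

```python
def passes_filters(yearly_values, method, trap_fraction=None, n_individuals=None):
    """Check one species-by-plot time series against the five criteria.

    yearly_values: one seed production value per monitored year.
    method: "trap" for seed trap data, "count" for seed/cone count data.
    trap_fraction: fraction of seed traps in which the species was observed.
    n_individuals: number of individual plants sampled.
    """
    n_years = len(yearly_values)
    n_zero = sum(1 for value in yearly_values if value == 0)
    if n_years < 10:                                  # at least 10 years (assumed reading)
        return False
    if n_zero / n_years >= 0.80:                      # fewer than 80% zero years
        return False
    if n_years - n_zero < 4:                          # at least 4 non-zero years
        return False
    if method == "trap" and trap_fraction < 0.05:     # observed in >= 5% of traps
        return False
    if method == "count" and n_individuals < 10:      # at least 10 individual plants
        return False
    return True

# Hypothetical series: 10 years, half of them with zero seeds.
example = [0, 3, 0, 5, 1, 0, 2, 0, 0, 4]
kept = passes_filters(example, "trap", trap_fraction=0.2)
```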
phylogenetic_tree.r - this is an R script that creates the phylogenetic tree for species in the dataset. This script requires the packages ape v5.7-1 and phytools v2.1-1, as well as the data from Zanne et al. (2014).
Methods
Data on plant reproduction were either downloaded directly from the Environmental Data Initiative Data Portal or received from a scientist affiliated with a Long-Term Ecological Research (LTER) site. Data on species attributes were compiled from various sources, referenced in the file "attribute-citations.csv". Our aim was to create a dataset that harmonizes the different sampling methods used to estimate plant reproduction, enabling comparisons of reproduction across multiple species within and among sites and incorporating species attributes such as reproductive cycle length, leaf longevity, and mode of dispersal. The data were filtered as described in the metadata to eliminate dubious seed production time series, including data from rare species, species whose identification was questionable, and seeds identified only to genus rather than species.