Data from: Underrepresentation of dietary specialist larval Lepidoptera in small forest fragments: Testing alternative mechanisms
Data files
Mar 04, 2025 version files (379.06 KB)
- Mickley.etal.2025.JAE.zip (368.34 KB)
- README.md (10.72 KB)
Abstract
Growing evidence suggests that organisms with narrow niche requirements are particularly disadvantaged in small habitat patches, typical of fragmented landscapes. However, the mechanisms behind this relationship remain unclear. Dietary specialists may be particularly constrained by the availability of their food resources as habitat area shrinks. For herbivorous insects, host plants may be filtered out of small habitat fragments by neutral sampling processes and deterministic plant community shifts due to altered microclimates, edge effects, and browsing by ungulates. We examined the relationship between forest fragment area and the abundance of dietary-specialist and dietary-generalist larval Lepidoptera (caterpillars) and their host plants in the northeastern USA. We surveyed caterpillars and their host plants over three years in equal-sized plots within 32 forest fragments varying in area between 3 and 1014 ha. We tested whether the abundances and species richness of dietary specialists increased more than those of dietary generalists with increasing fragment area, and, if so, whether the difference could be explained by reduced host plant availability or increased browsing by white-tailed deer (Odocoileus virginianus). The overall abundance of dietary specialists was positively related to fragment area; the relationship was substantially weaker for dietary generalists. There was notable variation among species within diet breadth groups, however. There was no effect of fragment area on the diversity of dietary-specialist or dietary-generalist caterpillars. Deer activity was not related to the abundances of either dietary-generalist or dietary-specialist caterpillars. Plant community composition was strongly associated with fragment area. Larger fragments were more likely to include host plants for both dietary-specialist and dietary-generalist caterpillars. 
Deer activity was correlated with decreased host plant availability for both groups, with a slightly stronger impact on host plants of dietary specialists. Although dietary specialists were more likely to lack host plants in fragments, the relationship between fragment area and host availability did not depend on caterpillar diet breadth. This study provides further evidence that decreasing patch area disproportionately impacts specialist consumers. Because this relationship was derived from equal-sized plots, it is robust to some criticisms levelled at fragmentation research. The mechanisms for specialist consumer declines, however, remain elusive.
https://doi.org/10.5061/dryad.k3j9kd5k8
Description of the data and file structure
Data were collected from 32 forest fragments of varying size (3 - 1014 ha) in Connecticut, USA. Within each fragment, we established:
- Three 10 x 10 m vegetation sampling plots, arranged 25 m apart in a triangle. All woody stems > 1 m tall were measured (diameter at breast height or height) and identified to species.
- Four 5 x 5 m caterpillar sampling plots on the edges of each vegetation plot (12 total per site). These caterpillar plots were sampled for larval Lepidoptera during June of each year from 2017 to 2019 by striking branches at heights of 1 - 2 m within the plots.
- Deer activity in the hectare surrounding the vegetation and caterpillar plots. We recorded pellet densities and the proportion of browsed seedlings within twenty-five 1 m^2^ plots on a 20 m grid. Additionally, we used 5 camera traps deployed for 2 weeks each (January - March 2018) to record deer activity within each site.
Missing data are given as “NA”.
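The per-site sampling effort implied by this design can be tallied with a little arithmetic. The sketch below is illustrative only: the constant names are hypothetical, and the 5 x 5 layout of the deer plots is an assumption inferred from 25 plots on a 20 m grid, not something the description states.

```python
import math

# Per-site sampling design as described above (constant names are hypothetical).
VEG_PLOTS_PER_SITE = 3        # 10 x 10 m vegetation plots
CAT_PLOTS_PER_VEG_PLOT = 4    # 5 x 5 m caterpillar plots per vegetation plot
DEER_PLOTS_PER_SITE = 25      # 1 m^2 pellet/browse plots
DEER_GRID_SPACING_M = 20

cat_plots_per_site = VEG_PLOTS_PER_SITE * CAT_PLOTS_PER_VEG_PLOT
veg_area_m2 = VEG_PLOTS_PER_SITE * 10 * 10
cat_area_m2 = cat_plots_per_site * 5 * 5

# Assumption: the 25 deer plots form a square 5 x 5 grid.
grid_side = math.isqrt(DEER_PLOTS_PER_SITE)
grid_extent_m = (grid_side - 1) * DEER_GRID_SPACING_M

print(cat_plots_per_site, veg_area_m2, cat_area_m2, grid_extent_m)
```

Under that assumption the deer grid spans 80 m on a side, which fits inside the surveyed hectare (100 x 100 m).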
Files and variables
File: Mickley.etal.2025.JAE.zip
Description: This zip file contains 7 comma-separated values (CSV) files.
Missing values are indicated with NA.
- catsurveys.csv: Caterpillar surveys conducted between 2017 and 2019 and preliminary data collected in 2015 (from a subset of sites). The file has 12182 rows and 41 variables.
  - CPID: Unique ID.
  - Year: Sampling year.
  - SurveyDate: Date of sample (mm/dd/yyyy).
  - BlockID: Spatial block of site (sites were arranged in blocks of 2-3 fragments of contrasting size).
  - SiteID: ID of individual fragment.
  - FragSize: Area of site core area in hectares.
  - SizeClass: Discrete classification of site area (small, medium, large).
  - ForestProp1km: Proportion of surrounding area within 1 km of site that is forested.
  - FragRatio1km: The ratio of fragmented (edge, patch, perforated) forest to core forest in the 1 km surrounding a site.
  - ForestProp1kmRadius: Proportion of area within 1 km of center point of our plots in a site that is forested (including the site itself).
  - ForestRatio1kmRadius: The ratio of fragmented (edge, patch, perforated) forest to core forest in the 1 km from the center point of our plots within a site (including the site).
  - Hunted: Whether or not hunting is permitted at the site.
  - BrowseProb: Estimated probability of a generic seedling being browsed in the site (calculated using GLMMs).
  - PlotID: ID of individual caterpillar sampling plot (format: 2-letter-block-ID_3-letter-SiteID-3-letter-PlotID).
  - CornerFlag: Location of corner flag.
  - Latitude: Plot latitude in decimal degrees.
  - Longitude: Plot longitude in decimal degrees.
  - HostID: 5-letter code for plant species that was sampled (3-letter genus code and 2-letter species code).
  - Host species: Scientific name of plant.
  - HostFamily: Family of host plant.
  - HostGenus: Genus of host plant.
  - HostSpecies: Species epithet of host plant.
  - HostAbund: Number of individuals of the host plant within the caterpillar plot.
  - IndNum: Individual ID of plant sampled within each plant species.
  - BranchNum: Number of the branch being sampled on the plant.
  - BranchLength: Length of branch in millimeters (not all branches were measured).
  - BranchDiam: Diameter of branch in millimeters (not all branches were measured).
  - NumLeaves: Number of leaves on branch (not all branches were measured).
  - CatID: Unique 6-letter species code for each caterpillar collected, made up of the first 3 letters of the genus and the first 3 letters of the species. If no caterpillars were collected on the branch, the field is NA (missing data).
  - ScientificName: Full scientific name of caterpillar (if collected).
  - Count: Number of individuals of that caterpillar species collected on the branch.
  - BadHost: Expert opinion on whether the caterpillar was unlikely to eat the host plant (yes). Otherwise NA (missing data).
  - Family: Family of caterpillar collected.
  - Genus: Genus of caterpillar.
  - Species: Species epithet of caterpillar collected.
  - MicroLepidoptera: Does the caterpillar belong to a family classed as microlepidoptera (yes) or not (no).
  - Specialist: Caterpillar is classed as a dietary specialist.
  - WtMPD: The abundance-weighted mean phylogenetic distance measured across all hosts on which that lepidopteran species was collected in the study.
  - Records: Total number of records of that species in the study.
  - RearID: Identification number for individuals brought to the lab for rearing to aid identification and molecular barcoding.
  - Notes: Any additional comments.
- deer_abundance.csv: Number of deer observed during 2 weeks of camera trap deployments between January and March 2018.
  - site: Site code.
  - Deer_captures: Number of deer observed.
- deer_browse.csv: Data on numbers of seedlings within plots that were browsed by deer.
  - Year: Year of observation (2017 or 2018).
  - SurveyDate: Date of observation (mm/dd/yyyy).
  - BlockID: Unique code for block.
  - SiteID: Unique code for site.
  - Block: Full block name.
  - Site: Full site name.
  - FragSize: Area of site core area (ha).
  - Hunted: Is hunting allowed at site (yes) or not (no).
  - ForestProp1km: Proportion of surrounding area within 1 km of site that is forested.
  - FragRatio1km: The ratio of fragmented (edge, patch, perforated) forest to core forest in the 1 km surrounding a site.
  - PointID: Unique ID of plot.
  - Latitude: Plot latitude.
  - Longitude: Plot longitude.
  - PlantID: Species ID of plant.
  - ScientificName: Full scientific name of plant.
  - Genus: Plant genus.
  - Species: Plant species.
  - Family: Plant family.
  - CommonName: Common name of plant.
  - ScatPiles: Number of scat piles within 1 m^2^ plot.
  - Plants: Number of woody plants observed within 1 m^2^ plot.
  - Browsed: Number of woody plants with evidence of browsing within plot.
- Diet_categorization.csv: Data on diet categorization of lepidopteran species with 189 observations and 12 columns.
  - SpeciesID: Unique 6-letter lepidopteran species code.
  - ScientificName: Full scientific name.
  - Family: Taxonomic family.
  - Genus: Taxonomic genus.
  - Species: Taxonomic species.
  - BOLDID: Barcode Index Number (BIN) from Barcode of Life Database.
  - CommonName: Common (English) name of species.
  - MicroLep: Is the species considered microlepidoptera (yes) or not (no).
  - Specialist: Is the species considered a specialist (yes) or not (no).
  - wtMPD: Abundance-weighted mean phylogenetic distance among observed hosts.
  - Records: Number of records of species in dataset.
  - Hosts: Number of hosts from which species was collected.
- sites.csv: Summary data on sites with 32 observations and 18 columns.
  - STID: Site number.
  - BlockID: Unique block identifier.
  - SiteID: Unique site identifier.
  - Block: Block name.
  - Site: Site name.
  - FragSize: Area of site core area in hectares.
  - ForestProp1km: Proportion of surrounding area within 1 km of site that is forested.
  - FragRatio1km: The ratio of fragmented (edge, patch, perforated) forest to core forest in the 1 km surrounding a site.
  - ForestProp1kmRadius: Proportion of area within 1 km of center point of our plots in a site that is forested (including the site itself).
  - FragRatio1kmRadius: The ratio of fragmented (edge, patch, perforated) forest to core forest in the 1 km from the center point of our plots within a site (including the site).
  - CoreConnected: Connectance of site to other core forest.
  - SizeClass: Discrete classification of site area (small, medium, large).
  - Hunted: Whether or not hunting is permitted at the site.
  - BrowseProb: Probability a generic seedling would be browsed by deer.
  - ScatPredict: Predicted density of deer scat in a 1 m^2^ area.
  - Latitude: Site latitude in decimal degrees.
  - Longitude: Site longitude in decimal degrees.
  - SiteNotes: Additional notes on site.
- species.csv: Summary data on each lepidopteran and plant species (and one group of Hymenoptera) encountered in the project, with 382 observations and 15 columns.
  - SPID: Species number.
  - SpeciesID: Unique species 6-letter identifier.
  - ScientificName: Species full scientific name.
  - Taxon: Major taxonomic group (Plantae, Lepidoptera or Hymenoptera).
  - Family: Taxonomic family.
  - Genus: Taxonomic genus.
  - Species: Taxonomic species epithet.
  - Infraspecific: Taxonomic sub-species when relevant.
  - BOLDID: Barcode Index Number (BIN) from Barcode of Life Database, when relevant.
  - CommonName: Species common name.
  - MicroLep: Is the species considered microlepidoptera (yes) or not (no).
  - Specialist: Is the species considered a specialist (yes) or not (no).
  - wtMPD: Abundance-weighted mean phylogenetic distance among observed hosts (for Lepidoptera only).
  - Records: Number of records of species in dataset.
  - Hosts Notes: Additional notes.
- vegetation_plots.csv: Data on woody plant composition in 100 m^2^ vegetation plots. Includes the plots adjacent to the caterpillar plots (suffixes 2-4 after site code in PointID) and additional sites not used in the final analysis. Includes 13845 observations and 14 columns.
  - BlockID: Unique block identifier.
  - SiteID: Unique site identifier.
  - PointID: Unique vegetation plot identifier.
  - Latitude: Plot latitude in decimal degrees.
  - Longitude: Plot longitude in decimal degrees.
  - Year: Year vegetation was surveyed.
  - SurveyDate: Date of vegetation survey (mm/dd/yyyy).
  - Tree: Scientific name of plant.
  - TreeGenus: Taxonomic genus.
  - TreeSpecies: Taxonomic species.
  - TreeFamily: Taxonomic family.
  - DBH: Diameter at breast height (cm). Missing if stem < 1.3 m tall.
  - Height: Height of tallest stem in meters. Missing if DBH measured.
  - Dead: Is the plant dead (yes) or alive (no).
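The conventions described above (NA for missing values, mm/dd/yyyy dates, and DBH or Height recorded but not both) can be handled when loading a file. The following is a minimal sketch using only the Python standard library, not the authors' R workflow. The sample rows are invented, use only a subset of the vegetation_plots.csv columns, and the block, site, and plot codes are hypothetical.

```python
import csv
import io
from datetime import datetime

# Invented rows mimicking a subset of vegetation_plots.csv columns.
SAMPLE = """BlockID,SiteID,PointID,Year,SurveyDate,Tree,DBH,Height,Dead
AA,BBB,BBB2,2017,06/15/2017,Acer rubrum,12.4,NA,no
AA,BBB,BBB2,2017,06/15/2017,Hamamelis virginiana,NA,1.2,no
"""

def read_rows(handle):
    """Yield each CSV row as a dict, mapping the string 'NA' to None."""
    for row in csv.DictReader(handle):
        yield {key: (None if value == "NA" else value) for key, value in row.items()}

def to_float(value):
    """Convert a numeric string to float, passing missing values through."""
    return None if value is None else float(value)

def parse_stem(row):
    """Convert the survey date and size measurements to Python types."""
    return {
        "point": row["PointID"],
        "date": datetime.strptime(row["SurveyDate"], "%m/%d/%Y").date(),
        "species": row["Tree"],
        "dbh_cm": to_float(row["DBH"]),
        "height_m": to_float(row["Height"]),
    }

stems = [parse_stem(row) for row in read_rows(io.StringIO(SAMPLE))]
for stem in stems:
    print(stem)
```

To read the real file, replace `io.StringIO(SAMPLE)` with `open("vegetation_plots.csv", newline="")`, assuming the file has been extracted from the zip archive into the working directory.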
Code/software
Data can be viewed in any text editor or spreadsheet program. All files were processed using R.
Complete code for running the analyses in the paper is available at https://doi.org/10.5281/zenodo.14847790
Access information
Other publicly accessible locations of the data:
Data on abundances of moth caterpillars were collected from 32 forest fragments in eastern Connecticut, USA, in June of 2017 - 2019. At each site:
1) Lepidopteran caterpillars were collected by beating branches over a white sheet within twelve 25 m^2^ plots.
2) Vegetation was sampled in three 100 m^2^ vegetation plots adjacent to the caterpillar plots.
3) Deer activity was assessed in the hectare centered on the caterpillar plots using a combination of pellet densities and browsing incidence on woody saplings within twenty-five 1 m^2^ plots at 20 m intervals, and 5 camera traps (deployed at each site for 2 weeks during January - March 2018).
The files here provide the raw data on caterpillar counts within the 12 caterpillar plots and woody plant abundances within the 3 vegetation plots.
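As an illustration of working with the raw caterpillar counts, the sketch below totals individuals per site and diet-breadth class. It is a hedged example rather than the authors' analysis. The sample rows and codes are invented, and two encodings are assumptions: that Specialist in catsurveys.csv is coded yes/no (as documented for Diet_categorization.csv), and that Count is NA whenever CatID is NA.

```python
import csv
import io
from collections import defaultdict

# Invented rows mimicking a subset of catsurveys.csv columns.
SAMPLE = """SiteID,PlotID,CatID,Count,Specialist
BBB,AA_BBB-P01,NA,NA,NA
BBB,AA_BBB-P01,XXXYYY,2,yes
BBB,AA_BBB-P02,ZZZWWW,1,no
CCC,AA_CCC-P01,XXXYYY,3,yes
"""

# (SiteID, Specialist) -> total number of caterpillars collected
totals = defaultdict(int)
for row in csv.DictReader(io.StringIO(SAMPLE)):
    if row["CatID"] == "NA":
        continue  # branch was sampled but no caterpillar was collected
    totals[(row["SiteID"], row["Specialist"])] += int(row["Count"])

for (site, specialist), count in sorted(totals.items()):
    print(site, specialist, count)
```

Rows with a CatID of NA are skipped when summing counts, but they still represent sampling effort and should be retained when computing densities per branch or per plot.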
Data were analyzed using generalized linear mixed-effects models. Code is provided on GitHub (DOI: 10.5281/zenodo.14847790).
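For a quick descriptive look at deer activity before fitting any models, raw per-site summaries can be computed from deer_browse.csv. The sketch below is a simple alternative to, not a reproduction of, the model-based BrowseProb and ScatPredict values in sites.csv. The sample rows and codes are invented. Because the description does not say whether each row is a plot or a plant species within a plot, the code records ScatPiles once per plot and year, and sums Plants and Browsed across rows.

```python
import csv
import io
from collections import defaultdict

# Invented rows mimicking a subset of deer_browse.csv columns.
SAMPLE = """Year,SiteID,PointID,PlantID,ScatPiles,Plants,Browsed
2017,BBB,D01,ACERU,0,4,1
2017,BBB,D02,ACERU,2,3,2
2017,BBB,D02,QUERU,2,3,1
2017,CCC,D01,NA,1,NA,NA
2017,CCC,D02,ACERU,0,5,0
"""

scat_by_plot = {}            # (SiteID, PointID, Year) -> scat piles in that 1 m^2 plot
plants = defaultdict(int)    # SiteID -> woody plants counted
browsed = defaultdict(int)   # SiteID -> woody plants with browsing evidence

for row in csv.DictReader(io.StringIO(SAMPLE)):
    site = row["SiteID"]
    if row["ScatPiles"] != "NA":
        # Keyed by plot and year so repeated rows do not double-count scat.
        scat_by_plot[(site, row["PointID"], row["Year"])] = int(row["ScatPiles"])
    if row["Plants"] != "NA" and row["Browsed"] != "NA":
        plants[site] += int(row["Plants"])
        browsed[site] += int(row["Browsed"])

scat_counts = defaultdict(list)
for (site, _point, _year), piles in scat_by_plot.items():
    scat_counts[site].append(piles)

summary = {}
for site, counts in scat_counts.items():
    summary[site] = {
        "scat_per_m2": sum(counts) / len(counts),
        "browsed_prop": browsed[site] / plants[site] if plants[site] else None,
    }

print(summary)
```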