Data for beta diversity analysis of insect herbivory evolution
Data files
Jan 24, 2025 version files 7.23 MB
- Dataset_S1.zip (45.44 KB)
- Dataset_S2.zip (922.76 KB)
- Dataset_S3.zip (6.25 MB)
- README.md (9.91 KB)
Abstract
Modern ecosystems display complex plant–insect associations that underwent a long evolutionary process since the appearance of mid-Paleozoic vascular plants. Although several major hypotheses explain the evolution of these plant–insect associations, the initial pattern of modern insect herbivory is poorly understood. To understand the antiquity of modern patterns of terrestrial arthropod herbivory, functional feeding group–damage type (FFG-DT) data were used to analyze a 305-million-year interval from Late Pennsylvanian to present, in which 134 plant assemblages were used to assess turnover (replacement of some species by other species between sites) and nestedness (difference in composition when no species are replaced between sites) in pairwise comparisons of DTs. Results of beta diversity analyses indicate the prototype pattern for modern insect herbivory was established on gymnosperm-dominated plant assemblages by late Middle Jurassic, antedating angiosperm dominance by 60 million years. Turnover among plant groups and FFGs declined in earlier late Paleozoic whereas during the later Cenozoic nestedness generally increased. Insect feeding on gymnosperms showed one pattern of change with low turnover and high nestedness whereas a bimodal pattern characterized angiosperms. Ferns and angiosperms exhibited less DT functional breadth (host-plant “specificity” by herbivores) than gymnosperms, reflecting major differences in links between insect herbivores and their host-plants. This fundamental trophic shift is consistent with the Mid Mesozoic Parasitoid Revolution, implying top-down control of herbivores by their consumers rather than bottom-up regulation of food sources that shaped the modern herbivory pattern. These findings provide a data-rich account of the ecological origins of modern herbivory.
README: Data from beta diversity analysis of insect herbivory evolution
Description of the data and file structure
Dataset S1
The insect herbivory data collected from Paleozoic to Cenozoic plant assemblages are provided in Dataset S1, an Excel file titled "Data Collection from Paleozoic to Cenozoic". It includes 180 plant assemblages, of which only 168 have data on Damage Types (DTs) and Functional Feeding Groups (FFGs). The richness of DTs assigned to each FFG is also provided.
1) For each plant assemblage, the following details are included:
- "Assemblage": the name of the plant assemblage or locality, covering all those we could find in a literature search before 2024 and named as in the cited literature, such as "Kimin" or "Iceland 9-8Ma";
- "Abbreviation": a short name for each plant assemblage. The NMDS figure, which shows the association between insect herbivory groups (FFGs) and plant assemblages, uses these abbreviations because it cannot display all full names;
- "Country": the modern country in which the plant assemblage is located, since fossil assemblages are distributed globally and their positions have shifted over time due to continental movement;
- "Age", "Period" and "Time": the geologic age of each plant assemblage, based on the International Chronostratigraphic Chart; and
- "Source": the reference for the assemblage, typically provided in the citations.
2) The detailed insect herbivory information for each plant assemblage is classified according to the Functional Feeding Group-Damage Type (FFG-DT) system (Labandeira et al., 2007; the full reference is given in the main text). This includes:
- "leaves": the number of plant specimens examined for each plant assemblage;
- "herbivorized specimens (%)": the percentage of examined plant specimens damaged by insects;
- Damage Types (DT); and
- Functional Feeding Groups (FFG), which are:
- "HF", Hole Feeding;
- "MF", Margin Feeding;
- "SK", Skeletonization;
- "SF", Surface Feeding;
- "OV", Oviposition;
- "PS", Piercing and Sucking;
- "MIN", Mining;
- "GAL", Galling;
- "SP", Seed Predation;
- "WB", Wood Boring;
- "PA", Pathogen.
- "specilized DT" (specialized DTs): the ratio of specialist DTs to the total DTs;
- "Source": the related citations.
Some values may not be available ("na").
- The designation (freq) marks a DT frequency, and (herb) marks the number of herbivorized specimens reported in an article that does not provide the total number of specimens examined; these results can be found in the associated references. An italic 1 indicates the presence, not the DT richness, of that FFG in the plant assemblage, as in the Fotan plant assemblage (Dong et al., 2019).
- Because the Damage Guide has an updated, unpublished version, there have been minor changes in DT assignments, in which a DT was removed from one functional feeding group and reassigned to another. Nevertheless, the data listed here differ little from results derived from the raw data, such as FFG richness.
Dataset S2
Dataset S2 contains raw data for 134 plant assemblages (both fossil and modern floras) used in the analyses of DT trends, beta diversity, NMDS, and specificity (Dataset S3). The raw data are organized into three folders: "paleozoic", "mesozoic", and "cenozoic". Each CSV file corresponds to a plant assemblage listed in Dataset S1. For example, the file for the "Daohugou" plant assemblage has one row per DT and one column per plant host, with values of 1 and 0 indicating the presence and absence of each DT. The 53 plant assemblages from the Carnian Stage (Triassic, Mesozoic) of South Africa are excluded; they are described in Labandeira et al., 2018.
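As an illustration of how a presence/absence table of this shape feeds into the turnover and nestedness comparisons, the following Python sketch partitions pairwise Sørensen dissimilarity into a turnover (Simpson) component and a nestedness-resultant component, following Baselga's partition. It is not the authors' R code: the CSV layout (a first column of DT labels followed by one column per host), the host and DT names, and the choice of the Sørensen family rather than the Jaccard family are all assumptions made for the example.

```python
import csv
import io
from itertools import combinations


def read_presence_absence(handle):
    """Return {host: set of DTs present} from a DT-by-host 0/1 table.

    Assumed (hypothetical) layout: a header row with a label column followed
    by host names; each later row holds a DT label and 0/1 values.
    """
    reader = csv.reader(handle)
    hosts = next(reader)[1:]
    present = {host: set() for host in hosts}
    for row in reader:
        if not row:
            continue  # skip blank lines
        dt_label, values = row[0], row[1:]
        for host, value in zip(hosts, values):
            if value.strip() == "1":
                present[host].add(dt_label)
    return present


def partition_beta(dts_x, dts_y):
    """Split Sorensen dissimilarity between two hosts into its components."""
    if not dts_x or not dts_y:
        raise ValueError("both hosts need at least one DT")
    a = len(dts_x & dts_y)  # DTs shared by both hosts
    b = len(dts_x - dts_y)  # DTs found only on the first host
    c = len(dts_y - dts_x)  # DTs found only on the second host
    total = (b + c) / (2 * a + b + c)        # Sorensen dissimilarity
    turnover = min(b, c) / (a + min(b, c))   # Simpson (turnover) component
    return {"turnover": turnover, "nestedness": total - turnover, "total": total}


# Made-up example table; not taken from Dataset S2.
EXAMPLE_CSV = """DT,HostA,HostB,HostC
DT001,1,1,1
DT002,1,1,0
DT003,1,0,0
DT004,0,1,0
"""

hosts = read_presence_absence(io.StringIO(EXAMPLE_CSV))
for x, y in combinations(hosts, 2):
    print(x, y, partition_beta(hosts[x], hosts[y]))
```

In this made-up table, HostA and HostB each carry one DT that the other lacks, so their dissimilarity is entirely turnover; HostC carries a subset of the DTs on HostA, so that dissimilarity is entirely nestedness.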
Dataset S3
All related analysis results and figures are provided in Dataset S3. The dataset includes:
1) "dtrich_age", a dot-line trend chart showing the richness of DTs and FFGs across 131 fossil plant assemblages and three modern plant assemblages. Each datapoint represents the DT richness of a plant assemblage, for DT totals (left column) and for each of its 12 FFGs (right columns). The three focal mid-Mesozoic plant assemblages of Daohugou (165 Ma), Dawangzhangzi (125 Ma), and Rose Creek (103 Ma) are indicated by arrows at right. In the wide DT totals column at left, symbols denote the raw data (open circles), mild lumping (open squares), and full lumping (open diamonds); where all three symbols overlap they appear as open polygons, representing analogous DTs for each plant assemblage. Horizontal lines in the DT totals column represent 95% confidence intervals. The narrow columns (black dots), from left to right, represent the DT richness for each FFG: margin feeding (MF), hole feeding (HF), skeletonization (SKE), surface feeding (SF), oviposition (OVI), piercing and sucking (P&S), mining (MIN), galling (GAL), borings (BOR), seed predation (SP), and pathogens (PAT). The associated data are saved as a data.frame, with terms consistent with those in Dataset S1.
2) "ternary", a ternary diagram illustrating the null distribution and early versions of damage type comparisons among 131 fossil and three modern plant assemblages across seven time intervals. Panel (A) shows a composite null distribution of pairwise comparisons of damage types in the 131 fossil and three modern plant assemblages from pooled data, which is subsequently partitioned into the seven indicated time intervals in (C)–(I). The upper right of (B) shows hypothetical examples of turnover and nestedness. In each grid, the two rows represent two host plants and the six columns represent six damage types; blue squares indicate the presence of a damage type and white squares indicate its absence. The lower right of (B) shows the mean DT richness and minimum sample coverage, whereby the color and size of each datapoint indicate the mean DT richness for two host plants from the same plant assemblage. The higher the coverage, the more widely distributed the DT is in the plant assemblage. The opacity of each datapoint indicates the minimum to maximum sample coverage; the basic sample coverage is 0.85–0.99 (85–99%). All plant assemblages were “standardized” in three categories: raw data (circles), “mild lumping” (triangles), and “full lumping” (diamonds). All 134 plant assemblages consisted of leaves that met the broadleaved standard of a sample coverage of 85% or higher, or alternatively had more than 20 specimens for each plant host species. Average values for similarity (1-β), turnover β(Turnover), and nestedness β(Nestedness) are indicated for each time interval. The associated data are saved as "ternary", with terms such as turnover, Nestedness, and Ubeta, which represent the diversity of insect herbivory and were generated using the R package "betapart".
3) "compareNT", a bar chart comparing turnover and nestedness values across 134 plant assemblages. Mean values and 95% confidence intervals are shown for the beta diversity metrics, generated by resampling and subsampling data for the major plant hosts from fossil and modern plant assemblages. Dark purple columns represent data resampled 1000 times and dark green columns represent data resampled 500 times. The solid horizontal line represents the raw data. Nine representative plant assemblages are shown in the main text. The associated data are saved as "compareNT", which also includes terms such as turnover, Nestedness, and Ubeta, generated using the R package "betapart". "NA" indicates that data are not available.
4) "igs-age-type", a cumulative kurtosis chart showing the functional breadth of DTs across geological time, categorized by plant type (ferns, gymnosperms, angiosperms). In "type", the upper left panel shows damage-type functional breadth by geological era: Paleozoic (oldest), Mesozoic, and Cenozoic (youngest). In "cmp", the lower left panel shows damage-type functional breadth by major plant taxa. In "type_cmp", the panel at right shows damage-type functional breadth by major plant taxa through the three geologic eras. In each panel the horizontal axis represents the logarithm of the DT frequency. Each dot represents the frequency of a DT on each plant host. Box plots represent the data distribution (including mean values and floating ranges). The higher the peak value, the more the data are concentrated along the X-axis. The associated data are saved as "igs-age-type", including the DT species, host, and specificity of each plant assemblage.
5) "NMDS", nonmetric multidimensional scaling (NMDS) plots showing the relationships between the major plant clades/groups and their functional feeding groups, for plant assemblages from the late Paleozoic to Mesozoic in (A) and for all plant assemblages from the late Paleozoic, Mesozoic, Cenozoic, and modern in (B). Each data point represents the DT richness on foliage specimens of a plant species with a sample coverage higher than 0.85 (85%) or otherwise represented by more than 20 specimens. The pathogen, boring, and seed predation FFGs are not included, as these FFGs are not documented in all plant assemblages. Ellipse outlines are 84% confidence intervals. Abbreviations and details of all plant assemblages can be found in Dataset S1.
The analysis was conducted using R, and the R code is provided in a text file.
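The inclusion rule described for the "ternary" and "NMDS" items (a sample coverage of at least 0.85, or more than 20 specimens per host species) can be sketched as follows. This is a Python illustration, separate from the authors' R code. The coverage estimator is not specified in this README, so Good's estimator (one minus the share of occurrences that belong to DTs seen only once) is used here purely as an assumption; the function names and the example records are hypothetical.

```python
from collections import Counter


def goods_coverage(dt_occurrences):
    """Good's coverage estimate, 1 - f1/n, for a list of DT occurrences.

    n is the total number of occurrences and f1 the number of DTs that
    were recorded exactly once. Assumed estimator, not the authors' method.
    """
    counts = Counter(dt_occurrences)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no DT occurrences recorded")
    singletons = sum(1 for count in counts.values() if count == 1)
    return 1.0 - singletons / n


def host_qualifies(dt_occurrences, n_specimens,
                   min_coverage=0.85, min_specimens=20):
    """Apply the rule: coverage >= 0.85, or more than 20 specimens."""
    if n_specimens > min_specimens:
        return True
    if not dt_occurrences:
        return False
    return goods_coverage(dt_occurrences) >= min_coverage


# Hypothetical host: ten DT occurrences, one DT seen only once.
well_sampled = ["DT001"] * 5 + ["DT002"] * 4 + ["DT003"]
# Hypothetical host: every DT seen only once.
poorly_sampled = ["DT001", "DT002", "DT003"]

print(goods_coverage(well_sampled))
print(host_qualifies(well_sampled, n_specimens=10))
print(host_qualifies(poorly_sampled, n_specimens=15))
print(host_qualifies(poorly_sampled, n_specimens=25))
```

The two thresholds are keyword arguments so that a different cutoff can be tried without editing the function body.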
Sharing/Access information
Code/Software
The data analysis was conducted using R, with several key packages including "ternary", "betapart", "vegan", "dplyr", "purrr", "scico", "ggplot2", "gridExtra", and others. Several functions were used to calculate data coverage and to resample and subsample the data.
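As a rough illustration of the resampling step, the Python sketch below draws bootstrap resamples from a set of values and reports the mean with a percentile-based 95% interval. The resampling unit, the interval method, the example values, and the function and variable names are assumptions, not the authors' R implementation; the replicate counts of 1000 and 500 mirror those described for "compareNT".

```python
import random
import statistics


def resample_mean_ci(values, n_resamples=1000, seed=0):
    """Return (mean, lower, upper) from resampling with replacement.

    lower and upper are the 2.5th and 97.5th percentiles of the resampled
    means (a percentile interval; an assumed choice of method).
    """
    if len(values) < 2:
        raise ValueError("need at least two values to resample")
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    means = []
    for _ in range(n_resamples):
        draw = rng.choices(values, k=len(values))  # sample with replacement
        means.append(statistics.fmean(draw))
    # n=40 yields 39 cut points; the first and last are 2.5% and 97.5%.
    cuts = statistics.quantiles(means, n=40)
    return statistics.fmean(means), cuts[0], cuts[-1]


# Hypothetical turnover values for one plant host; not from Dataset S3.
turnover_values = [0.12, 0.30, 0.25, 0.41, 0.18, 0.33, 0.27]

for replicates in (1000, 500):
    mean, lower, upper = resample_mean_ci(turnover_values, replicates, seed=1)
    print(replicates, round(mean, 3), round(lower, 3), round(upper, 3))
```

Running the function with both replicate counts gives a quick check on whether the interval is stable with respect to the number of resamples.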
Methods
All the related raw data, figures and extended analyses are provided in the three datasets.