The role of evolving niche choice in herbivore adaptation to host plants: Literature survey data and R scripts for simulations
Data files
Dec 21, 2024 version files 110.74 KB
- Choice.csv (6.12 KB)
- Fecundity.csv (14.04 KB)
- Full.csv (55.32 KB)
- Genetics.csv (5.84 KB)
- Mating.csv (1.40 KB)
- README.md (10.40 KB)
- Resistance.csv (16.30 KB)
- Species.csv (1.33 KB)
Abstract
Individuals living in heterogeneous environments often choose microenvironments that provide benefits to their fitness. Theory predicts that such niche choice can promote rapid adaptation to novel environments and help maintain genetic diversity. An open question of large applied importance is how niche choice and niche choice evolution affect the evolution of insecticide resistance in phytophagous insects. We, therefore, developed an individual-based model based on phytophagous insects to examine the evolution of insecticide resistance and niche choice via oviposition preferences. To find biologically realistic parameter ranges, we performed an empirical literature survey on insecticide resistance in major agricultural pests and also conducted a density-dependent survival experiment using potato beetles. We find that, in comparison to a scenario where individuals randomly oviposit eggs on toxic or non-toxic plants, the evolution of niche choice generally leads to slower evolution of resistance and facilitates the coexistence of different phenotypes.
Our simulations also reveal that recombination rate and dominance effects can influence the evolution of both niche choice and resistance. Thus, this study provides new insights into the effects of niche choice on resistance evolution and highlights the need for more studies on the genetic basis of resistance and choice.
README: The role of evolving niche choice in herbivore adaptation to host plants: Literature survey data and R scripts for simulations
https://doi.org/10.5061/dryad.gtht76hwx
Description of the raw data files: Prepared by Alitha Edison, 07/08/2023.
Files and variables
**Resistance.csv** with variables below
T_Mortliaty_Mean: Average mortality of the population on toxic plants
C_Mortliaty_Mean: Average mortality of the population on nontoxic plants
Resistance: Whether it is a resistant (R), susceptible (S), or cross (RS, SR, …) population
PaperID: First one or two digits denote the species, followed by year and last name of first author
**Choice.csv** with variables below
CI: Choice index = (Number of eggs on toxic plants-Number of eggs on nontoxic plants)/(Number of eggs on toxic plants+Number of eggs on nontoxic plants)
T-eggs: Number of eggs on toxic plants
C-eggs: Number of eggs on nontoxic plants
PaperID: First one or two digits denote the species, followed by year and last name of first author
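The choice index defined above can be recomputed from the two egg counts. The following is a minimal sketch, not taken from the authors' scripts: the function name `choice_index` and the egg counts are made up for illustration, and returning `NA` when no eggs were laid is an assumption.

```r
# Choice index as defined for Choice.csv:
# (eggs on toxic - eggs on nontoxic) / (eggs on toxic + eggs on nontoxic)
choice_index <- function(t_eggs, c_eggs) {
  total <- t_eggs + c_eggs
  # Assumption: report NA when no eggs were laid, instead of NaN from 0/0.
  ifelse(total == 0, NA_real_, (t_eggs - c_eggs) / total)
}

# Made-up egg counts for illustration only (not taken from the data set).
example <- data.frame(
  "T-eggs" = c(30, 10, 0),
  "C-eggs" = c(10, 30, 0),
  check.names = FALSE  # keep the hyphenated column names as written
)
example$CI <- choice_index(example[["T-eggs"]], example[["C-eggs"]])
print(example)
```

The index ranges from -1 (all eggs on nontoxic plants) to 1 (all eggs on toxic plants). When reading the file itself, passing `check.names = FALSE` to `read.csv()` keeps column names such as `T-eggs` unchanged; otherwise R rewrites them to `T.eggs`.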
**Mating.csv** with variables below
Same#: Number of mates chosen from the same host
Other#: Number of mates chosen from the other host
Total: Sum of Same# and Other#
Same_prop: Same#/Total
Resistance: Whether it is a resistant (R) or susceptible (S) population
PaperID: First one or two digits denote the species, followed by year and last name of first author
**Genetics.csv** with variables below
Res_domin_coeff: dominance coefficient of resistance
n_Res_loci: number of resistance loci
Freq_Res_alleles: frequency of resistance alleles
PaperID: First one or two digits denote the species, followed by year and last name of first author
**Fecundity.csv** with variables below
Eggs/female: eggs laid per female
Resistance: Whether it is a resistant (R) or susceptible (S) population
Hatching rate: rate of eggs hatched, wherever it was reported
Plant: Whether the eggs laid per female were measured on a toxic (T) or nontoxic (C) plant
Time (days): Total duration of time when eggs were laid
Number of individuals: Total number of individuals in the arena when eggs were collected to calculate eggs/female where it was reported
PaperID: First one or two digits denote the species, followed by year and last name of first author
**Full.csv**
Gives DOIs of all articles screened, whether data was extracted, the number of data points extracted for each sheet, and the reason for exclusion
PaperID: First one or two digits denote the species, followed by year and last name of first author
Other variables include: R, C, M, G, and F.
These denote the csv file names, i.e., Resistance (R), Choice (C), Mating (M), Genetics (G), and Fecundity (F). The numbers in these columns give the number of data points extracted for each csv file from each article.
**Species.csv**
Gives the list of species
SpeciesRank: code of the species; from 1 to 10
Empty cells
Empty cells in the data columns represent missing values, for example because a value was not available or not applicable. These cells are deliberately left blank rather than filled with NA or NULL, as such entries may interfere with the analysis scripts.
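For readers who want to explore the CSV files outside the provided scripts, the sketch below shows one way to read blank cells as missing values in R. It is an illustration only: the two-row excerpt, including the PaperID values, is invented, and the authors' own scripts may handle blank cells differently.

```r
# Hypothetical two-row excerpt; the PaperID values and numbers are invented.
csv_text <- "PaperID,Resistance,T_Mortliaty_Mean
1_2001_Example,R,12.5
1_2001_Example,,
"

# na.strings = c("", "NA") turns blank cells into NA in character columns too
# (blank cells in numeric columns become NA by default).
dat <- read.csv(text = csv_text, na.strings = c("", "NA"),
                stringsAsFactors = FALSE)

# Count missing values per column, then keep rows with a reported mortality.
print(colSums(is.na(dat)))
complete <- dat[!is.na(dat$T_Mortliaty_Mean), ]
```

Replacing `text = csv_text` with a file name such as `"Resistance.csv"` applies the same call to the real data.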
=====================================================================================================================
Description of the data and file structure for producing results/figures in the manuscript, Prepared by Peter Nabutanyi.
We provide the R scripts for simulating and analysing the individual-based model "The role of evolving niche choice in herbivore adaptation to host plants". We also provide R scripts for analysing the metadata, showing how the parameter distributions for the model simulations were obtained.
To run any of the files, FIRST CREATE A FOLDER NAMED "PLANTHERBIVORECHOICE" and set it as your working folder/directory.
In your working directory/folder, place the data files and R scripts below:
Files and variables
Data files required include:
Choice.csv
Resistance.csv
Mating.csv
Genetics.csv
Fecundity.csv
Full.csv
Description: These are raw csv data files.
Variables
- See description above
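Before running anything, it can help to confirm that the working folder contains the required data files. The following is a minimal sketch under stated assumptions: the helper `missing_files()` is hypothetical, and the folder is created in a temporary location purely for illustration (use your own path in practice).

```r
# Data files listed above as required.
required_data <- c("Choice.csv", "Resistance.csv", "Mating.csv",
                   "Genetics.csv", "Fecundity.csv", "Full.csv")

# Hypothetical helper: names of required files that are absent from `dir`.
missing_files <- function(dir, files = required_data) {
  files[!file.exists(file.path(dir, files))]
}

# Illustration in a temporary location; replace with your own folder path.
work_dir <- file.path(tempdir(), "PLANTHERBIVORECHOICE")
dir.create(work_dir, showWarnings = FALSE, recursive = TRUE)

absent <- missing_files(work_dir)
if (length(absent) > 0) {
  message("Still missing: ", paste(absent, collapse = ", "))
} else {
  setwd(work_dir)  # only switch once everything is in place
}
```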
Code/software
The R files include:
"IBM_PlantHerbivore_ParameterSamplingSpeciesAnalysis_SV.R",
"IBM_PlantHerbivore_ParameterSamplingSpeciesExecute_SV.R",
"IBM_PlantHerbivore_ParameterSamplingSpeciesPlotting_SV.R",
"IBM_PlantHerbivore_SimulationAnalysis_SV.R",
"IBM_PlantHerbivore_SimulationExecution_SV.R",
"IBM_PlantHerbivore_SimulationPlotting_SV.R",
"IBM_PlantHerbivore_SV.R",
"MetaAnalysisData_EachSpecies_SV.R", and
"Density_Dependent_Survival_Fitting_SV.R".
The script "MetaAnalysisData_EachSpecies_SV.R" extracts data from the csv files, analyses the extracted data, and produces plots used in the manuscript. It also produces and saves (as CSV files) the parameter distributions used during the sampling simulations. Therefore, THIS SCRIPT MUST BE RUN BEFORE THE "IBM_PlantHerbivore_ParameterSamplingSpeciesExecute_SV.R" SCRIPT. The density-dependent survival function is obtained from the "Density_Dependent_Survival_Fitting_SV.R" script. This script contains all the data and the code for fitting. Note that warning/error messages may appear when producing figures from this script.
These messages are expected: some plots in the middle of a for-loop have no data points, and we did not exclude such cases manually. The code will still run to the end of the loop.
The main function for the individual-based model is given in the R script
"IBM_PlantHerbivore_SV.R". This script is executed using
"IBM_PlantHerbivore_SimulationExecution_SV.R" and
"IBM_PlantHerbivore_ParameterSamplingSpeciesExecute_SV.R" scripts for parameter exploration and parameter distribution sampling simulations respectively.
The data produced by "IBM_PlantHerbivore_SimulationExecution_SV.R" is analysed using the "IBM_PlantHerbivore_SimulationAnalysis_SV.R" script. After the analysis, the plots are obtained by running the "IBM_PlantHerbivore_SimulationPlotting_SV.R" script.
Similarly, for the empirical data sampling, we use the scripts
"IBM_PlantHerbivore_ParameterSamplingSpeciesExecute_SV.R",
"IBM_PlantHerbivore_ParameterSamplingSpeciesAnalysis_SV.R", and
"IBM_PlantHerbivore_ParameterSamplingSpeciesPlotting_SV.R" for data simulation, data
analysis, and plotting respectively. The scripts should be run in this order: execution, analysis, and plotting.
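The run order described above can be written down as a small driver. This is a sketch, not part of the deposited code: the helper `run_in_order()` is hypothetical, and it assumes each script can be run unattended with `source()` from the working directory.

```r
# Hypothetical helper: source the given scripts one after another, in order.
run_in_order <- function(scripts, dir = ".") {
  paths <- file.path(dir, scripts)
  absent <- scripts[!file.exists(paths)]
  if (length(absent) > 0) {
    stop("Missing script(s): ", paste(absent, collapse = ", "))
  }
  for (p in paths) {
    message("Running ", basename(p))
    source(p)  # evaluated in the global environment by default
  }
  invisible(scripts)
}

# Parameter exploration: execution, then analysis, then plotting.
exploration <- c(
  "IBM_PlantHerbivore_SimulationExecution_SV.R",
  "IBM_PlantHerbivore_SimulationAnalysis_SV.R",
  "IBM_PlantHerbivore_SimulationPlotting_SV.R"
)

# Parameter distribution sampling: the meta-analysis script comes first
# because it saves the parameter distributions that the sampling needs.
sampling <- c(
  "MetaAnalysisData_EachSpecies_SV.R",
  "IBM_PlantHerbivore_ParameterSamplingSpeciesExecute_SV.R",
  "IBM_PlantHerbivore_ParameterSamplingSpeciesAnalysis_SV.R",
  "IBM_PlantHerbivore_ParameterSamplingSpeciesPlotting_SV.R"
)

# Uncomment once the scripts and data files are in the working directory:
# run_in_order(exploration)
# run_in_order(sampling)
```

Stopping on a missing script before anything runs avoids starting a long simulation that a later step cannot finish.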
Note, however, that to get the figures in the manuscript, one has to pick plots produced using specified parameters and combine the single plots to produce the whole figure supplied in the manuscript.
It is easy to isolate the required plots. For example, for scenarios where we explore the effect of one parameter while keeping the others constant, note down the indices of the default parameters; the index of the focal parameter (the one on the x-axis) is 0. Form the search string from these indices and search for it in the specific plot folder. The name string contains "mean" for means and "var" for variances. All in all, one has to pay attention to the name strings used when running the plotting scripts. Note also that, for a given parameter combination, default parameters are already fixed within the loops. Within the scripts, all the parameters are described in the comments. The symbols used for parameters in the scripts differ from those used in the manuscript, partly because some symbols are names of generic functions within R. Therefore, we strongly advise reading through the parameter descriptions provided within the scripts to determine the right parameter combinations. Also, note that parameters not used in our manuscript are clearly identified via comments.
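Searching a plot folder by name string can also be done from R. The sketch below is illustrative only: the helper `find_plots()` and the file names are hypothetical, since the exact naming pattern is defined in the plotting scripts.

```r
# Hypothetical helper: list plot files whose names contain the statistic
# ("mean" or "var") and, optionally, a parameter-index string.
find_plots <- function(plot_dir, statistic = c("mean", "var"),
                       index_string = "") {
  statistic <- match.arg(statistic)
  files <- list.files(plot_dir)
  keep <- grepl(statistic, files, fixed = TRUE)
  if (nzchar(index_string)) {
    keep <- keep & grepl(index_string, files, fixed = TRUE)
  }
  files[keep]
}

# Demonstration with made-up file names in a temporary folder.
demo_dir <- tempfile("plots")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("plot_mean_3_0_2.pdf",
                                  "plot_var_3_0_2.pdf",
                                  "plot_mean_1_0_2.pdf")))
print(find_plots(demo_dir, "mean", "3_0_2"))
```

Fixed-string matching is used so that characters in the index string are never interpreted as regular-expression syntax.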
Note that the number of simulations/parameter sets indicated in the scripts is much lower than that used to produce the results. We used 1000 replicates for the parameter exploration simulations, and 500 parameter sets with 100 replicates per parameter set for the parameter distribution sampling scenarios. However, the scripts as provided specify only 5 replicates/parameter sets in all cases. The actual values are given in comments within the R scripts. Also, in the plotting scripts, the axis limits and legends may have to be changed manually to suit the specific parameter combination(s). Other parameters are indicated as used in the production of the actual results.
WARNING! Simulations with the actual parameters used to produce the figures in the manuscript may need a lot of time, ranging from hours to weeks, even when using a cluster node with 20 cores! All the execution scripts are parallelized to run on multiple cores, so it is more suitable to use a computer or cluster with more cores. For replicates, we used either 1000 or 500, as indicated above. To avoid long waiting times on the cluster for a node with 20 cores, these replicates were divided during the actual simulations and later combined. In addition, different parameter combinations were also run separately. In the execution codes provided, we assume all replicates will be run in one script. You are advised to run one parameter combination at a time and not all in one script. This recommendation applies specifically to the "IBM_PlantHerbivore_SimulationExecution_SV.R" script. Similarly, simulations for parameter distribution sampling can be split: instead of running all 500 parameter sets/100 replicates or 11 species in the for loop, one can run, say, 50 parameter sets/10 replicates or 2 species at a time and later combine the data. This comment applies to the "IBM_PlantHerbivore_ParameterSamplingSpeciesExecute_SV.R" script. In addition, one can reduce the length of the parameter vectors so that fewer parameter values are run. However, you then have to adjust the indices of the default parameters supplied within the for loops accordingly.
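Splitting replicates into batches and combining the outputs afterwards could look roughly like the sketch below. It is not the authors' workflow: the helpers `split_replicates()` and `combine_chunks()` are hypothetical, and the assumption that each batch writes a CSV file with identical columns may not match the format the execution scripts actually save.

```r
# Split replicate IDs 1..n_total into consecutive chunks of chunk_size.
split_replicates <- function(n_total, chunk_size) {
  ids <- seq_len(n_total)
  split(ids, ceiling(ids / chunk_size))
}

# Row-bind per-chunk result files (assumed: CSVs with identical columns).
combine_chunks <- function(files) {
  do.call(rbind, lapply(files, read.csv))
}

# 1000 replicates in batches of 50 gives 20 batches.
chunks <- split_replicates(1000, 50)

# Toy demonstration: write two placeholder chunk files, then combine them.
out_dir <- tempfile("chunks")
dir.create(out_dir)
for (k in 1:2) {
  toy <- data.frame(replicate = chunks[[k]], result = NA_real_)  # no real results
  write.csv(toy, file.path(out_dir, sprintf("chunk_%02d.csv", k)),
            row.names = FALSE)
}
combined <- combine_chunks(list.files(out_dir, full.names = TRUE))
print(nrow(combined))
```

Zero-padded chunk numbers keep `list.files()` output in run order, so the combined table preserves the replicate sequence.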
These scripts were run on a Linux computing cluster (Ubuntu 22.04.3). The produced data was analysed on a Linux operating system (Ubuntu release 22.04) with R version 4.3.2 (2023-10-31); the scripts were also tested on the Ubuntu 20.04 release.
Access information
Other publicly accessible locations of the data:
- It is also provided directly as supplementary material to the journal.
Data was derived from the following sources:
- Literature survey from the Arthropod Pesticide Resistance Database.
Methods
The data was obtained from the Arthropod Pesticide Resistance Database, from which we compiled a list of the top 10 agricultural pests with the most cases of pesticide resistance and ranked them according to the number of compounds they were reported to be resistant against.
These are R scripts used during the simulation.