Data from: Spider venom potency exhibits phylogenetic prey-specificity but does not trade-off with body size or silk use in prey capture
Data files
May 07, 2025 version files (402.96 KB total):
- README.md (11.96 KB)
- S1_main_dataset_25.csv (336.95 KB)
- S2_body_mass_length_dataset.csv (54.05 KB)
Abstract
Spiders employ a diverse range of predator traits including potent venoms, complex silk hunting strategies and mechanical strength coupled with larger body sizes to capture prey. This trait diversity, along with the quantifiable nature of venom potency, makes spiders an excellent group to study evolutionary trade-offs. Yet, comparative approaches have been historically confounded by the use of atypical prey models to measure venom potency. Here, we account for such confounding issues by incorporating the phylogenetic similarity between a spider's diet and the species used to measure its venom potency. Using a phylogenetic comparative analysis of 75 spider species to test how diet, silk use in prey capture and body size drive venom yield and potency (LD50), we show that spider venoms are generally more potent against models more closely related to their natural prey, reflecting prey-specific patterns. We find that venom yield scales sublinearly with size, reflecting the 0.75 allometric scaling predicted by metabolic theory, suggesting venom is metabolically expensive in spiders. Our approach demonstrates how contemporary comparative approaches can be applied to historic venom potency measures to test fundamental evolutionary patterns in predator traits.
Dataset DOI: 10.5061/dryad.76hdr7t4j
Description of the data and file structure
The following abbreviations are used to refer to supplementary files:
S1 = Main dataset.
S2 = Spider body length and body mass dataset for calculating Mass = Length*2.6.
S3 = R Studio script to reproduce both main and supplementary models.
S4 = R Studio script and files to reproduce the phylogenetic tree (adapted from Wolff et al., 2022).
S5 = Main and supplementary model outputs (results).
S6 = Detailed methodology and data descriptions.
Files and variables
The following describes how the supplementary files are to be used in relation to this study:
There are two dataset files, Dataset S1 (main) and Dataset S2 (Supplementary).
Dataset S1 is to be used with the S3 R Studio script to reproduce the models and results for the tables and plots in the main manuscript. Dataset S1 is also to be used with the S4 R Studio script, along with the other files in the S4 folder (base phylogeny adapted from Wolff et al., 2022), to reproduce the phylogenetic tree.
Dataset S2 is to be used in conjunction with S3 R Studio script to reproduce supplementary model S1, the results of which were used to convert the mean spider length (mm) data found in dataset S1 to body mass (g) via the power law Mass = Length*2.6. See main manuscript or S6 file for more details.
File: S1_main_dataset_25.csv
Dataset S1 contains all the data for the main analyses including:
venom potency data (median lethal dose or LD50).
total spider body length (mm).
silk use in prey capture, broken down into two categories: “yes” or “no” and “silk primary”.
venom yield (milligram of lyophilized (dried) venom per spider or mg dried/spider).
LD50 model species injected and associated data (injection method and LD50 model mass (g)).
DLD50Diet [EVO_DIST], the mean phylogenetic distance between the LD50 model species and the natural prey species, measured in hundreds of millions of years (hundreds of Mya) and weighted by the proportion of each prey item in the diet.
The associated references for the data collected.
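As an illustration of the diet-weighted mean distance described above for DLD50Diet, the following is a minimal Python sketch and not the authors' code (their scripts are written in R, and the exact procedure is described in S6). The function name, the input layout and the example values are all hypothetical.

```python
def weighted_mean_distance(diet):
    """Return the diet-weighted mean distance to the LD50 model species.

    `diet` is a list of (proportion_of_diet, distance_to_model) pairs.
    Distances are assumed to be in hundreds of millions of years.
    """
    total_weight = sum(proportion for proportion, _ in diet)
    if total_weight <= 0:
        raise ValueError("diet proportions must sum to a positive value")
    # Dividing by the total weight keeps the result a mean even when the
    # proportions do not sum exactly to 1.
    return sum(p * d for p, d in diet) / total_weight


# Hypothetical spider: 75% of the diet is a prey group 1.0 units from the
# LD50 model species and 25% is a prey group 5.0 units away.
example_diet = [(0.75, 1.0), (0.25, 5.0)]
print(weighted_mean_distance(example_diet))  # 2.0
```

A prey item that makes up most of the diet dominates the result, so a spider tested on a model close to its main prey receives a small distance even if it occasionally takes distantly related prey.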
Detailed column name descriptions for Dataset S1 can be found below. Also note, any blanks found in the S1 dataset are represented by "null".
Variables for dataset S1
- species = Spider species scientific name (genus + species).
- taxa = Spider family.
- reported_sp._name = Spider species name as reported from source.
- animal = One of several columns used for producing phylogeny.
- species_matched = One of several columns used for producing phylogeny.
- silk_yn = Silk use in prey capture, listing a species as a silk hunter (yes) or non-silk hunter (no).
- silk_use_ref = Reference for silk hunting strategy source.
- body_length_mm = Mean spider body length in mm.
- body_length_mm_min = Minimum spider body length in mm.
- body_length_mm_max = Maximum spider body length in mm.
- body_length_ref = Reference for spider body length source.
- venom_yield_mg_spider_dried = Venom yield measured in mg dried per spider.
- venom_yield_mg_spider_min = Minimum mg dried per spider.
- venom_yield_mg_spider_max = Maximum mg dried per spider.
- venom_extraction_method = Method used to obtain venom samples from spiders.
- dried_or_not_dried = If venom sample was lyophilised or not during the preparation process.
- venom_yield_and_extraction_ref = Reference for spider venom yield and extraction method source.
- venom_yield_ref_2 = Reference for the venom yield value of spiders where the LD50 study did not report the venom yield.
- ld50_raw = Raw LD50 value as reported in the source.
- ld50_raw_unit = Unit in which the raw LD50 was reported, e.g. mg venom per kg of test subject.
- ld50_mgkg_converted = The converted LD50 recorded in mg/kg. This measure was used for LD50 analysis.
- ld50_mgkg_converted_min = Converted LD50 mgkg minimum.
- ld50_mgkg_converted_max = Converted LD50 mgkg maximum.
- ld50_measurement_error_type = Reported range of error type.
- ld50_method_summarised = LD50 injection method with 5 categories, 3 for vertebrates (iv, ip and sc) and 2 for invertebrates (t/c and abdomen).
- ld50_model = Injected model species scientific name.
- ld50_model_mass_g_mean = Mean mass for injected model species measured in grams.
- ld50_model_mass_g_max = Maximum mass for injected model species measured in grams.
- ld50_ref = Reference for spider LD50 and associated data source.
- EVO_DIST = Total evolutionary distance between injected test model and spider’s focal prey (i.e. DLD50Diet).
- number_classes_in_diet = The number of taxonomic classes reported in the diet of a given species (1 class = 1).
- diet_ref1-10 = References for quantitative and qualitative diet data sources.
- diet_inferred = Lists if spider diet was partially (part), fully (all) or not (no) inferred from other closely related species.
- number_of_species_diets_ = Lists number of species diet data was obtained from. For example, if one species had part of its diet inferred from two closely related species of which the diet is reported to be similar, the number of species diet data was obtained from would equal 3.
- quan_or_qual = Lists if diet data was “quantitative” only, “qualitative” only or includes “both”.
- diet_include_vert_yn = Factor for model s3, denoting if vertebrates were reported in a spider species diet or not (yes/no).
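Because blanks in the datasets are written as "null", numeric columns need that marker converted to a missing value before any arithmetic. The sketch below shows one way to do this with the Python standard library. It is an illustration only, since the authors' scripts (S3 and S4) are written in R. The helper names and the two sample rows are hypothetical, and only the column names are taken from the list above.

```python
import csv
import io


def read_rows(handle, missing="null"):
    """Read CSV rows as dicts, turning the `missing` marker into None."""
    reader = csv.DictReader(handle)
    return [
        {key: (None if value == missing else value) for key, value in row.items()}
        for row in reader
    ]


def to_float(value):
    """Convert a text cell to float, keeping missing values as None."""
    return None if value is None else float(value)


# Made-up stand-in for S1_main_dataset_25.csv with a few of its columns.
sample = io.StringIO(
    "species,silk_yn,body_length_mm,ld50_mgkg_converted\n"
    "Examplus unus,yes,12.5,null\n"
    "Examplus duo,no,null,0.8\n"
)

rows = read_rows(sample)
lengths = [to_float(row["body_length_mm"]) for row in rows]
print(lengths)  # [12.5, None]
```

With the real file, the `io.StringIO` object would be replaced by an opened file handle, for example `open("S1_main_dataset_25.csv", newline="")`.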
File: S2_body_mass_length_dataset.csv
Dataset S2 is to be used in conjunction with S3 R Studio script to reproduce supplementary model S1, containing data on spider body length (mm) and spider body mass (g).
Detailed column name descriptions for Dataset S2 can be found below. Also note, any blanks found in the S2 dataset are represented by "null".
Variables for dataset S2
- species = Spider species scientific name (genus + species).
- family = Spider family.
- genus_syn = Alternate spider genus name as reported from sources.
- species_syn = Alternate spider species name as reported from sources.
- body_mass_g_mean = Mean spider body mass measured in grams.
- body_mass_g_min = Minimum spider body mass measured in grams.
- body_mass_g_max = Maximum spider body mass measured in grams.
- measure = Denotes what type of measurement is reported, either a single observation or the mean of multiple individuals.
- sex = If females only, males only or both are included in the reported measurement.
- life_stage = If the reported measurement included adults only or different life stages.
- sample_size = The number of individuals measured.
- reference and reference_2 = References for body mass measurement sources.
- body_length_mm_mean = Mean spider body length measured in millimetres.
- body_length_mm_min = Minimum spider body length measured in millimetres.
- body_length_mm_max = Maximum spider body length measured in millimetres.
- measure = Denotes what type of measurement is reported, either a single observation or the mean of multiple individuals (body length).
- sex = If females only, males only or both are included in the reported measurement (body length).
- life_stage = If the reported measurement included adults only or different life stages (body length).
- sample_size = The number of individuals measured (body length).
- Reference – reference_7 = References for body length measurement sources.
There are two R Studio Script files, one to reproduce the results (S3) and another to reproduce the phylogeny (S4).
S3 R Studio script is to be used in conjunction with dataset files S1 and S2 to reproduce model results including tables and figures.
S4 R Studio script is to be used in conjunction with dataset S1 to reproduce the phylogeny used for the main analysis. In addition to dataset S1, the phylogenetic tree from Wolff et al., 2022 is also required and can be found within the S4 folder. The finished phylogeny used in the main analysis "Lyons_et_al_2025_Phylo.tre" is also found in the S4 folder so readers do not have to run S4 code to run S3 code if they so choose. Alternatively, a figure of the phylogeny can be found in file S6.
Step by step instructions to run both scripts can be found within the script files themselves.
There are two documents, one containing plots and tables of the results (S5) and another detailing the methodology in greater detail, including literature descriptions (S6).
The S5 document contains the results of all the models, including main models 1 and 2 and supplementary models 1-5. Its purpose is to provide easy access to all the results in one place, so the reader does not have to run the S3 code to check the results of these models.
The S6 document contains a detailed description of the methodologies used for the study and data descriptions, in the form of column heading descriptions. Its purpose is to provide more detailed explanations of the methodologies for readers who want to know more about them.
Code/Software
Both R Studio script files, S3 and S4, were run in R version 4.4.2 but can likely be run in newer versions as they are released.
A basic understanding of how to use R Studio is required to run either script. Instructions are provided for each step within the scripts, but they may not always be clear to those unfamiliar with R. For S3 and S4, the main points to know are:
- How to upload the datasets. The code is provided, but some operations outside of the script are required. Instructions for these are provided within the script. Make sure you know where the required files are located on your computer and that they are easily accessed.
- How to adapt the script for your specific model runs. The script contains summaries of the results obtained during our model runs, but every time a model is run the result numbers will differ slightly. For the plots in particular, you will need to change the specific numbers in the "abline" code to the numbers you obtained during your model run. Instructions on which numbers need to be changed (e.g. intercept, body size slope) for which plots are provided within the script.
As mentioned previously, S3 R Studio script is to be used in conjunction with dataset files S1 and S2 to reproduce results including the tables and figures.
As mentioned previously, S4 R Studio script is to be used in conjunction with dataset S1 to reproduce the phylogeny used for the main analysis. In addition to dataset S1, the phylogenetic tree from Wolff et al., 2022 is also required and can be found within the S4 folder. The finished phylogeny used in the main analysis "Lyons_et_al_2025_Phylo.tre" is also found in the S4 folder so readers do not have to run S4 code to run S3 code if they so choose. A figure of the phylogeny can be found in file S6.
Access information
Other publicly accessible locations of the data:
- The body size, venom yield and LD50 data can be found in the World Spider Trait database (https://doi.org/10.57758/4A6E-QJ04).
Data was derived from the following sources:
- All data was sourced from the scientific literature through the search engines "Google Scholar" and "Web of Science". Some data was also obtained from the "World Spider Trait Database". However, all references for each datapoint can be found within the respective "ref" column for the associated trait, in both datasets S1 and S2.
Any questions you have about the supplementary files that are not answered here can be sent to the corresponding author at k.lyons7@universityofgalway.ie or keith.lyons.uog@gmail.com
To test our hypotheses, we collated data on venom potency, body size, silk use in prey capture (yes/no), venom yield, LD50 model species and natural diet from literature sources. We searched Web of Science with the keywords “LD50”, “Venom”, “Spider”, “Arachnid” and “Yield”, and followed key references and databases such as the World Spider Trait database. All data was stored and organised in Microsoft Excel (S1-S2).
The data has been processed prior to analysis. See Supplementary document S6 for a detailed description of both the methodology used and data descriptions, in the form of column heading descriptions for both dataset files, S1 (Main dataset) and S2 (length to mass conversion data).
To reproduce the results and phylogeny, see S3 and S4 (knowledge of R coding language required). To see full model outputs (results) in the form of tables and figures, see S5.
To recap: S1 and S2 are the datasets, S1 being the main dataset and S2 being separate data used to convert spider body length to spider body mass; S3 is the R script to reproduce the results; S4 is the R script and files required to reproduce the phylogeny; S5 is a Word document containing all analysis outputs (results) for both main and supplementary analyses; and S6 is a Word document explaining the methodology in greater detail, along with data descriptions.