Data, code and data plots of the study reported in the article:

Wolff JO, Wierucka K, Uhl G, Herberstein ME (2021). Building behaviour does not drive rates of phenotypic evolution in spiders. Proceedings of the National Academy of Sciences 118 (33) e2102693118; https://doi.org/10.1073/pnas.2102693118

Corresponding author: Jonas Wolff, Department of Biological Sciences, Macquarie University, Sydney, Australia; jonas.wolff@mq.edu.au

1. Brief description

This data set contains raw data tables, scripts and supplemental figures supporting the article Wolff et al. (2021, PNAS 118: e2102693118). In our study we assembled morphometric and ecological trait data of spiders from the literature and from de novo measurements and observations, and used these data to infer the rates of morphological change over deep time in relation to web building behaviour.

2. List of files

Deposited on Dryad:
Dataset_ecological-data-raw.xlsx: Referenced raw data of ecological traits (incl. taxonomic accuracy of observations).
Dataset_morphometric-data-raw.xlsx: Referenced raw data of morphometric measurements.
Dataset_combined-trait-matrix.csv: Combined matrix of calculated morphological and interpreted ecological traits used for model calculation.

Deposited on Zenodo:
ESM-code.zip: Input files and code to reproduce the analyses of the study.
supplemental-figures.pdf: Additional plots of the results of the sensitivity analyses performed to test the robustness of the results towards differences in prior specifics, taxon sampling and trait inclusion (for details see methods and original article).

3. Methods

Trait database

We built a database of morphometric and ecological data on a representative taxon sample of the order Araneae. We followed the taxon sample of the Araneae Tree of Life project (AToL) (2), which contains 932 terminals from all families valid at that time except Synaphridae (the sample corresponds to ~2% of described species).
This sample is representative of the phylogenetic and morphological diversity of spiders. AToL terminals that were not identified to species level and for which no image material was available were replaced with described species with a type locality close to the collection site (26.3% of the used sample; details in main article (1)). 11% of the AToL terminals were omitted because there was not enough information to determine a suitable replacement, resulting in a total of 828 included species.

The morphological data were assembled by extracting data from taxonomic descriptions using the WSC database (3), and by taking measurements on images published in articles or online repositories (including Morphbank :: Biological Imaging, http://www.morphbank.net, where images of AToL specimens were deposited), with one to seven sources combined per species (for statistics of used sources see main article (1), and for a list see file “Dataset_morphometric-data-raw.xlsx”). As many spiders exhibit a significant sexual dimorphism, only data of adult females were used. We included only general traits, i.e., ones that were assumed to be affected by more than one niche property. For instance, body shape may be under selection from a mix of abiotic (e.g., temperature and microhabitat structure) and biotic (e.g., prey spectrum and predation) factors.

The following measurements were recorded: body length; cephalothorax (prosoma) length; cephalothorax width; height of cephalothorax (carapace); length of mouth parts (i.e., cheliceral base segment); diameter of each eye type; length of front leg (excl. coxa, trochanter and pretarsus).
From these the following six traits were calculated:
(1) body size (= body length);
(2) body shape (cephalothorax width / cephalothorax length);
(3) relative cephalothorax height (cephalothorax height / (cephalothorax length + width));
(4) size of mouth parts (paturon length / cephalothorax height);
(5) eye size (sum of diameters of all eye types / cephalothorax width);
(6) relative leg length (length of front leg / cephalothorax width).

For each trait the species mean was calculated (i.e., from the 1-7 data sources; for details see main article (1) and file “Dataset_morphometric-data-raw.xlsx”) and log-transformed to build the species matrix for further analysis (file “Dataset_combined-trait-matrix.csv”).

The ecological data matrix was built by assessing the literature on the same or closely related species, in a few cases complemented by personal observations (for details, see notes in file “Dataset_ecological-data-raw.xlsx”). We used a binary coded category: state 0, non-builder; state 1, builder. We defined a species as a ‘builder’ (1) if individuals spend most of their life in a self-constructed web or burrow, i.e. foraging and reproduction take place on, in or from the artefact, and the artefact aids in prey capture, signalling and/or defence. In contrast, a ‘non-builder’ (0) does not build a capture web or a burrow; it may build a retreat, which, however, is only used in periods of inactivity and does not aid in prey capture.

Test of evolutionary hypothesis

To infer state-dependent evolutionary rates, the recent MuSSCRat (multiple state-specific rates of continuous-character evolution) approach was used (4). MuSSCRat is a reversible-jump Markov chain Monte Carlo (rjMCMC) approach to determine the likelihood of a model where a discrete (ecological) trait correlates with the evolutionary rates of one or more continuous traits vs. a model where no such relationship exists.
In both models, it is assumed that there are alternative (background) effects on rate variation, which is implemented by partitioning the inferred global rates into a state-dependent and a background rate domain (for details, see Burress et al. (5)). This avoids the pitfall of traditional approaches where any difference in the evolutionary mode between two ecological groups is attributed to the trait state (known as the ‘straw-man argument’ problem (4)).

The analysis was built in RevBayes (6), based on the code by Burress et al. (5), with slight modifications (scripts and input files are found in the file “ESM-code.zip”). As the phylogenetic model we used a time-calibrated ultrametric tree including all AToL terminals (7). The evolution of the discrete character followed an Mk model (8), with the rate parameter λ drawn from a log-uniform distribution (a = 0.0001, b = 1). For the two multivariate datasets an LKJ prior with η = 1.0 and dim = n (number of characters) was used for the partial correlation matrix. For the background-rates model the assumption that characters evolve at a constant rate was relaxed by applying a relaxed local clock prior, as described in Burress et al. (5). The prior for the expected number of rate changes was set equal to the number of transitions between the states of the discrete trait. This number was inferred a priori by ancestral character estimation in phytools using ER and ARD models with the stochastic character mapping approach with 100 iterations (9). To sample effectively between the state-dependent and the state-independent model, a weight of ten was set on the reversible-jump proposal.

Two chains were each run for 500,000 generations for (a) a body size only dataset with 815 species, (b) a body size + body shape dataset with 749 species, and (c) a dataset containing all six traits for 340 species (traits as defined under “Trait database” above). Parameters were sampled every 10th generation.
The combined log files of both chains were analyzed in Tracer 1.7.1 (10) to assess convergence and summarize parameters, excluding a burn-in of 10%. The posterior probability of H1 was computed as the fraction of MCMC samples for which ζ(1) > ζ(0). The posterior probability of H2 was computed as the fraction of MCMC samples for which ζ(1) < ζ(0). Bayes factors (BF) were calculated by dividing the posterior odds (the ratio of posterior probabilities of the competing models) by the corresponding prior odds. BFs were interpreted using the guidelines in Kass and Raftery (i.e., evidence against competing hypotheses if BF > 3; strong evidence if BF > 20) (11). Branch-specific rates were visualized in RevGadgets (https://github.com/revbayes/RevGadgets).

Sensitivity analysis

To test the sensitivity of the analysis towards the prior on the expected number of rate shifts, we ran additional tests using an expected number of shifts of 0.1 and 0.01 times the number of branches, with chains of 100,000 generations each. The effects were similar to those shown in Burress et al. (5), i.e. the state-dependent evolutionary rates ζ, the posterior evolutionary rate at the root β2R, the rate of building behaviour λ and the number of state changes were consistent across different priors. There was an effect on the posterior number of rate changes, but this did not affect the locations of major rate shifts. The posterior probability of the different hypotheses H0, H1 and H2 was slightly affected by the prior on the expected number of rate shifts: H2 tended to have stronger support for a very low prior number of rate shifts. In the case of the body length only and the 6-traits datasets, this had no effect on the significance category after Kass and Raftery (11).
In the case of the 2-traits dataset, there was weak support for H2 (BF(H0) = 0.44; BF(H2) = 6.70) at a prior of 0.01 times the number of branches, while for a prior of 0.1 times the number of branches there was weak support for H0 (BF(H0) = 5.35; BF(H2) = 0.48). Notably, in all cases, there were only slight effects on the magnitude of state-dependent rates, the locations of major rate shifts were consistent across priors, and only a minor fraction of the rate variation was explained by state-dependent effects. We therefore conclude that our reported results are robust. Visual representations of the results from the sensitivity analyses can be found in the file “supplemental-figures.pdf”.

Effect of data sample on the results

We performed three separate analyses based on the three datasets differing in the number of species and traits. The results showed some effects of the data sample on the posterior probability of hypotheses H0 and H2 (as reported in the main text and in the section above). Using the 6-traits dataset led to higher support for H2 than using either of the other two datasets, which included more than twice as many species but fewer traits. To test whether the taxon sample of the 6-traits dataset is biased towards H2, we ran two additional chains over 100,000 generations with this dataset, but only including body length. Results barely differed from the results obtained with the larger body length-only dataset (analysis of body length evolution with taxon sample of 6-traits dataset: BF(H0) = 6.50; BF(H1+H2) = 0.15; BF(H1) = 0.32; BF(H2) = 0.11; with taxon sample of body length-only dataset: BF(H0) = 7.56; BF(H1+H2) = 0.13; BF(H1) = 0.10; BF(H2) = 0.28), indicating that the reduced taxon sample of the 6-traits dataset is not biased towards H2. This suggests that the observed differences in hypothesis support between datasets are explained by the different number of included traits rather than the different number of included taxa.
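
The way posterior probabilities and Bayes factors are derived from the MCMC samples (fraction of samples supporting a hypothesis, then posterior odds divided by prior odds) can be sketched as follows. This is a minimal illustration and not the code in “ESM-code.zip”: the function names are hypothetical, the prior probabilities used in the example (0.5 for H0, 0.25 each for H1 and H2) are assumptions for illustration, and treating samples with equal rates as support for H0 is an assumption of this sketch.

```python
from typing import Sequence


def posterior_probability(zeta0: Sequence[float], zeta1: Sequence[float],
                          hypothesis: str) -> float:
    """Fraction of MCMC samples supporting a hypothesis.

    H1: zeta(1) > zeta(0); H2: zeta(1) < zeta(0).
    H0 is counted here as zeta(1) == zeta(0) (assumption of this sketch).
    """
    if len(zeta0) != len(zeta1) or len(zeta0) == 0:
        raise ValueError("need two non-empty sample vectors of equal length")
    tests = {
        "H0": lambda a, b: b == a,
        "H1": lambda a, b: b > a,
        "H2": lambda a, b: b < a,
    }
    test = tests[hypothesis]
    hits = sum(1 for a, b in zip(zeta0, zeta1) if test(a, b))
    return hits / len(zeta0)


def bayes_factor(posterior: float, prior: float) -> float:
    """Posterior odds divided by prior odds."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior probability must lie strictly between 0 and 1")
    if posterior >= 1.0:
        return float("inf")
    posterior_odds = posterior / (1.0 - posterior)
    prior_odds = prior / (1.0 - prior)
    return posterior_odds / prior_odds


def support_category(bf: float) -> str:
    """Thresholds as quoted in the methods: BF > 3 evidence, BF > 20 strong."""
    if bf > 20:
        return "strong evidence"
    if bf > 3:
        return "evidence"
    return "no clear support"


# Toy samples (hypothetical values, burn-in assumed to be removed already).
zeta0 = [1.0, 1.0, 1.0, 1.0, 0.8, 1.2, 1.0, 1.0, 1.0, 1.0]
zeta1 = [1.0, 1.0, 1.0, 1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.0]

for name, prior in (("H0", 0.5), ("H1", 0.25), ("H2", 0.25)):
    p = posterior_probability(zeta0, zeta1, name)
    bf = bayes_factor(p, prior)
    print(name, round(p, 3), round(bf, 3), support_category(bf))
```

Keeping the prior probability as an explicit argument makes it visible that a Bayes factor depends on the prior odds of the hypothesis being tested, not only on the share of samples that support it.
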
We thus cannot rule out that H2 might become more likely for some specific combinations of traits not tested here. However, as none of the tested cases (i.e., increasing either taxon sampling or trait number) yielded strong support for H1 or H2, we are confident that our conclusions are robust and can be generalized. We note that the 6-traits dataset includes descriptors of all major body parts (cephalothorax, abdomen, legs, mouth parts, and eyes) and thus describes spider gross morphology well. We argue that if niche construction has a significant effect on phenotypic evolution, it should leave its trace in such general descriptors and not only in very specific traits directly associated with the building behaviour (such as the size of silk glands). Comparison of the branch-specific, state-dependent, and global rates showed that if H2 is accepted, it only explains a very small fraction of rate variation. Rate shifts and changes (losses or gains) in building behaviour were almost uncorrelated. This was consistent across datasets.

Method References

1 Wolff, J. O., Wierucka, K., Uhl, G. & Herberstein, M. E. Building behavior does not drive rates of phenotypic evolution in spiders. Proc Natl Acad Sci 118, e2102693118 (2021).
2 Wheeler, W. C. et al. The spider tree of life: phylogeny of Araneae based on target‐gene analyses from an extensive taxon sampling. Cladistics 33, 574-616 (2017).
3 Nentwig, W., Gloor, D. & Kropf, C. Taxonomic database: Spider taxonomists catch data on web. Nature 528, 479 (2015).
4 May, M. R. & Moore, B. R. A Bayesian Approach for Inferring the Impact of a Discrete Character on Rates of Continuous-Character Evolution in the Presence of Background-Rate Variation. Syst Biol 69, 530-544 (2020).
5 Burress, E. D., Martinez, C. M. & Wainwright, P. C. Decoupled jaws promote trophic diversity in cichlid fishes. Evolution 74, 950-961 (2020).
6 Höhna, S. et al. RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Syst Biol 65, 726-736 (2016).
7 Fernández, R. et al. Phylogenomics, diversification dynamics, and comparative transcriptomics across the spider tree of life. Curr Biol 28, 1489-1497 (2018).
8 Lewis, P. O. A likelihood approach to estimating phylogeny from discrete morphological character data. Syst Biol 50, 913-925 (2001).
9 Revell, L. J. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol Evol 3, 217-223 (2012).
10 Rambaut, A., Drummond, A. J., Xie, D., Baele, G. & Suchard, M. A. Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Syst Biol 67, 901 (2018).
11 Kass, R. E. & Raftery, A. E. Bayes factors. J Am Stat Assoc 90, 773-795 (1995).

4. Description of files

Dataset_ecological-data-raw.xlsx:
This spreadsheet contains the raw data of ecological traits of spider species (i.e. web building behaviour) with the corresponding references.
terminal - the composed name used for further processing
taxonomic reference (WSC LSID) - the World Spider Catalog (https://wsc.nmbe.ch/) unique identifier of the species (if identified)
builder (state) - binary coding of web building behaviour (0, no capture web or permanent burrow; 1, capture web or permanent burrow; for details see methods)
species information - an 'x' means this information is based on observations on this specific species
genus information - an 'x' means this information is based on observations on representatives of this specific genus
sub-family or tribe - an 'x' means this information is based on observations on representatives of this specific sub-family or tribe
family information - an 'x' means this information is based on observations on representatives of this specific family
Remarks - further remarks, e.g.
on the accuracy of information or interpretation of silken structures as 'web'
References - references for the information used to code 'builder'

Dataset_morphometric-data-raw.xlsx:
This spreadsheet contains the raw data of morphometric measurements. These were either taken from literature (see reference column) or determined from micrographs or macro-photos. For each species one or more records were included. Each row represents a record (i.e. one reference). Empty cells mean that the corresponding trait was not measured.
terminal - the composed name used for further processing
taxonomic reference (WSC LSID) - the World Spider Catalog (https://wsc.nmbe.ch/) unique identifier of the species (if identified)
body_length - body length (excluding chelicerae and spinnerets) in mm; the mean if based on multiple individuals
body_length_min - minimal body length for records based on multiple individuals
body_length_max - maximal body length for records based on multiple individuals
cephalothorax_length - length of cephalothorax (prosoma) in mm
cephalothorax_width - width of cephalothorax (prosoma) in mm
cephalothorax_height - height of cephalothorax (prosoma) in mm
paturon_length - length of cheliceral base segment (paturon) in mm
AME_diameter - diameter of anterior median eye in mm
ALE_diameter - diameter of anterior lateral eye in mm
PME_diameter - diameter of posterior median eye in mm
PLE_diameter - diameter of posterior lateral eye in mm
L1 - total length of anterior leg (front leg) in mm, excl. coxa, trochanter and pretarsus
L1_femur - total length of the L1 femur between condyles in mm
L1_patella - total length of the L1 patella between condyles in mm
L1_tibia - total length of the L1 tibia between condyles in mm
L1_metatarsus - total length of the L1 metatarsus between condyles in mm
L1_tarsus - total length of the L1 tarsus between condyles in mm
N - number of individuals measured (if reported)
locality - locality where specimens were collected (if reported)
References - reference from which trait information was retrieved
notes - notes on trait data (e.g. if converted from different unit, trait calculated from other measurements or relative measurements were used)
image_refs - author and year of images used for measurements

Dataset_combined-trait-matrix.csv:
This is the combined matrix of calculated and log-transformed morphological and interpreted ecological traits used as input for our analyses.
terminal - the composed tip name used for further processing
body_size - body length (log transformed)
body_shape - cephalothorax width / cephalothorax length (log transformed)
relative_cephalothorax_height - cephalothorax height / (cephalothorax length + width) (log transformed)
size_of_mouth_parts - paturon length / cephalothorax height (log transformed)
eye_size - sum of diameters of all eye types / cephalothorax width (log transformed)
relative_leg_length - length of front leg / cephalothorax width (log transformed)
builder - binary coding of web building behaviour (from the ecological data spreadsheet)

Deposited on Zenodo:
ESM-code.zip: Input files and code to reproduce the analyses of the study.
supplemental-figures.pdf: Additional plots of the results of the sensitivity analyses performed to test the robustness of the results towards differences in prior specifics, taxon sampling and trait inclusion (for details see methods and original article).
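
As a rough guide to how the columns of Dataset_combined-trait-matrix.csv relate to the raw measurement columns of Dataset_morphometric-data-raw.xlsx, the calculation can be sketched as below. This is a minimal sketch and not the authors' processing code. The function names are hypothetical, the natural logarithm is an assumption (the base of the log transformation is not stated here), and the sketch assumes that eye size is only computed when all four eye diameters are present, which may differ from how species with fewer eyes were handled in the study.

```python
import math
from statistics import mean
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]  # one row (one reference) of the raw table

EYE_COLUMNS = ("AME_diameter", "ALE_diameter", "PME_diameter", "PLE_diameter")


def _ratio(num: Optional[float], den: Optional[float]) -> Optional[float]:
    """Return num / den, or None if either measurement is missing or den is 0."""
    if num is None or den is None or den == 0:
        return None
    return num / den


def record_traits(rec: Record) -> Dict[str, Optional[float]]:
    """Untransformed trait values for a single record (empty cells become None)."""
    cl = rec.get("cephalothorax_length")
    cw = rec.get("cephalothorax_width")
    ch = rec.get("cephalothorax_height")
    eyes = [rec.get(col) for col in EYE_COLUMNS]
    # Assumption: all four eye diameters are required for the eye size trait.
    eye_sum = sum(eyes) if all(e is not None for e in eyes) else None
    length_plus_width = cl + cw if cl is not None and cw is not None else None
    return {
        "body_size": rec.get("body_length"),
        "body_shape": _ratio(cw, cl),
        "relative_cephalothorax_height": _ratio(ch, length_plus_width),
        "size_of_mouth_parts": _ratio(rec.get("paturon_length"), ch),
        "eye_size": _ratio(eye_sum, cw),
        "relative_leg_length": _ratio(rec.get("L1"), cw),
    }


def species_traits(records: List[Record]) -> Dict[str, Optional[float]]:
    """Species mean of each trait across records, then log-transformed.

    Natural log is an assumption of this sketch; values must be positive.
    """
    if not records:
        raise ValueError("at least one record is required")
    per_record = [record_traits(r) for r in records]
    result: Dict[str, Optional[float]] = {}
    for trait in per_record[0]:
        values = [t[trait] for t in per_record if t[trait] is not None]
        result[trait] = math.log(mean(values)) if values else None
    return result


# Hypothetical example: two records for one species, the second with body length only.
example_records = [
    {"body_length": 10.0, "cephalothorax_length": 4.0, "cephalothorax_width": 3.0,
     "cephalothorax_height": 2.0, "paturon_length": 1.0, "AME_diameter": 0.2,
     "ALE_diameter": 0.1, "PME_diameter": 0.1, "PLE_diameter": 0.2, "L1": 12.0},
    {"body_length": 12.0},
]
example_traits = species_traits(example_records)
print(example_traits)
```

Computing the ratios per record before averaging follows the description that the species mean was taken for each trait across the 1-7 data sources; averaging the raw measurements first would generally give slightly different values.
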