Deciphering the diet of a wandering spider (Phoneutria boliviensis; Araneae: Ctenidae) by DNA metabarcoding of gut contents

Cite this dataset

Prada Quiroqa, Carlos Fernando (2022). Deciphering the diet of a wandering spider (Phoneutria boliviensis; Araneae: Ctenidae) by DNA metabarcoding of gut contents [Dataset]. Dryad.


Arachnids are the most abundant land predators. Despite the importance of their functional roles as predators and the need to understand their diet for conservation, the trophic ecology of many arachnid species has not been sufficiently studied. In the case of the wandering spider, Phoneutria boliviensis F. O. Pickard-Cambridge, 1897, only field and laboratory observational studies on its diet exist. Using a DNA metabarcoding approach, we compared the prey found in the gut contents of males and females from three distant Colombian populations of P. boliviensis. By DNA metabarcoding of the cytochrome c oxidase subunit I (COI), we detected and identified 234 prey items (individuals captured by the spider) belonging to 96 operational taxonomic units (OTUs) as prey for this wandering predator. Our results broaden the known diet of P. boliviensis with at least 75 prey taxa not previously registered in fieldwork or laboratory experimental trials. These results suggest that P. boliviensis feeds predominantly on invertebrates (Diptera, Lepidoptera, Coleoptera and Orthoptera) and opportunistically on small squamates. Intersex and interpopulation differences were also observed. Assuming that prey preference does not vary between populations, these differences are likely associated with a higher local prey availability. Finally, we suggest that DNA metabarcoding can be used for evaluating subtle differences in the diet of distinct populations of P. boliviensis, particularly when predation records in the field cannot be established or quantified using direct observation.


1. Collection and locations

Sixty adult specimens of P. boliviensis were used for DNA metabarcoding of the entire gut contents (Figure 1). We used twenty individuals (ten females and ten males) from each of three Colombian localities separated by approximately 300 km: Barbosa (Antioquia; 6°40'54.7''N, 75°41'10.4''W), Oporapa (Huila; 2°01'40.5''N, 75°59'43''W) and Ibagué (Tolima; 4°32'22.3''N, 75°05'37.1''W). Spiders were collected in July (Barbosa), August (Oporapa) and September (Ibagué) 2019. Localities were selected based on previous distribution reports for the species in Colombia and on their accessibility.
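For readers who want to reuse the locality coordinates, the following is a minimal sketch of converting the degree-minute-second values above to decimal degrees and estimating great-circle distances between localities. The function names, the `SITES` dictionary and the spherical Earth radius of 6371 km are assumptions made for illustration and are not part of the dataset.

```python
import math

def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention.
    return -value if hemisphere in ("S", "W") else value

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km on a sphere (radius is an assumption)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# Coordinates as given in the text; the dictionary itself is illustrative.
SITES = {
    "Barbosa": (dms_to_decimal(6, 40, 54.7, "N"), dms_to_decimal(75, 41, 10.4, "W")),
    "Oporapa": (dms_to_decimal(2, 1, 40.5, "N"), dms_to_decimal(75, 59, 43.0, "W")),
    "Ibague": (dms_to_decimal(4, 32, 22.3, "N"), dms_to_decimal(75, 5, 37.1, "W")),
}

if __name__ == "__main__":
    names = sorted(SITES)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            print(a, b, round(haversine_km(*SITES[a], *SITES[b]), 1), "km")
```

The distances are straight-line estimates on a sphere and ignore terrain, so they should be read only as a rough check on the spacing between localities.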

For each individual, elevation, temperature, relative humidity and mass (g) were recorded (Supplementary Table S1). At each locality, in addition to the 20 specimens, three more individuals were taken for standardization of the DNA metabarcoding technique. After being euthanized by freezing, the collected individuals were stored in 96% alcohol in separate Falcon centrifuge tubes and transported to the Biology Laboratory of the University of Ibagué (Ibagué, Colombia). The specimens were then washed with 96% alcohol to remove impurities, and the distal parts of the legs (tarsus and metatarsus) were removed.


2. Preliminary assays and sample processing

The blocking primers were developed following the protocol of Lafage et al. (2020). Mitochondrial cytochrome c oxidase subunit I (COI) sequences for 230 entries of Ctenidae spiders were downloaded from the BOLD database and clustered using the 'PrimerMiner' package v0.18 (Elbrecht & Leese, 2017). Sequences were aligned in Geneious 8.1.7 (Kearse et al., 2012) using MAFFT v7.017 (Katoh, Misawa, Kuma, & Miyata, 2002). PrimerMiner’s “selectivetrim” function was used to trim the alignment to the dgHCO and mlCOIntF (Leray et al., 2013) binding sites, and the alignment for each group was plotted with PrimerMiner to identify suitable primer binding sites visually. Sites conserved among the target prey taxa (Hexapoda) but differing in Phoneutria sequences were selected. The sequence designs of the blocking primers were as follows:


Since dissection of the highly diverticulated gut is difficult, we performed three preliminary assays to determine which portion of the body contained the greatest proportion of prey DNA (prosoma, prosoma+opisthosoma, or the entire individual except tarsus and metatarsus). For each assay, three spiders, one from each sampling location, were used. Each sample was homogenized individually, and DNA was extracted using the Qiagen DNeasy Tissue kit (Qiagen, Hilden, Germany) following the manufacturer's instructions.

PCR reactions were carried out in 25 µL reaction volumes containing 2 µL of DNA extract at equal DNA concentrations (39 ng/µL), 12.5 µL of MyTaq mastermix (Bioline, Memphis, Tennessee) and 2.5 µM of each primer (dgHCO and mlCOIntF; Leray et al., 2013) to amplify the COI region. Four annealing temperatures (40, 40.3, 40.9 and 48 °C), previously shown to work well for COI (Lafage et al., 2020), were tested in the preliminary assays; the optimum was determined to be 48 °C (Figure S1). Thermocycler conditions were: initial denaturation at 95 °C for 15 min; 30 cycles of 30 s at 94 °C, 90 s at 48 °C and 90 s at 72 °C; and a final extension at 72 °C for 10 min. Positive amplifications were confirmed by visual inspection of PCR products on 2% agarose gels. PCR products were purified using ExoSAP-IT™ PCR Product Cleanup Reagent (Thermo Fisher Scientific, Massachusetts, US), and the DNA concentration of the cleaned products was determined using a Qubit fluorometer (Thermo Fisher, Massachusetts, US). Purified PCR products (positive samples with dgHCO/mlCOIntF primers) were then Sanger sequenced, and the resulting sequences were processed using the sangeranalyse R package (v. 0.1).

After an initial PCR with Illumina-adapted primers (Lafage et al., 2020), we performed a second PCR with Illumina Nextera indices and determined the DNA concentrations of the amplicons using Qubit. Amplicons were then pooled in equimolar amounts (100 ng each). The resulting libraries were sequenced on an Illumina MiSeq (v3 chemistry, 2x300 bp cycle kit with 5% PhiX spike-in) by AIM (Advanced Identification Methods GmbH, Munich, Germany) following standard protocols (Kress & Erickson, 2007; Sang, Crawford, & Stuessy, 1997).
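The pooling step reduces to simple arithmetic: the volume to take from each amplicon is the target mass divided by its measured concentration. The sketch below illustrates this under the assumption that concentrations are Qubit readings in ng/µL. The function name and the example readings are hypothetical and do not come from the dataset.

```python
def pooling_volume_ul(concentration_ng_per_ul, target_ng=100.0):
    """Volume (µL) of an amplicon needed to contribute target_ng to the pool."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / concentration_ng_per_ul

if __name__ == "__main__":
    # Hypothetical Qubit readings (ng/µL) for three indexed amplicons.
    readings = {"sample_01": 25.0, "sample_02": 40.0, "sample_03": 12.5}
    for name, conc in readings.items():
        print(name, pooling_volume_ul(conc), "µL")
```

Pooling equal masses approximates equimolar pooling only when all amplicons have similar lengths, which is a reasonable expectation here because every library targets the same COI fragment.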


3. DNA metabarcoding diet analysis

Preliminary analyses with the blocking primers showed that the optimal annealing temperature was 48 °C (Figure S1). We also found that, of the three body regions tested, the prosoma+opisthosoma contained the highest relative abundance of prey sequences. We used these conditions to process the 60 P. boliviensis samples from the three sampled locations for metabarcoding.

Based on the results of our preliminary assays, we extracted DNA from the prosoma and opisthosoma of each specimen. Metabarcoding was performed independently for each of the 60 samples on the Illumina platform using the noSPI/dgHCO1 blocking primers designed during the preliminary assays. The raw Illumina data were processed by first merging paired-end reads with -fastq_mergepairs (default settings); cutadapt 1.18 (Kechin, Boyarskikh, Kel, & Filipenko, 2017), run under Python 2.7.15 with default settings, was then used to remove tags and primers and obtain the filtered reads. Sequences shorter than 300 bp were eliminated using FastQC version 0.11.8 and VSEARCH 2.9.1 (de Sena Brandine & Smith, 2019). In addition, singleton and chimeric sequences were filtered using VSEARCH 2.9.1 (Rognes, Flouri, Nichols, Quince, & Mahé, 2016) at a maximum expected error of 1 to generate the final FASTQ files for each sample, following the protocols proposed by Leidenfrost et al. (2020) and Liu, Clarke, Baker, Jordan, & Burridge (2020). VSEARCH 2.9.1 was also used to dereplicate and cluster the sequences and to assign them to operational taxonomic units (OTUs) in FASTA files, with 98% identity as the threshold. All sequences were then matched against the OTUs with usearch_global to create a consensus OTU table. Of a total of 2,410,269 initial reads, 358,054 paired-end reads remained after quality filtering.
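To make the logic of three of these filtering steps concrete (length filtering, dereplication and singleton removal), here is a minimal pure-Python sketch. It is not the VSEARCH implementation used in the study and it omits read merging, chimera detection, expected-error filtering and clustering. The function name and the toy reads are assumptions for illustration only.

```python
from collections import Counter

MIN_LENGTH = 300  # bp; sequences shorter than this are discarded, as in the text

def filter_and_dereplicate(reads, min_length=MIN_LENGTH):
    """Return {sequence: abundance} for reads that pass the length filter,
    after collapsing identical sequences and dropping singletons."""
    # 1) Length filter: keep reads with at least min_length bases.
    kept = [read.upper() for read in reads if len(read) >= min_length]
    # 2) Dereplication: count identical sequences.
    abundance = Counter(kept)
    # 3) Singleton removal: drop sequences observed only once.
    return {seq: n for seq, n in abundance.items() if n > 1}

if __name__ == "__main__":
    # Toy reads (hypothetical): two identical long reads, one unique long
    # read and one read that is too short.
    toy_reads = ["A" * 300, "a" * 300, "C" * 300, "G" * 299]
    for seq, n in filter_and_dereplicate(toy_reads).items():
        print(seq[:10] + "...", n)
```

Removing singletons after dereplication is a common way to discard sequences that are more likely to be sequencing errors than real prey templates, at the cost of losing very rare prey.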

Sequences were searched against the complete sequence database of the Barcode of Life Data systems (BOLD) using the BOLD Identification Engine (Ratnasingham & Hebert, 2007) in order to find the closest matches. Taxon nomenclature follows the catalogue used in the BOLD and NCBI databases (accessed in March 2020). When conflicting taxonomic assignments appeared in the database, we took the lowest non-conflicting taxonomic level indicated by the BOLD search (Federhen, 2012). Based on the FASTQ files for each individual, 256 OTUs were identified.
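One way to read the rule of taking the lowest non-conflicting taxonomic level is as a walk down the ranks that stops at the first disagreement among the candidate matches. The sketch below shows that reading. The rank list, the function name and the example hits are assumptions for illustration and do not reproduce the BOLD Identification Engine's behaviour.

```python
RANKS = ("order", "family", "genus", "species")  # from highest to lowest

def lowest_consistent_rank(lineages):
    """Given candidate lineages (dicts mapping rank -> name), return the
    (rank, name) pair for the lowest rank on which all candidates agree,
    or None if they already disagree, or lack a name, at the highest rank."""
    result = None
    for rank in RANKS:
        names = {lineage.get(rank) for lineage in lineages}
        # Stop at the first rank with a conflict or a missing name.
        if len(names) != 1 or None in names:
            break
        result = (rank, names.pop())
    return result

if __name__ == "__main__":
    # Hypothetical conflicting hits for one OTU (not taken from the dataset).
    hits = [
        {"order": "Diptera", "family": "Muscidae", "genus": "Musca", "species": "Musca domestica"},
        {"order": "Diptera", "family": "Muscidae", "genus": "Musca", "species": "Musca autumnalis"},
    ]
    print(lowest_consistent_rank(hits))
```

In this example the two hits disagree only at species level, so the OTU would be reported at genus level.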

Different filters were then applied to these OTUs according to standard exclusion criteria for this technique (Deagle et al., 2019; Lafage et al., 2020). Sequences with the following characteristics were eliminated: a) reads representing fewer than 0.01% of the total number of reads per sample; b) sequences that corresponded to environmental DNA or intestinal microbiota; and c) OTUs whose genus- or family-level matches in the BOLD and NCBI GenBank databases corresponded to taxa outside the geographical distribution of P. boliviensis. Once these filters were applied, OTUs with ≥ 97% identity to reference sequences were identified at the species level, those with ≥ 95% at the genus level, those with ≥ 90% at the family level, and those with ≥ 75% at the order level. Additionally, BINs (Barcode Index Numbers) were used to identify sequence clusters within the database; BINs correlate with species in 98% of all cases (Lafage et al., 2020). After OTU filtering, a total of 105,583 sequences were retained, corresponding to 96 OTUs. So as not to underestimate the total reads, the filters were applied to each sample separately (as a percentage of reads per sample). The number of reads for the sixty individuals is summarized in Table S3.
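The per-sample abundance filter and the identity thresholds described above can be sketched as two small functions. This is an illustrative reading of the stated criteria, not the authors' script. The function names and the example counts are hypothetical.

```python
# Identity thresholds (%) and the taxonomic level assigned, as stated in the text.
RANK_THRESHOLDS = ((97.0, "species"), (95.0, "genus"), (90.0, "family"), (75.0, "order"))

def drop_rare_otus(sample_counts, min_fraction=0.0001):
    """Remove OTUs representing fewer than min_fraction (0.01%) of the
    reads in a single sample. sample_counts maps OTU id -> read count."""
    total = sum(sample_counts.values())
    if total == 0:
        return {}
    return {otu: n for otu, n in sample_counts.items() if n / total >= min_fraction}

def rank_for_identity(identity_percent):
    """Return the taxonomic level supported by a percent identity,
    or None if the match is below the order-level threshold."""
    for threshold, rank in RANK_THRESHOLDS:
        if identity_percent >= threshold:
            return rank
    return None

if __name__ == "__main__":
    # Hypothetical read counts for one spider (not taken from Table S3).
    counts = {"OTU_1": 99990, "OTU_2": 9, "OTU_3": 1}
    print(drop_rare_otus(counts))
    for identity in (99.2, 96.0, 91.5, 80.0, 70.0):
        print(identity, rank_for_identity(identity))
```

Applying the abundance filter within each sample, rather than across the pooled dataset, keeps prey that are rare overall but well represented in an individual spider.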

Usage notes

This dataset includes the tables and figures for the publication as well as the raw data.