Data from: DNA barcoding identifies cryptic animal tool materials

Cite this dataset

Neaves, Linda et al. (2021). Data from: DNA barcoding identifies cryptic animal tool materials [Dataset]. Dryad.


Some animals fashion tools and other constructions out of plant materials to aid foraging, reproduction, self-maintenance, and protection. The choice of raw materials can affect the structure and mechanical properties of the resulting artefacts, with significant fitness consequences. Documenting animals’ material preferences is challenging, however, as manufacture behaviour is often difficult to observe directly, and materials may be processed so heavily that they lack identifying features. Here, we use DNA barcoding techniques to identify, from just a few recovered tool specimens, the plant species New Caledonian crows (Corvus moneduloides) use for crafting elaborate hooked stick tools in one of our long-term study populations. The method succeeded where extensive fieldwork using conventional approaches had failed, including targeted observations, radio-tracking, bird-mounted video-cameras, and behavioural experiments with wild and temporarily captive subjects. We believe that DNA barcoding will prove useful for investigating many other tool and construction behaviours, helping to unlock significant research potential across a wide range of study systems.


In May 2018, seven crow tools collected at site-3, New Caledonia, between 2016 and 2017 were subsampled, with two subsamples taken from each tool. Subsamples were homogenized in 2 ml tubes with two tungsten beads, using either the FastPrep-24™ 5G Benchtop Homogenizer for up to one minute or the QIAGEN TissueLyser II for 6 minutes at 25 Hz. DNA extraction then followed the DNeasy Plant Mini Kit protocol (QIAGEN, Manchester, UK), except that the lysis step was extended to 60 minutes.

For species identification, we amplified a ~500 bp region of the chloroplast trnL-UAA intron [trnLc-d] and a ~600 bp nuclear ribosomal region containing the internal transcribed spacer regions ITS1 and ITS2 [ITS5p-8p]. Amplification was carried out in 20 µl reactions using approximately 100 ng of genomic DNA, 10 x reaction buffer (Bioline BIOTAQ Reagent Buffer), 30 nmol MgCl2, 4 nmol dNTPs, 7.5 pmol primers, and Bioline BIOTAQ DNA polymerase (0.5 units). Negative controls and DNA extraction blanks were included in each PCR to check for potential contamination. Thermocycling was performed on a Bio-Rad Tetrad 2 (Bio-Rad, Hamburg, Germany) under the following conditions. For trnL: initial denaturation (94 °C for 4 min), followed by 35 cycles of denaturation (94 °C for 45 s), annealing (55 °C for 45 s) and extension (72 °C for 120 s), and a final extension of 10 min at 72 °C. For ITS: initial denaturation (94 °C for 2 min), followed by 30 cycles of denaturation (94 °C for 60 s), annealing (55 °C for 60 s) and extension (72 °C for 90 s), and a final extension of 5 min at 72 °C. Two independent PCRs were carried out for each subsample. Successfully amplified PCR products were cleaned using ExoSap-IT© (USB Corporation, Cleveland, Ohio, USA). Sequencing products were resolved on an AB 3730xl Sequencer at Edinburgh Genomics (Scotland, UK). Sequences were checked and edited with reference to chromatograms using Sequencher v 5.4.1 (Gene Codes Corporation, Ann Arbor, MI, USA).
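As a quick sanity check on the reaction recipe and cycling programs above, the stated amounts can be converted to concentrations and the programmed hold times summed. This is a minimal sketch, assuming the listed amounts are per 20 µl reaction; the text does not say whether the dNTP and primer amounts are totals or per nucleotide and per primer, and thermocycler ramp times are not included. The function and variable names are illustrative only.

```python
# Reaction volume stated in the text (20 µl), expressed in litres.
REACTION_VOLUME_L = 20e-6


def molar_concentration(amount_mol, volume_l=REACTION_VOLUME_L):
    """Return the concentration in mol/l of an amount of substance in a volume."""
    return amount_mol / volume_l


def program_seconds(initial_s, cycles, step_seconds, final_s):
    """Sum programmed hold times: initial step + cycles x per-cycle steps + final step.

    Ramp times between temperatures are ignored (assumption).
    """
    return initial_s + cycles * sum(step_seconds) + final_s


# Amounts from the text, assumed to be per reaction.
mgcl2_mM = molar_concentration(30e-9) * 1e3     # 30 nmol MgCl2
dntp_mM = molar_concentration(4e-9) * 1e3       # 4 nmol dNTPs
primer_uM = molar_concentration(7.5e-12) * 1e6  # 7.5 pmol primers

# Cycling programs from the text, all times in seconds.
trnl_s = program_seconds(4 * 60, 35, (45, 45, 120), 10 * 60)
its_s = program_seconds(2 * 60, 30, (60, 60, 90), 5 * 60)

print(f"MgCl2:   {mgcl2_mM:.2f} mM")
print(f"dNTPs:   {dntp_mM:.2f} mM")
print(f"primers: {primer_uM:.3f} µM")
print(f"trnL program: {trnl_s} s ({trnl_s / 60:.1f} min)")
print(f"ITS program:  {its_s} s ({its_s / 60:.1f} min)")
```

Under these assumptions the recipe corresponds to 1.5 mM MgCl2, 0.2 mM dNTPs and 0.375 µM primers, and the programmed hold times come to 8190 s (136.5 min) for trnL and 6720 s (112 min) for ITS.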

Sequences obtained for all tool samples were lodged with GenBank under accession numbers MT366813–MT366819 (trnL) and MT366951–MT366952, MT366955–MT366959 (ITS). The resulting sequence haplotype for each region was used in BLASTn searches against the National Center for Biotechnology Information (NCBI) nonredundant nucleotide database (4 July 2018) to obtain the best 100 matches (excluding environmental samples).

Based on the putative identification from the BLASTn searches, we also collected reference leaf samples of Mimusops elengi (n = 3) and its closest locally occurring relative, Planchonella cinerea (n = 2). Leaves were collected from separate trees at or near site-3 and site-1. In March 2019, DNA extraction, PCR amplification and sequencing for the reference samples were performed exactly as described above for the tool samples. The resulting DNA sequences for the reference data were lodged with GenBank under accession numbers MT366823–MT366824 (trnL) and MT366953–MT366954 (ITS) for P. cinerea, and MT366820–MT366822 (trnL) and MT366960–MT366962 (ITS) for M. elengi. Phylogenetic trees were generated using these samples plus the top 100 BLASTn matches (excluding environmental samples). Since taxa from both Sapotaceae and Theaceae were present in the top matches for trnL, Acanthogilia gloriosa (GenBank accession EU348374), a polemonioid Ericale, was included as the outgroup for trnL. Since all the taxa in the top matches for ITS were from the Sapotaceae, subfamily Sapotoideae, Sarcosperma laurinum (GenBank accession AM408055), which has previously been shown to be sister to the rest of the family, was included as the outgroup for ITS.
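The accession ranges quoted above for the tool and reference samples can be expanded into individual accession numbers and checked against the stated sample sizes (seven tools, two P. cinerea, three M. elengi). The following is a minimal sketch; the helper `expand_accessions` and the dictionary names are hypothetical and not part of the dataset, and it assumes one sequence per sample per region.

```python
import re


def expand_accessions(spec):
    """Expand e.g. 'MT366951–MT366952, MT366955' into individual accessions."""
    accessions = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        ends = re.split(r"\s*[–-]\s*", part)  # accept en dash or hyphen
        if len(ends) == 1:
            accessions.append(ends[0])
            continue
        if len(ends) != 2:
            raise ValueError(f"cannot parse range: {part!r}")
        first = re.fullmatch(r"([A-Z]+)(\d+)", ends[0])
        last = re.fullmatch(r"([A-Z]+)(\d+)", ends[1])
        if not (first and last) or first.group(1) != last.group(1):
            raise ValueError(f"cannot parse range: {part!r}")
        prefix, width = first.group(1), len(first.group(2))
        for number in range(int(first.group(2)), int(last.group(2)) + 1):
            accessions.append(f"{prefix}{number:0{width}d}")
    return accessions


# Accession ranges as quoted in the text.
DEPOSITS = {
    ("tools", "trnL"): "MT366813–MT366819",
    ("tools", "ITS"): "MT366951–MT366952, MT366955–MT366959",
    ("P. cinerea", "trnL"): "MT366823–MT366824",
    ("P. cinerea", "ITS"): "MT366953–MT366954",
    ("M. elengi", "trnL"): "MT366820–MT366822",
    ("M. elengi", "ITS"): "MT366960–MT366962",
}

# Sample sizes as stated in the text.
EXPECTED = {"tools": 7, "P. cinerea": 2, "M. elengi": 3}

counts = {key: len(expand_accessions(spec)) for key, spec in DEPOSITS.items()}
for (sample, region), n in counts.items():
    status = "ok" if n == EXPECTED[sample] else "mismatch"
    print(f"{sample:<11} {region:<5} {n} accessions ({status})")
```

A list expanded this way can be used to look up the individual records, or to cross-check the accession numbers in the sequence headers of the Mesquite files.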

Sequences were aligned using the MUSCLE algorithm. Maximum-likelihood (ML) trees were constructed using the pml function in the R package phangorn. The most appropriate model of DNA substitution was selected by AIC using the function modelTest. We used the HKY model, with the proportion of invariant sites and rate variation optimized using the function optim.pml. Support for the branching topology was evaluated with 1,000 bootstrap replicates. Separate trees were generated for the chloroplast (trnL) and nuclear (ITS) regions.
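The bootstrap support values mentioned above rest on resampling alignment columns with replacement. The sketch below illustrates only that resampling step in plain Python. It is not the authors' R code and does not use phangorn, and the three-taxon alignment is an invented toy example rather than data from this dataset.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns drawn with replacement.

    The same column indices are applied to every sequence, so rows stay aligned.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment (not real trnL or ITS sequences).
toy_alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTTCGTAC",
    "taxon_C": "ACCTACGAAC",
}

rng = random.Random(1)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]

print(len(replicates), "replicates of", len(toy_alignment["taxon_A"]), "sites each")
```

In a full analysis, a tree is re-estimated from each replicate, and the support for a clade is the fraction of replicate trees that contain it.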

Usage notes

See the README file for usage. Data are in Mesquite format and include alignments and a consensus tree. There is one Mesquite file per DNA barcode. All GenBank numbers are listed in the sequence headers and branch labels.


Leverhulme Trust, Award: RPG-2015-273

Biotechnology and Biological Sciences Research Council, Award: G023913/1

Biotechnology and Biological Sciences Research Council, Award: G023913/2