All files included are in FASTA format (.fas). File names follow the pattern Family_Locus_Publication.fas, giving the family studied, the name of the locus, and the publication in which the sequences were originally published. Each file contains previously published sequences retrieved from GenBank. These files were used to compute the percentage of sequences successfully assigned to species with the software BRONX (Little, 2011).

Two files are provided for each family: one with plastid sequences (trnH-psbA, psbK-psbI, trnT-trnL, psbB-T-N, rpoB, matK, or rbcL) and one with nuclear sequences (usually ITS, sometimes ITS2). In all cases, the two files contain sequences obtained from the same individuals for the two loci, and only individuals identified by a voucher in GenBank are included. In most cases, the two files therefore contain the same number of sequences. In some cases, several nuclear sequences were obtained for some individuals, so the number of sequences differs between the two files.

Little DP (2011) DNA barcode sequence identification incorporating taxonomic hierarchy and within taxon variability. PLoS One, 6(8), e20552.
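The naming convention and the note on sequence counts can be checked with a short script. The following is a minimal Python sketch, not part of the original data package. It assumes that the family and locus fields contain no underscore, so that the first two underscores separate the three fields. The file name and the sequence records used here are hypothetical and serve only as an illustration.

```python
from pathlib import Path


def parse_name(filename):
    """Split 'Family_Locus_Publication.fas' into its three fields.

    Assumption: family and locus contain no underscore, so only the
    first two underscores are treated as separators.
    """
    stem = Path(filename).stem
    parts = stem.split("_", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected file name: {filename}")
    family, locus, publication = parts
    return family, locus, publication


def count_sequences(fasta_text):
    """Count FASTA records, i.e. header lines starting with '>'."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


# Hypothetical file name and records, for illustration only.
plastid = ">ind1\nACGT\n>ind2\nACGA\n"
nuclear = ">ind1\nACGT\n>ind1_clone2\nACGT\n>ind2\nACGA\n"

print(parse_name("Examplaceae_ITS_Author2010.fas"))
print(count_sequences(plastid), count_sequences(nuclear))
```

Comparing the two counts for a family shows whether the plastid and nuclear files are paired one to one or whether the nuclear file holds extra sequences for some individuals.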