Data from: Pitfalls during in silico prediction of primer specificity for eDNA surveillance

Cite this dataset

So, Ken Ying Kin (2020). Data from: Pitfalls during in silico prediction of primer specificity for eDNA surveillance [Dataset]. Dryad.


While high efficiency and cost-effectiveness are two merits of environmental DNA (eDNA) techniques for detecting aquatic organisms, the difficulty of designing species-specific primers can result in significant expenditure of time and money. During the in silico stage of primer development, primer specificity is predicted with alignment techniques such as BLAST that are based on the number and position of the primer/non-target template mismatches. However, we speculate that non-specific amplification is influenced by additional parameters, which lead to inaccuracies in in silico prediction. We performed in vitro specificity tests for 38 species-specific primers selected for seven fishes and six turtles, using singleplex conventional PCR (cPCR). A subset of 12 primer pairs was further tested with SYBR Green-based or TaqMan-based singleplex quantitative PCR (qPCR). We disentangled the relative importance of mismatch properties (types and positions), primer properties (length, GC content, and 3’ end stability), PCR conditions (template concentrations and annealing temperatures), and PCR technique (cPCR, TaqMan-based or SYBR Green-based qPCR) in determining the occurrence of amplifications. We then compared the PCR outcomes with the specificity check under two stringency scenarios based on alignment (i.e. BLAST search). We conducted a total of 679 cPCR and 226 qPCR analyses, with 90% of the reactions tested with non-target templates. Primer pairs predicted by Primer-BLAST to be specific rarely showed such specificity during the in vitro testing. BLAST searches correctly predicted the outcomes of around 67% of cPCR and qPCR reactions, but had low sensitivity in detecting non-target amplification (29–57%). Primer specificity increased significantly with the total number of mismatches and with annealing temperature, but decreased with higher GC content in the primer sequence. Mismatches that consisted of A-A, G-A, and C-C pairings had a 56% stronger reducing effect on non-specific amplification than other mismatches. To conclude, we show that prediction of primer specificity based only on the number and position of mismatches can be misleading. Our findings can be applied to increase the efficiency of the in silico primer selection process and so maintain the relatively high efficiency and cost-effectiveness of eDNA techniques.
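
To make the two summary figures above concrete (the share of reactions whose outcome was correctly predicted, and the sensitivity in detecting non-target amplification), here is a minimal Python sketch of how such metrics are conventionally computed from paired in silico and in vitro calls. It is not the authors' analysis code: the `Reaction` class, its field names, and the toy values are hypothetical and do not come from Table S3.

```python
from dataclasses import dataclass


@dataclass
class Reaction:
    """One primer pair tested against one non-target template (hypothetical layout)."""
    predicted_amplification: bool  # in silico call, e.g. from a BLAST-based check
    observed_amplification: bool   # in vitro outcome, e.g. a band or a Cq value


def accuracy(reactions):
    """Fraction of reactions whose in silico call matches the in vitro outcome."""
    correct = sum(
        r.predicted_amplification == r.observed_amplification for r in reactions
    )
    return correct / len(reactions)


def sensitivity(reactions):
    """Fraction of observed non-target amplifications that were predicted in silico."""
    amplified = [r for r in reactions if r.observed_amplification]
    if not amplified:
        return float("nan")  # undefined when nothing amplified
    return sum(r.predicted_amplification for r in amplified) / len(amplified)


# Toy values for illustration only; these are NOT results from the dataset.
toy_reactions = [
    Reaction(True, True),
    Reaction(False, True),
    Reaction(False, True),
    Reaction(False, False),
    Reaction(False, False),
    Reaction(True, False),
]

print(f"accuracy = {accuracy(toy_reactions):.2f}")        # accuracy = 0.50
print(f"sensitivity = {sensitivity(toy_reactions):.2f}")  # sensitivity = 0.33
```

In this toy set the prediction misses two of the three amplifications that actually occurred, which is the kind of low sensitivity the abstract describes: a primer pair judged specific in silico may still amplify non-target templates in vitro.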


For details of the methodology, please refer to So et al. (2020).

Usage notes

The data are the additional supporting information for the manuscript "Pitfalls during in silico prediction of primer specificity for eDNA surveillance" published in Ecosphere by So et al. (in press). The dataset contains three Excel files: (1) Table S1 summarizes the genetic sequences retrieved from GenBank and used for primer development; (2) Table S2 provides, in two separate spreadsheets, an overview of the number of independent replicates for each species during the in vitro cPCR and qPCR validation of primer specificity; and (3) Table S3 provides the results of the in vitro primer specificity validation using cPCR (n = 679) and qPCR (n = 226), together with the corresponding Primer-BLAST predictions.

Main manuscript:

So Y.K.K., Fong J.J., Lam I.P.Y. & Dudgeon D. (accepted). Pitfalls during in silico prediction of primer specificity for eDNA surveillance. Ecosphere.