Skip to main content

Data and code for: Terrestrial eDNA survey outperforms conventional approach for detecting an invasive pest insect within an agricultural ecosystem

Cite this dataset

Allen, Michael; Nielsen, Anne; Peterson, Donnie; Lockwood, Julie (2021). Data and code for: Terrestrial eDNA survey outperforms conventional approach for detecting an invasive pest insect within an agricultural ecosystem [Dataset]. Dryad.


Recent methodological advances permit surveys for terrestrial insects from the direct collection of environmental DNA (eDNA) deposited on vegetation or other surfaces. However, in contrast to well-studied aquatic applications, little is known about how detection rates for such terrestrial eDNA-based surveys compare with conventional survey methods. Lycorma delicatula, the spotted lanternfly, is an emerging invasive insect in eastern North America, and a significant ecological and economic pest of forested and agricultural systems, especially grapes. During fall 2019 we conducted two rounds of paired eDNA and visual surveys for spotted lanternflies within 48 plots at 12 vineyards in New Jersey, USA. We compared detection probabilities within a multimethod occupancy modeling framework and used the results to extrapolate and inform survey design. The probability of detecting spotted lanternflies given presence in a plot was over two times higher for eDNA (84%) versus visual surveys (36%). In mid-September, lanternfly eDNA was detected at five plots in three vineyards while visual surveys revealed only a single individual in one plot. In early October, after dispersal of lanternflies into vineyards, lanternfly eDNA was detected in 12 plots within six vineyards compared with visual detections in six plots in two vineyards. Extrapolations based on detection and local-scale occupancy rates indicate that only five and 12 plots would have been needed to positively detect lanternfly presence with 95% confidence using eDNA in contrast to 14 and 29 plots with visual surveys alone, respective to survey rounds. Log-linear models revealed that visual counts of lanternflies were positively related to eDNA concentrations (R2 = 71%). We provide some of the first quantitative evidence to support the enhanced sensitivity of terrestrial eDNA approaches compared with conventional methods. Such methods can augment efforts to combat invasive species through improved ability to delimit invasion fronts, identify satellite populations, and confirm local eradications.


The following data collection and processing methods are reproduced from the accompanying manuscript (EDN3-2020-0099.R2):

Allen, M. C., Nielsen, A. L., Peterson, D. L., Lockwood J. L. (2021). Terrestrial eDNA survey outperforms conventional approach for detecting an invasive pest insect within an agricultural ecosystem. Environmental DNA.

eDNA and visual surveys

We performed paired eDNA and visual sampling for spotted lanternflies at four plots in each of 12 vineyards in New Jersey, USA, for a total 48 plots (Table 1). Our sites spanned the active spotted lanternfly invasion front along the Pennsylvania border (latitudes: 39.540 to 40.616; longitudes: -74.536 to -75.411). The area of grapevines planted with the vineyards ranged from ~1–15 ha with a mean of ~5 ha. We sampled each plot twice: once during round one, 9–18 September; and once during round two, 30 September–14 October, with the exception of two plots that were only sampled once, during round two, for logistical reasons. Plots were located at least 50 m apart and chosen to maximize chances of lanternfly detection using a priori knowledge of likely congregation sites. Specifically, we prioritized plots near forest edges with host trees, near previous lanternfly detections, and near vineyard parking lots where hitchhiking individuals were most likely to be present. Each plot consisted of 12–24 grapevines, either all in the same row or split between adjacent rows. The exact number of vines per plot depended on their relative size, with 12 used for mature vines (~2 m wide) and up to 24 for younger vines (~1 m wide). This approach ensured all sampled plots contained approximately the same surface area of vegetation.

Spotted lanternfly eDNA was sampled via spray aggregation (Valentin et al., 2020a). This involved using backpack-mounted (15-L) or handheld (5-L) manually-pressurized sprayers filled with deionized water to spray foliage and horizontal stems (cordons) while collecting the resulting dripped rinse water into clean 8-L plastic buckets. All buckets and lids were sterilized with 10% bleach solution and rinsed at least three times with deionized water in the lab prior to deployment in the field to remove any residual traces of lanternfly DNA. We moved through each plot at a rate of ~2 m/min, spraying foliage and cordons as evenly and completely as possible at a height of 0.25–1.5 m while collecting rinse water. Sampling took place on one side of the vines only. While spraying, we also carefully inspected the vines visually, looking along stems and the upper and lower leaf surfaces for adult lanternflies or egg masses, and counting observed individuals within each plot. Environmental DNA from honeydew of related insects in the order Hemiptera can persist outdoors on foliage for ~4–8 d in the absence of rain, with a degradation rate of ~3%/hr (Valentin, Kyle, Allen, Welbourne, & Lockwood, 2020b). Therefore, unlike visual sampling, eDNA sampling was surveying for presence at the time of the survey and also up to a week prior. Honeydew is only produced during lanternfly feeding so a positive eDNA sample would strongly suggest feeding on the grapevine, not just a passing presence. To ensure adequate time for eDNA to build up on foliage, all sampling took place at least 24 hr after any rain events with precipitation amounts ≥ ~3 mm.

Water collected during spray aggregation sampling was filtered immediately onsite using 10-micron polycarbonate track-etched filters housed in Whatman plastic assemblies (Cytiva, Marlborough, Massachusetts, USA), platinum-lined silicon tubing, and a Pegasus Alexis peristaltic pump (Proactive Environmental Products, Bradenton, Florida, USA). Filter assemblies and tubing were bleach-sterilized and triple rinsed with deionized water in the lab prior to deployment. The 10-micron filter ensures capture of mainly intracellular eDNA (Valentin et al., 2020a). For each filter sample, we processed as much of the collected water as possible until clogging. Heterogeneity in the amount of particulate matter on foliage among plots resulted in a range of volumes filtered (mean: 666 mL per filter; range: 170–2300 mL). A single filter was collected at most plots. At fourteen plots, two or, in one case, three filter samples were collected, but one filter was tested from each plot. We used clean forceps (flame-sterilized with ethanol between each use) to place filters into sterile 1.5 mL centrifuge tubes, and preserved samples by freezing at -20°C within ~4 hours of collection (n = 34 samples), or by immediately fixing in 100% ethanol (n = 60 samples).

During each vineyard visit, we also performed one field positive and one field negative control sample to evaluate our sample handling procedures and to test for equipment contamination, respectively. Field negative controls were collected upon arrival at each vineyard, before sampling, and involved spraying deionized water into the clean sample buckets, transferring all water into a single bucket to be filtered, and processing as above. Field positive sampling involved placing ~10 μl of previously collected lanternfly honeydew (see Valentin et al., 2020a) onto grapevine leaves or directly into buckets, and then using sprayers to suspend the sample in deionized water in collection buckets to be filtered and processed as above.

Laboratory methods

We dried ethanol-preserved samples prior to extraction using a vacuum centrifuge, and DNA from these samples and the frozen samples were extracted using the HotSHOT method (Truett et al., 2000) within three months of sample collection. We used a TaqMan genetic assay and real-time qPCR (StepOnePlusTM, Applied Biosystems, Foster City, California, USA) to test for the presence of lanternfly eDNA. Specifically, the assay targeted a 63 base-pair segment within the ribosomal gene ITS1 that has been demonstrated to be highly sensitive for and specific to spotted lanternfly (described in detail in Valentin et al., 2020a).

We ran all qPCR reactions in triplicate and considered a sample positive if one of the three technical replicates amplified. Reactions consisted of 500 nM of each primer, 250 nM of probe, 1x TaqMan® Environmental Mastermix II with no UNG, 2 μl of extracted template, and nuclease free water to achieve a total reaction volume of 20 μl. The optimized PCR reaction included an initial denaturing step of 96°C for 10 min, then 45 cycles of denaturing at 96°C for 15 s and annealing and extension at 60°C for 1 min (Valentin et al., 2020a).

We created a standard curve for each qPCR run of samples using five serial 1:10 dilutions of genomic DNA extracted from spotted lanternfly leg tissues using a Qiagen blood and tissue kit and protocol. We used a Qubit 2.0 fluorometer (Invitrogen, Life Technologies, Carlsbad, California, USA) to quantify DNA amounts in the most concentrated sample for each curve (range: 0.8–13.4 ng/µl). We estimate eDNA concentrations per reaction based on the standard curves (efficiencies = 71.0–77.5%; R2 = 0.961–0.997), and the quantities in the full field sample collection bucket by extrapolating based on the volumes of water collected and filtered. We estimated the 95% limit of detection (LOD) of the assay by fitting a three-parameter log-logistic dose response curve to the pooled standard curve data after Klymus et al. (2020). We included a negative control with each qPCR run to test for in-lab contamination; however we did not include a separate negative control during the extraction process. To reduce the chances of in-laboratory contamination, we performed all qPCR laboratory preparations in a dedicated ‘low-copy’ room and within a UV-sterilized positive-flow hood.

Statistical analyses

We analyzed the paired visual and eDNA detection data using Bayesian multimethod occupancy models (Nichols et al., 2008; Kéry & Royle, 2015) in JAGS within the R programing environment (Kellner, 2019; R Core Team, 2020). We used non-informative prior distributions for all parameters (see Kéry & Royle, 2015, p. 607), and posterior distributions were sampled using three Markov chains of 10000 burn-in and 40000 subsequent iterations, thinned by ten, resulting in 12000 total samples. We evaluated MCMC chains for convergence graphically using trace plots and based on the Gelman-Rubin statistic (< 1.1). Point estimates and credible intervals for parameters were calculated as the median and quantiles of the posterior distribution for each parameter.

Multimethod occupancy models are a type of multiscale model that allow detection histories for different methods to be nested within sampling replicates, permitting sharing of presence information at plots among methods. This allows more efficient estimation of method-specific detection probabilities (p), as well as occupancy rates at two different scales: Ψ, the estimated true proportion of vineyards occupied in the landscape; and θ, the estimated true proportion of nested plot-sized areas occupied within occupied vineyards. For example, a detection history for a single vineyard in our study could be [10, 11, 00, 10], with pairs of numbers indicating detection (1) or non-detection (0) using eDNA and visual surveys, respectively, at each of the four plots. This hypothetical detection history indicates that lanternflies were detected using eDNA at the first, second, and fourth plots, but visually detected only at the second plot. With multimethod occupancy models, the information that lanternflies were actually present at two plots recently enough to leave intact eDNA, but went undetected by visual surveys, can be used in calculation of p for visual surveys; in traditional single-scale occupancy models this information would be ignored.

Using the terminology of Nichols et al. (2008), we treated vineyards as the primary “sample unit”, the four plots as spatially replicated “sampling stations”, and the two methods as complimentary “devices” for detecting recent presence of lanternflies within the plots. The data generating process was assumed to be (Kéry & Royle, 2015):

Vineyard-level presence or absence status: zi ~ Bernoulli(Ψi);

Plot-level presence or absence status: aij | zi ~ Bernoulli(zi * θij);

Method-specific detection or non-detection measurement: yijk | aij ~ Bernoulli(aij * pijk);

where zi is the true presence or absence state for each vineyard (i); aij is the true presence or absence state for each plot (j) during the time of the survey; and yijd is the detection or non-detection measurement for each device (k) during each individual plot survey. Detection probability (p), plot-scale occupancy (θ), and vineyard-scale occupancy (Ψ) were described using linear models with logit link functions. The p sub-model included a covariate indicating eDNA or visual survey method, while the θ sub-model included a covariate indicating survey round one or two. The covariate for survey round allowed us to characterize the expected increase in prevalence of lanternflies within vineyards as they moved in from the surrounding woodlands over the course of the study (Leach & Leach, 2020). We did not evaluate any covariates for Ψ as the primary motivation for conducting this analysis was the estimation of p and θ to inform survey design.

To explore the survey design implications of our model, we used model-estimated parameters to calculate the cumulative probability of detecting lanternflies at occupied vineyards (pvineyard) as a function of increasing numbers of plots sampled. We used the formula pvineyard = 1-(1- θij*pijk )n, where n is the number of plots sampled and θij*pijk represents the combined probability that lanternflies are present or recently present in a given plot-sized area within an occupied vineyard and that they are detected by the method employed. We set the equation equal to 0.95 and solved for n to estimate the number of plots required for each method and sample period combination to achieve 95% probability of detecting lanternflies in at least one plot within an occupied vineyard.

In addition to overall method-specific detection probability, we further explored the detection probability of the eDNA-based method at the level of the individual qPCR technical replicate. To do this, we followed Dorazio and Erickson (2018) and fit a Bayesian multiscale occupancy model on the eDNA detection data alone. This model was identical in structure to the multimethod model, except that it treated data from each of the three qPCR technical replicates as the presence or absence data nested within plot samples. We were unable to combine this and the multimethod occupancy model into a single analysis because visual surveys contained no nested data within plot surveys. While this model shares the same three-level multiscale structure as the multimethod model, it only considers presence information from eDNA and therefore the interpretation of some parameters are changed as follows: θij is now the underlying probability of capturing eDNA within a water sample during each plot survey; aij is the realized presence or absence status of eDNA within each water sample; pijk is the probability of eDNA detection per technical replicate (k) given presence within the water sample; and yijk is the realized detection or non-detection measurement of each qPCR technical replicate. Importantly, this model allowed us to measure detection probability at the level of the individual qPCR technical replicate, permitting extrapolation using cumulative probability formulae regarding the impact of increasing numbers of technical replicates on sample-level detection probability.

Finally, we used a generalized linear model (function glm; R Core Team, 2020) to relate eDNA quantity per field sample to the visual count of lanternflies in the plots. For this analysis, we included only plots with confirmed lanternfly presence via eDNA and/or visual surveys, and only plots from round two, as round one samples only contained visual sightings of 0–1 individuals. We log transformed eDNA quantity and log(x+1) transformed visual counts prior to analysis to normalize the data.

Usage notes

The data for this project are provided in .csv format with rows corresponding to individual plot surveys and columns to the variables (plot information and measurements). The variables are described in detail in the file "NJ_vineyards_metadata.txt". The R code to reproduce analyses and figures described in the associated manuscript are presented in an RMarkdown (.Rmd) file. This can be opened in a text editor, but is best viewed and run in the free program RStudio ( The JAGS code for the Bayesian occupancy model is within the RMarkdown file.


United States Department of Agriculture, Award: accession 1010519

United States Department of Agriculture, Award: project NJ17380

United States Department of Agriculture, Award: 2019-67031-29684

United States Department of Agriculture, Award: 2019-51181-30014