Data from: Assessing flower-visiting arthropod diversity in apple orchards through metabarcoding of environmental DNA from flowers and visual census
Data files
Dec 24, 2025 version files (806.92 KB)
- 5.Paper_Rscript.html (805.41 KB)
- README.md (1.51 KB)
Abstract
Arthropods are essential in maintaining healthy and productive agricultural ecosystems. Agricultural crops such as apples are typically pollinated by domesticated honey bees, but wild bees and other arthropod flower visitors also contribute to pollination. Flower visitors can also be natural enemies of crop-pests or herbivores. Biodiversity is under pressure, and knowledge of wildflower visitors is an important tool in designing orchards that can support high functional biodiversity. In our study, we assessed the diversity of arthropod flower visitors in four Danish apple orchards using both molecular and non-molecular techniques to study arthropod communities in agricultural ecosystems. Arthropod DNA collected from apple flowers was analysed using a DNA metabarcoding approach using the mitochondrial COI marker, while arthropod pollinators were recorded through visual assessment surveys. These complementary techniques resulted in a total of 19 arthropod taxa detected. Non-bee arthropods constituted a large proportion of arthropods detected by both methods (84%, 16 taxa). Metabarcoding detected 12 taxa and had 83% species resolution. Visual census recovered flower visiting groups to the order level (Coleoptera, Diptera, Hymenoptera and Lepidoptera) but not species level, and also provided relative abundance data, which is not possible with molecular methods. We demonstrated that by utilising both molecular and non-molecular techniques to assess arthropod communities, we are able to obtain a broader overview of the arthropod fauna present. The methodology used and the outcome of this study can be used to inform and tailor suitable arthropod-pest management practices in orchards to increase crop yield and maintain healthy agricultural systems.
Dataset DOI: 10.5061/dryad.w0vt4b8vk
Description of the data and file structure
Assessing flower-visiting arthropod diversity in apple orchards through metabarcoding of environmental DNA from flowers and visual census. All files are stored in Zenodo under the Supplemental Information Related works link, except for file #5, which is hosted on Dryad.
Files and variables
File: 1._Supp1.xlsx
Description: Supplemental Tables S1 to S13
File: 2._Supp2.docx
Description: Supplemental 2- Figures S1 to S2
File: 3._My_code_pollinators.bash.zip
Description: Bioinformatic script used to generate metabarcoding data contained in Supp1
File: 4.Paper_Rscript.Rmd
Description: R scripts used to generate metabarcoding and visual census data (Rmd) contained in Supp1 and Supp2
File: 5.Paper_Rscript.html
Description: R scripts used to generate metabarcoding and visual census data (html) contained in Supp1 and Supp2
File: 6._Paper_Rscript.zip
Description: .csv files (TableI1 to TableI8) used for the statistical tests in RStudio
File: 7.READ_ME_Publication.txt
Description: Description in .txt of everything listed before
Code/software
The software needed to open and use these data is:
Microsoft Excel and Word (Office package), Notepad++ and RStudio.
Arthropods were sampled in four apple orchards on Sealand, Denmark: one located 20 km north of Copenhagen (Frydenlund), two located 25 km and 37 km south-west of Copenhagen, respectively (Kildebrønde and Ventegodtgaard), and the Pometum, located 16 km west of Copenhagen and belonging to the University of Copenhagen (Fig. 1A; see Supplemental information 1 Table S1). Sites were separated by at least 9 km and located in an agricultural matrix. While Frydenlund and Kildebrønde were relatively large orchards with more than 20 rows of apples (more than 100 m wide), apple plots at the Pometum and Ventegodtgaard were only 7 and 10 rows wide, respectively (less than 40 m wide). Ventegodtgaard was managed organically, while the other three orchards followed integrated pest management (IPM). We use the term organic to mean eco-friendly farming managed according to the Danish state approval. In Denmark, IPM permits the use of few insecticides, principally pyrethroids. Only the Pometum orchard was treated with a pyrethroid against winter moth larvae. The orchards had honey bee hives placed either within the field (Frydenlund and Kildebrønde) or in the surrounding crops (the Pometum and Ventegodtgaard).
Arthropods were sampled at four different distances from the margin of the orchard (the side of the orchard where the flower strip was sown): row 1 (the first row of apple trees, 0 m from the margin, at the edge of the orchard), row 3, row 5 and row 10. Rows 3, 5 and 10 were approximately 5, 10 and 25 m from the margin of the orchard, respectively (Fig. 1B). In the smallest orchard, at the Pometum, the distance to the flower strip was greater because of a 10 m grass strip. In addition, tree rows in the Pometum were wider apart, so the last row in this orchard was row 7, which was also approximately 25 m from the margin and faced a strip of grass followed by a pear orchard (Fig. 1C). Finally, in Ventegodtgaard, row 10 was the final row before a hedgerow (Fig. 1B). The vegetation surrounding the orchards differed in species richness and abundance across the orchards. The Frydenlund margin had eight different flower species present when sampled. The Pometum had twelve flower species and pear trees flowering near the apples. In contrast, Ventegodtgaard and Kildebrønde had only one flower species each, and the margin was mainly composed of grass (see Supplemental information 1 Table S2). Sampling was conducted when the percentage of open apple flower buds was between 50% and 90%. This occurred in late May 2020 for two weeks until the end of the apple flowering period.
Metabarcoding of apple flowers
We followed the methods of Thomsen and Sigsgaard (2019) to analyse the environmental DNA (eDNA) present in the sampled apple flowers. In the morning, five individual apple flowers were collected in rows 1, 5, and 7 or 10 (the last sampled row) of each orchard. Whereas Thomsen and Sigsgaard (2019) collected 56 flowers, we collected a total of 60 flowers, each picked individually and stored in a separate sterile plastic tube (50 mL, Thermo Scientific). Collection was done using single-use sterile nitrile gloves to avoid contamination. Flower samples were stored at -20ºC prior to DNA extraction.
DNA extraction
DNA extraction was carried out at the Department of Plant and Environmental Science laboratories, University of Copenhagen. The experiment was performed in a PCR-free laboratory to prevent contamination. DNA was extracted in a flow hood using the Qiagen DNeasy Blood & Tissue Kit and protocol. First, the whole apple flower was transferred to a 2 ml Eppendorf tube. Lysis was performed by adding 900 µl of a cell lysis solution (ATL buffer) and 100 µl of proteinase K. Samples were disrupted in the TissueLyser II for 2 minutes at 30 Hz and incubated at 56ºC with agitation in a rotor for 3 hours. Samples were vortexed for 10 seconds before transferring 800 µl of the lysis mixture to a new 2 ml Eppendorf tube. 800 µl of lysis buffer (AL buffer) was added, and the mixture was mixed thoroughly by vortexing before incubation at 56ºC for 10 minutes. 800 µl of absolute ethanol was added to the mixture, followed by vortexing, before adding the mixture to the spin columns. The mixture was spun through the membrane filter over three rounds (700 µl per round) with 1.5 minutes of centrifugation at 8000 rpm after each round. The flow-through was discarded after every round. The spin columns were washed by adding 600 µl of wash buffer (AW1) and centrifuging for 1.5 minutes at 8000 rpm, followed by adding 600 µl of AW2 and centrifuging for 3.5 minutes at 14000 rpm. Each spin column was transferred to a new 2 ml Safe-lock Eppendorf tube and DNA was eluted in 2x60 µl AE buffer with a 15 minute incubation step at 37ºC before centrifugation (1.5 minutes at 10000 rpm). One extraction blank was included at the beginning of the process to test for possible contamination during the procedure. DNA extracts from apple flowers collected from the same row, but different apple trees in the same orchard, were pooled according to the DNA concentration and 260/280 ratio, measured with a microvolume spectrophotometer (mySPEC, VWR) (see Supplemental information 1 Table S3). This resulted in a total of 36 pooled DNA extracts and one extraction blank, which were stored at -20ºC prior to further analysis.
PCR amplification
All DNA extracts, including the extraction blank, were sent to AllGenetics & Biology SL (www.allgenetics.eu) for PCR amplification and sequencing. For eDNA flower metabarcoding, a 157 bp fragment of the COI genomic region was amplified using the primers ZBJ-ArtF1c (5’ AGA TAT TGG AAC WTT ATA TTT TAT TTT TGG 3’) and ZBJ-ArtR2c (5’ WAC TAA TCA ATT WCC AAA TCC 3’) (Thomsen & Sigsgaard, 2019; Zeale et al., 2011). Three PCR replicates were generated for each sample. PCR reactions for each sample were carried out in a final volume of 12.5 μL, containing 1.25 μL of template DNA, 0.25 μM of the primers, 6.25 μL of Supreme NZYTaq 2x Green Master Mix (NZYTech), CES 1x, and ultrapure water up to 12.5 μL. The PCR reaction was incubated as follows: an initial denaturation step at 95 ºC for 5 minutes, followed by 35 cycles of denaturing at 95 ºC for 30 seconds, annealing at 49 ºC for 45 seconds, and extension at 72 ºC for 45 seconds, and a final extension step at 72 ºC for 7 minutes. The oligonucleotide indices, which are required for multiplexing different libraries in the same sequencing pool, were attached in a second PCR round with identical conditions but using only five PCR cycles and 60 ºC as the annealing temperature. A PCR blank that contained 1.25 μL of ultrapure water instead of DNA (BPCR) was included to check for contamination during PCR library preparation. Additionally, PhiX Control v3 (Illumina) was used as a control library for Illumina sequencing runs. The PCR libraries were run on 2 % agarose gels stained with GreenSafe (NZYTech) and imaged under UV light to verify the library size (227 bp). PCR libraries were purified using the Mag-Bind RXNPure Plus magnetic beads (Omega Biotek), following the instructions provided by the manufacturer. PCR libraries were pooled together in equimolar amounts. Pooled PCR libraries were purified using the Mag-Bind RXNPure Plus magnetic beads (Omega Biotek). The pooled PCR libraries were sequenced in 3/16 of an Illumina MiSeq PE300 run.
Data analysis
Illumina paired-end raw files consist of forward (R1) and reverse (R2) reads sorted by PCR libraries and quality scores. The indices and Illumina sequencing adapters were trimmed during the demultiplexing step. Any remaining Illumina adapters were removed using the software CUTADAPT (Martin, 2011). The resulting trimmed sequences were used for further analysis. The analysis of the trimmed sequences was carried out using the OBITools package, which allows sorting and filtering of sequences based on taxonomy (Boyer et al., 2016). A total of 36 samples, one extraction blank (sample 37), and one PCR blank (BPCR) were included in the dataset. Forward and reverse reads were first merged (ILLUMINAPAIREDEND), and sequence records with an alignment score below 40 were removed as unaligned (OBIGREP). Reads were dereplicated into unique sequences by OBIUNIQ. Sequences with only a single copy (singletons) and sequences shorter than 100 bp were removed by OBIGREP. Amplification and sequencing errors generated during PCR and sequencing were identified and cleaned with OBICLEAN, using a threshold ratio of 5% (De Barba et al., 2014). For taxonomic assignment, the EMBL reference database (Deiner et al., 2017) was built through ecoPCR, as it is one of the main public databases used for taxonomic assignment of COI insect sequences (Meiklejohn et al., 2019). Taxonomic assignment for Zeale primers (Zeale et al., 2011) was performed using the ECOTAG program, which compares each sequence of the data set to the created taxonomic database (Boyer et al., 2016). Post-OBITools sequence filtering and merging of the taxonomic assignments were carried out with RStudio 1.4.1103 (R Core Team, 2020). Following the analysis detailed in Chua et al. (2021), only sequences that fulfilled the following criteria were kept: i) had a 98% match to our reference database, ii) had a minimum of three reads of each taxon observed within each PCR replicate (see Supplemental information 1 Table S4), and iii) occurred in at least two out of three PCR replicates of a sample (Ficetola et al., 2015; Rasmussen et al., 2021b) (see Supplemental information 1 Table S5). These criteria were followed to minimise the risk of false positives. No sequences were found in the extraction blank or the PCR blank when checking for possible contamination. Only taxa that belonged to the phylum Arthropoda were kept (see Supplemental information 1 Table S6). Sequences from the final dataset after filtering were also matched to the Barcode of Life Data System (BOLD) to check for discrepancies. Mismatches between the EMBL and BOLD databases were resolved as follows: i) the BOLD identification was kept if it was at a lower taxonomic resolution than EMBL, and ii) for mismatches at the species level, we reclassified the taxa to genus level if neither or both species can be found in Denmark (see Supplemental information 1 Table S6). The resulting output was grouped according to four different orders: Blattodea (BT), Coleoptera (CP), Diptera (DI), and Lepidoptera (LP) (see Supplemental information 1 Table S7).
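The read-count and replicate criteria (ii and iii) can be illustrated with a minimal Python sketch. This is not the authors' R code: the function name, the sample and taxon labels, the counts, and the choice to sum reads across passing replicates are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_READS = 3        # criterion ii: minimum reads of a taxon within one PCR replicate
MIN_REPLICATES = 2   # criterion iii: taxon must pass in at least 2 of 3 replicates


def filter_detections(read_counts):
    """Keep (sample, taxon) pairs supported by enough PCR replicates.

    read_counts maps (sample, replicate, taxon) -> number of reads.
    Returns a dict mapping (sample, taxon) -> total reads across the
    replicates that met the per-replicate read threshold (summing is an
    illustrative choice, not taken from the source).
    """
    passing = defaultdict(list)
    for (sample, _replicate, taxon), reads in read_counts.items():
        if reads >= MIN_READS:
            passing[(sample, taxon)].append(reads)
    return {
        key: sum(reads)
        for key, reads in passing.items()
        if len(reads) >= MIN_REPLICATES
    }


# Hypothetical counts for one pooled sample with three PCR replicates.
example = {
    ("S01", 1, "Taxon A"): 120,
    ("S01", 2, "Taxon A"): 85,
    ("S01", 3, "Taxon A"): 2,   # below the read threshold in replicate 3
    ("S01", 1, "Taxon B"): 40,  # passes in only one replicate
    ("S01", 2, "Taxon B"): 1,
}
kept = filter_detections(example)
print(kept)
```

In this made-up example, Taxon A is retained because two replicates each hold at least three reads, while Taxon B is dropped because only one replicate passes.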
Visual census
The data from the visual assessment were generated as part of the Beespoke project, in which arthropod flower visitor diversity and abundance were assessed as a function of distance from a flower strip. The visual assessment protocol was adapted from Westphal et al. (2008). An observer noted all the arthropod flower visitors within 2.5 meters on each side of the observer (covering two rows of apple trees) at each of the four previously established distances from the margin of the orchard. The type and number of arthropods were recorded for five minutes in each observational transect walk. Arthropods were identified to morpho-groups: Bumblebee (BB), Coleoptera (CP), Diptera (DI), honey bee (HB), Lepidoptera (LP), Syrphid (SY), and wild bee (WB). This classification was established following previous studies on typical pollinators found in apple orchards (Ramírez & Davenport, 2013). We did not sort Hymenoptera to family level but instead sorted them into honey bees, wild bees and bumblebees, as foraging habits and activity differ within the same family (Delaplane et al., 2000; Földesi et al., 2020; Gardner & Ascher, 2006; Pardo & Borges, 2020). Transects were performed at each orchard at two different times of the day: mornings (between 9:00 and 12:00) and afternoons (between 13:00 and 16:00). We sampled twice in Kildebrønde, the Pometum, and Ventegodtgaard, once in the morning and once in the afternoon. We sampled three times in Frydenlund, once in the morning and twice in the afternoon. Surveys were only carried out when wind speed did not exceed 7 m/s and the temperature reached a threshold of 10ºC on sunny days or 15ºC on overcast days (Ramírez & Davenport, 2013). These factors determined the days and times selected for sampling (using the regional weather forecast from the Danish Meteorological Institute).
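The weather rule for scheduling surveys can be written as a small check. The sketch below assumes the temperature thresholds are minimum values and that the wind limit is inclusive; the function and constant names are hypothetical.

```python
WIND_MAX_M_PER_S = 7.0      # surveys only when wind speed does not exceed this
MIN_TEMP_SUNNY_C = 10.0     # assumed minimum temperature on sunny days
MIN_TEMP_OVERCAST_C = 15.0  # assumed minimum temperature on overcast days


def survey_allowed(wind_m_per_s, temp_c, overcast):
    """Return True if a transect walk meets the stated weather conditions."""
    if wind_m_per_s > WIND_MAX_M_PER_S:
        return False
    threshold = MIN_TEMP_OVERCAST_C if overcast else MIN_TEMP_SUNNY_C
    return temp_c >= threshold


# Hypothetical forecasts: a mild sunny morning and a cool overcast afternoon.
print(survey_allowed(4.0, 12.0, overcast=False))
print(survey_allowed(4.0, 12.0, overcast=True))
```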
Statistical analysis
For metabarcoding, statistical analysis was carried out using the R package vegan (Oksanen et al., 2020). Only taxonomic units assigned to Arthropoda and present in Denmark were considered (see Supplemental information 1 Table S8). Sequence counts were analysed in two different ways: presence/absence of each insect taxon, measured using the frequency of occurrence (Fo), and relative read abundance (RRA), obtained by proportional summaries of counts (Deagle et al., 2019). Read counts were transformed into RRA data using the decostand function from the R package vegan (Oksanen et al., 2020). Summaries based on occurrence data (Fo) are more sensitive to rare taxa and pooled samples than RRA (Deagle et al., 2019). Thus, results were discussed using both RRA and Fo values. We used order as the taxonomic unit for the metabarcoding diversity analysis to compare results from both methodologies.
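As an illustration of the two summaries, the following Python sketch computes RRA as per-sample proportions (which is what a total-standardisation of rows amounts to) and Fo as the fraction of samples in which a taxon is detected. It is a stand-in for the R workflow, not a reproduction of it; the sample names and read counts are invented.

```python
def relative_read_abundance(counts):
    """Convert per-sample read counts to proportions of the sample total.

    counts maps sample -> {taxon: reads}.
    """
    rra = {}
    for sample, taxa in counts.items():
        total = sum(taxa.values())
        rra[sample] = {
            taxon: (reads / total if total else 0.0)
            for taxon, reads in taxa.items()
        }
    return rra


def frequency_of_occurrence(counts):
    """Fraction of samples in which each taxon has at least one read."""
    n_samples = len(counts)
    taxa = {taxon for sample in counts.values() for taxon in sample}
    return {
        taxon: sum(1 for s in counts.values() if s.get(taxon, 0) > 0) / n_samples
        for taxon in sorted(taxa)
    }


# Hypothetical read counts for two pooled samples.
counts = {
    "S01": {"Diptera": 75, "Coleoptera": 25},
    "S02": {"Diptera": 10, "Coleoptera": 0},
}
rra = relative_read_abundance(counts)
fo = frequency_of_occurrence(counts)
print(rra)
print(fo)
```

Here Coleoptera has an Fo of 0.5 because it is detected in one of the two samples, regardless of how many reads it contributes, whereas its RRA depends on its share of reads within each sample.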
For visual assessment, the sampling effort differed across the orchards. Due to meteorological conditions and orchard location, the four orchards were sampled on a different number of days. Honey bee (HB) data were not included in the arthropod richness analysis, as the honey bee occurs in Denmark only as a managed species (Rasmussen et al., 2021a). We considered morpho-groups as taxonomic units for the arthropod composition analysis. The effect of honey bee hives on wild pollinators was also studied. Orchards with honey bee hives placed within the field were considered orchards with close honey bee hives (Kildebrønde and Frydenlund), while those with hives in the surrounding crops were considered orchards with distant hives (Ventegodtgaard and the Pometum). Wild pollinators were defined as non-honey bee insects. A chi-squared test was used to test for dependency between the variables, while the type of relationship was measured by an odds ratio and risk estimate from the R package fmsb (Nakazawa, 2021).
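For a 2x2 table (for example, hive distance against pollinator type), the test statistic and effect measures can be sketched as below. This is a plain-Python illustration under stated assumptions: it uses the Pearson statistic without continuity correction and a Wald interval on the log odds ratio, which may differ from the defaults of the R functions used in the study, and the cell counts are invented.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% Wald interval on the log scale."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def risk_ratio(a, b, c, d):
    """Ratio of the row-wise proportions a/(a+b) and c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts: rows = close hives, distant hives;
# columns = wild pollinators, honey bees.
a, b, c, d = 30, 70, 60, 40
chi2 = chi_squared_2x2(a, b, c, d)
odds, lower, upper = odds_ratio_ci(a, b, c, d)
rr = risk_ratio(a, b, c, d)
print(round(chi2, 2), round(odds, 3), round(rr, 2))
```

With one degree of freedom, a statistic above 3.84 corresponds to p < 0.05, and an odds ratio interval that excludes 1 points the same way.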
To avoid biased results due to the small sample size and uneven sampling effort, rarefaction tests were carried out for both the molecular and non-molecular analyses to assess sampling completeness and the relationship between arthropod richness and type of orchard (Gotelli & Chao, 2013; Russo et al., 2015). We used arthropod order and orchard identity to compare the molecular and non-molecular methodologies. Arthropod communities were compared using 95% confidence intervals (Chao1 estimator) of the rarefaction curves and extrapolation of Hill numbers (species richness (q=0)). Differences across the expected diversities are significant when 95% confidence intervals do not overlap (Chao et al., 2014). The R package iNEXT (Chao et al., 2014; Hsieh et al., 2016) in R version 4.0.3, based on the bootstrap method, was used to assess the uncertainty of the proposed sample completeness measure. Orchard (Frydenlund, Kildebrønde, the Pometum, and Ventegodtgaard) was used as the sampling effort unit. For the molecular analysis, only occurrence data (Fo) were used, as they represented the presence and absence of the different genera in the orchards.
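To make the richness estimator concrete, here is a minimal Python sketch of the classic Chao1 lower bound computed from per-taxon abundance counts. It is only an illustration: it omits the finite-sample factor and bootstrap intervals that dedicated packages apply, incidence (presence/absence) data call for the analogous incidence-based form instead, and the counts shown are hypothetical.

```python
def chao1(abundances):
    """Classic Chao1 lower-bound estimate of richness.

    abundances is an iterable of per-taxon counts. f1 and f2 are the
    numbers of taxa observed exactly once and exactly twice.
    """
    counts = [n for n in abundances if n > 0]
    s_obs = len(counts)
    f1 = sum(1 for n in counts if n == 1)
    f2 = sum(1 for n in counts if n == 2)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form used when no taxon is observed exactly twice.
    return s_obs + f1 * (f1 - 1) / 2


# Hypothetical counts of individuals per taxon in one orchard.
estimate = chao1([10, 4, 2, 2, 1, 1, 1, 0])
print(estimate)
```

In this example seven taxa are observed, three of them once and two of them twice, so the estimate adds 3 * 3 / (2 * 2) = 2.25 undetected taxa to the observed richness.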
- Gomez, Nerea Gamonal; Sørensen, Didde Hedegaard; Shi Chua, Physilia Ying; Sigsgaard, Lene (2022). Assessing flower-visiting arthropod diversity in apple orchards through environmental DNA flower metabarcoding and visual census [Preprint]. Cold Spring Harbor Laboratory. https://doi.org/10.1101/2022.01.24.477478
- Gamonal Gomez, Nerea; Sørensen, Didde Hedegaard; Chua, Physilia Ying Shi; Sigsgaard, Lene (2022). Assessing flower‐visiting arthropod diversity in apple orchards through metabarcoding of environmental DNA from flowers and visual census. Environmental DNA. https://doi.org/10.1002/edn3.362
