Monitoring the birds and the bees: Environmental DNA metabarcoding of flowers detects plant–animal interactions
Data files
Sep 20, 2023 version files 3.08 GB
-
bc_16Smam.txt
-
bc_fwh1.txt
-
bc_fwh2.txt
-
bc_ND2_1.txt
-
bc_ND2_2.txt
-
MSRun494-FTP370_S1_L001_R1_001.fastq.gz
-
OTU_matrix_16s.csv
-
OTU_matrix_fwh.csv
-
OTU_matrix_nd2.csv
-
README.md
Abstract
Animal pollinators are vital for the reproduction of ~90% of flowering plants. However, many of these pollinating species are experiencing declines globally, making effective pollinator monitoring methods more important than ever before. Pollinators can leave DNA on the flowers they visit, and metabarcoding of these environmental DNA (eDNA) traces provides an opportunity to detect the presence of flower visitors. Our study, collecting flowers from seven plant species with diverse floral morphologies, for eDNA metabarcoding analysis, illustrated the value of this novel survey tool. eDNA metabarcoding using three assays, including one developed in this study to target common bush birds, recorded more animal species visiting flowers than visual surveys conducted concurrently, including birds, bees, and other species. We also recorded the presence of a flower visit from a western pygmy possum; to our knowledge, this is the first eDNA metabarcoding study to simultaneously identify the interaction of insect, mammal, and bird species with flowers. The highest diversity of taxa was detected on large inflorescence flower types found on Banksia arborea and Grevillea georgeana. The study demonstrates that the ease of sample collection and the robustness of the metabarcoding methodology have profound implications for future management of biodiversity, allowing us to monitor both plants and their attendant cohort of potential pollinators. This opens avenues for rapid and efficient comparison of biodiversity and ecosystem health between different sites and may provide insights into surrogate pollinators in the event of pollinator declines.
Methods
Study site
Our site was within the Helena and Aurora Range (Kalamaia name: 'Bungalbin'), in the Goldfields-Esperance region of Western Australia (Fig. 1). We undertook two visits: one in spring time September 2020, when visual pollinator surveys for insect, bird and mammal pollinators were conducted for six focal flowering species – Acacia adinophylla, Eremophila clarkei, Eremophila oppositifolia, Grevillea georgeana, Leucopogon spectabilis and Tetratheca aphylla subsp. aphylla, followed by the collection of flowers for eDNA analysis, and one in autumn May 2021, when the same procedures were repeated for Banksia arborea, which was not flowering on the first visit. These plants represented a range of species with different flower morphologies and different assumed pollinators. Furthermore, many of the plant species sampled are of conservation concern, with little information on pollinating taxa currently available (Fig. 2; supplementary material 5).
Flower sample collection
A total of 175 samples were taken from the same individual plants surveyed for bird and insect pollinators, and subsequently used in the eDNA metabarcoding study. Based on morphology and size, flowers were assigned to one of three categories: large inflorescence, small inflorescence, or single flower (Fig. 2). From the seven plant species, five complete flowers or inflorescences were collected from five individual plants totalling 25 flowers/inflorescences per species. To prevent contamination, single-use sterile nitrile gloves were worn during sampling and all flowers were collected in individual sterile plastic tubes. All samples were then stored on ice in a well-insulated polystyrene ice box (60 mm walls) after sampling and stored below -20 °C after returning from the field (within 48 hrs) until DNA extraction.
Bird surveys
Bird pollinator surveys were undertaken during peak foraging times for birds: before 1000 and after 1500 (Gilpin et al., 2017). Each plant was surveyed once during each time period, with all visible inflorescences on each focal plant observed simultaneously for 20 minutes from a distance of approximately 20 m to minimise disturbance (Gilpin et al., 2017). However, where it was not possible to survey from this distance due to vegetation and terrain restraints, surveys were undertaken from no less than 10 m and as close to 20m as possible. The specific plant and observation time for each plant was chosen at random to avoid temporal bias, and the species and number of birds that interacted with flowers recorded. A full list of the taxa observed are listed in Table S1.
Insect surveys
Insect pollinators were surveyed on the same individual plants surveyed for bird pollinator surveys during daylight hours when temperatures were warm enough for bee activity (approx. 0900–1600, >16 °C) by a single experienced entomologist (K.S.P). Surveys were conducted on each plant for 10 minutes, approximately 1 m away from the plants, except for flora that were inaccessible (L. spectabilis). Visiting insect taxa were classified to the lowest taxonomic level possible via visual observation in the field. For the purposes of analyses, insect taxa observed were collapsed to family level; a full list of the taxa observed are listed in Table S2.
Nocturnal surveys
Nocturnal surveys were conducted approximately 40 minutes after sunset using torches for 60 minutes throughout the study site on two consecutive nights to detect nocturnal mammal pollinators. No fauna was seen.
Bird primer design
Searches for available mitochondrial DNA sequences of 134 common bush birds of South-western Australian (SWWA) were conducted using NCBI’s nucleotide database (Altschul et al., 1990). The mitochondrial gene NADH dehydrogenase subunit 2 (ND2) provided the most extensive number of available sequences (120 species, each ~1,040 bp length). The identified avian ND2 sequences were downloaded from NCBI (Altschul et al., 1990) and aligned using the MUSCLE plugin (default settings) in Geneious v11.0.12 (Kearse et al., 2012). Based on the alignments, candidate universal primer sets were designed using Primer3 within Geneious (Koressaar & Remm, 2007; Untergasser et al., 2012). Candidate primers were then visually assessed to determine the best sites for primer pairing and maximum taxonomic resolution within the internal amplified regions. To allow primer annealing to polymorphic sites, both the forward and reverse primers were modified to include degenerate bases, and primer characteristics were analysed (G/C content, temperature, secondary structures, primer pair compatibility) to further improve the choice of primer using both Primer3 within Geneious (Koressaar & Remm, 2007; Untergasser et al., 2012) and the online bioinformatics tool Sequence Manipulation Suite: PCR Primer Statsn (Stothard, 2000). As a result, the following ND2 assay was designed to generate an approximately 229 bp amplicon (excluding primers): BirdND2F 5’ CCATTCCACTTYTGRTTYCC 3’ and BirdND2R: 5’GGGAGATDGADGARAADGC 3’.
in silico analysis was then performed to test the specificity of the priming sites for target taxa and resolution of internal amplified regions. Specificity of the primer set was tested through in silico PCR utilizing the “search_PCR” command in USEARCH v11.0.667 (Edgar, 2010), targeting both the newly generated target bird species list and custom generated full-length mitochondrial DNA reference databases (Sakata et al., 2022). These DNA reference databases were comprised of 18,283 mammal, 10,594 fish, 4,510 bird, 1,461 reptile and 746 amphibian sequences (35,597 vertebrate sequences). in silico PCR was performed under the following parameters: min amplicon size: 100, max amplicon size 1000, mismatch 0, 1, 2, 3. To determine the dissimilarity of internal amplified regions generated from both the custom target taxa database and full-length mitochondrial bird database the “dist.dna” function within the “ape” package (Paradis & Schliep, 2019) was used in R v.4.0.3 (R Core Team, 2013).
To assess PCR amplification, efficiency, sensitivity, and optimal annealing temperature, in vitro testing was conducted using the gDNA of two bird species: Red-tailed Black Cockatoo (Calyptorhynchus banksii) and Rainbow Lorikeet (Trichoglossus moluccanus). A temperature gradient from 51 to 61 °C, in combination with seven tenfold gDNA dilutions from 1:10 to 1:1012, inclusive of non-template controls (NTCs), were evaluated. The temperature gradient showed 53 °C as the optimal annealing temperature with NTCs undetected and gDNA detected at all levels of dilution from the qPCR assay.
To determine the efficacy of the newly developed assay, in vitro testing was conducted on bird bath water (n = 8; 500 mL each) from Perth, Western Australia, and a mixed positive genomic DNA (gDNA) sample containing Red-tailed Black Cockatoo (Calyptorhynchus banksii), Rock Dove (Columba livia), and Southern Boobook (Ninox novaeseelandiae) gDNA. Bird bath water samples were filtered through a Pall 0.45μm GN-6 Metricel® mixed cellulose ester membranes using a Pall Sentino® Microbiology pump (Pall Corporation, Port Washington, USA) and frozen at -21 °C prior to extraction. Filtered birdbath samples were extracted and amplified alongside mock tissue samples using birdND2 and 12S-V5 F2/R2 (Riaz et al., 2011) assays for comparison and then sequenced as described below.
Sample processing and DNA extraction
All laboratory processes were conducted in dedicated laboratories within the Trace & Environmental DNA (TrEnD) Laboratory, Curtin University, Perth, Western Australia. Large inflorescences (B. arborea, G. georgeana) were individually placed in decontaminated plastic containers and covered with purified distilled water (~ 500 mL) for 10 minutes, manually agitated twice for 30 seconds at 5-minute intervals. Water samples were then individually filtered across Pall 0.45μm GN-6 Metricel® mixed cellulose ester membranes using a Pall Sentino® Microbiology pump (Pall Corporation, Port Washington, USA) and frozen at -21 °C prior to extraction. For all but A. adinophylla samples, DNA was extracted using a DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany) with lysis performed by either adding a small flower, small inflorescence, or half filter membrane to a 2 mL plastic tube, adding 60 μL of Proteinase K, 540 μL of ATL lysis buffer and digesting for 17 hours at 56°C. For A. adinophylla inflorescences, all extraction parameters remained the same except for 120 μL of Proteinase K and 1080 μL of ATL lysis buffer added due to the absorbent property of the flowers. DNA was then extracted from the digest supernatant using the QIAcube extraction platform (Qiagen, Hilden, Germany). To detect possible cross-contamination during sample filtering, 500 mL of bleach solution, used for rinsing the pump containers between samples, was filtered and processed in the same way as field samples (n =2). Similarly, digest (n = 1) and extraction controls (n = 4) were processed with each batch of samples.
Assessment of DNA extracts
Two assays were used to amplify eDNA from all flowers (n = 175), fwhF2/fwhR2n (F: 5’-GGDACWGGWTGAACWGTWTAYCCHCC-3’; R: 5’- GTRATWGCHCCDGCTARWACWGG-3’; Vamos et al., 2017), hereafter referred to as fwh, designed to amplify a 205 bp fragment of the arthropod cytochrome c oxidase I (COI) gene region and birdND2 (229 bp) designed to target local bird species (see above). On all B. arborea (n = 25) and G. georgeana (n = 25) flowers, the 16Smam1/2 assay (F: 5’-CGGTTGGGGTGACCTCGGA-3’; R: 5’-GCTGTTATCCCTAGGGTAACT-3’; Taylor, 1996), hereafter referred to as 16Smam, designed to amplify a ~130 bp fragment of mammalian 16 S ribosomal gene region was also used to detect small mammals known to pollinate these taxa.
To assess the quality and quantity of DNA in each extract, and determine the optimal level of DNA input, quantitative PCR (qPCR) was carried out on three PCR replicates per sample at Neat, 1:10, and 1:100 dilutions (Murray et al., 2015). All qPCR reactions were performed on a StepOnePlusTM Real-Time PCR system (Applied Biosystems, Massachusetts, USA) with reaction volumes totalling 25 μl. Reactions contained 1× PCR Gold buffer (Applied Biosystems), 2.5 mM MgCl2 (Applied Biosystems), 0.4 mg/ml BSA (Fisher Biotec, Australia), 0.25 mM of each dNTPs (Astral 204 Scientific Australia), 0.4 μM forward and reverse primer, 1 U AmpliTaq Gold (Applied Biosystems), 0.6 μL SYBR Green (Life Technologies, USA) and 2 μL of template eDNA. Cycling conditions were initial denaturation at 95 °C for 5 min, followed by 55 cycles at 95 °C for 30 sec, then 30 sec at primer specific annealing temperature (50 °C for fwh, 53 °C for birdND2 and 55 °C for 16Smam), and 45 sec at 72 °C, ending with 10 min elongation at 72 °C. To detect possible contamination a non-template control (reagents only) was used, and a positive control to ensure efficacy of PCR reagents and cycling conditions.
DNA amplification and sequencing
Based on the qPCR amplification (detailed above), DNA extracts with sufficient DNA quantity and quality, were assigned a unique 6-8 bp multiplex identifier tag (MID tag) for each birdND2, 16Smam and fwh primer set. Independent MID tag qPCRs were prepared in a physically separate DNA-free laboratory, in duplicate, on each potentially positive extract and control with qPCR solutions containing unique combinations of MID tag (fusion) primers with the same reagents and cycling conditions described above. Template eDNA volumes varied based on the qPCR amplification ΔRn values, CT values, and melt curves generated, resulting in reaction volumes ranging from 25–35 μL containing 2-12 μL of template eDNA (Table S3). MID tag amplicons were then pooled together with other MID-tag amplicons according to ΔRn values. Pooled samples were then quantified using the QiAxcel Advanced System (Qiagen, Hilden, Germany) and concentrations (ng/µL) used to combine all pools in appropriate equimolar ratios creating a DNA sequence library. Size selection of amplicons (200 – 400 bp) was then performed using a PippinPrep (Sage Science, Beverly, USA) before samples were purified using a QIAquick PCR Purification Kit (Qiagen, Hilden, Germany), and re-quantified using a Qubit Fluorometer (Invitrogen, Carlsbad, USA). Sequencing was then performed on an Illumina MiSeq platform (Illumina, San Diego, USA), as per Illumina protocols for single-end sequencing (MiSeq® v2 Reagent Kit 300 Cycles PE), with a final library molarity of 6 pM containing 7 % PhiX.
Sequence analysis (filtering and taxonomic assignment)
Sequence filtering and taxonomic assignment were conducted using the eDNAFlow bioinformatics pipeline (Mousavi‐Derazmahalleh et al., 2021) via the supercomputer Magnus, based at the Pawsey Supercomputing Centre in Kensington, Western Australia. Quality checking of sequence reads was performed using FASTQC (Andrews, 2010), and quality filtered (minimum Phred quality score of 20), including trimming sequences with Ns employing AdapterRemoval v2 (Schubert et al., 2016). Demultiplexing was achieved using obitools (ngsfilter; Boyer et al., 2016) and sequences under 60 bp removed (obigrep). Dereplication and the creation of zero-radius operational taxonomic units (ZOTUs; min sequence abundance = 4) and ZOTU tables were then generated using the USEARCH unoise3 algorithm (Edgar, 2016). The ZOTUs were queried using the following parameters: % identity ≥ 97, evalue ≤ 1e−3, % query cover 100, max target sequences = 10 via a BLASTN search (Altschul et al., 1990) against NCBI’s GenBank nucleotide database (accessed January 2022). Erroneous ZOTUs were identified and removed using the post clustering curation method LULU with the minimum threshold of sequence similarity at 95 (Frøslev et al., 2017). A custom python script (Mousavi‐Derazmahalleh et al., 2021) was then used to taxonomically assign ZOTUs by the lowest common ancestor (LCA) approach, with taxonomic assignment collapsed to the LCA if the percent identity of two hits with 97 % query cover and 95 % identity, differed by less than 1 %.
Further filtering of the metabarcoding data was performed using the “phyloseq” package (McMurdie & Holmes, 2013) in R v.4.0.3 (R Core Team, 2013). Initially, sequence counts in eDNA samples below the threshold value of 5 were discarded. Following this, ZOTUs present in negative controls, common PCR contaminants (e.g., human, and ungulate sequences), and taxonomic assignments higher than family level were removed from the dataset. Species likely not present at the sample site, were also removed from the data set or collapsed to genus or family level based on species distribution data (e.g., Cymbacha similis to Cymbacha sp. and Exorista deligata to Exorista sp.). Genera Colluricincla and Manorina were then reassigned to Colluricincla harmonica (Grey Shrike-thrush) and Manorina flavigula (Yellow-throated Miner) respectively, based on existing species distribution data. ZOTUs from all metabarcoding assays were combined for further analysis and individual flower samples agglomerated to plants. The ZOTU table was then transformed to presence-absence data for community comparisons.
Statistical analysis
Statistical analysis was performed in R v.4.0.3 (R Core Team, 2013). Initially, animal ZOTU richness was calculated for each flowering plant species using the “phyloseq” package (McMurdie & Holmes, 2013). Species accumulation curves were then calculated using the number of observed ZOTUs by number of flower samples using the “accumcomp” function in the “BiodiversityR” package (Kindt & Coe, 2005) with A. adinophylla and L. spectabilis removed due to the detection of a taxa on only a single plant. Observed ZOTU richness was then square-root transformed to meet the assumption of normality and homogeneity of variance and an analysis of variance (ANOVA) used to compare the square-root transformed ZOTU richness detected between plant species (Factor = plant species, level 6). A Tukey-HSD post-hoc test was then run to assess the significant differences between plant species. Data was visualized using the package “ggplot2” (Wickham, 2016).
A Kruskal-Wallis rank sum test was used to compare the observed ZOTU richness between flower types (Factor = sample type, levels 3; large inflorescence, small inflorescence, single flower). A pairwise Wilcox post hoc test was then run to assess the significant differences between flower types with a Benjamini-Hochberg correction for multiple comparisons with visual surveys.
Data were then transformed to presence-absence data and a one-way permutational MANOVA (PERMANOVA) was conducted to compare the ZOTU presence/absence composition between the different species using the ‘vegan’ package (Oksanen, 2009) with Jaccard similarity and 9999 permutations. Pairwise comparisons were performed using the ‘PairwiseAdonis’ (Arbizu, 2020) package with a Benjamini-Hochberg correction for multiple comparisons. A Sankey plot was created using R package 'network3D' (Allaire et al., 2017) to show the links between plants and all taxa detected within this study.
To make direct comparisons with visual surveys, while bird and mammal species identifications were retained at species level, all arthropods detected using eDNA metabarcoding were collapsed to family level (see insect survey methods outlined above). The similarity between eDNA metabarcoding and conventional visual surveys was then calculated between species presence/absence matrices per plant (matrices of species x plants surveyed) using a Mantel test (Pearson method, 999 permutations) on dissimilarity indices calculated using the Jaccard method, computed with the function ‘mantel’ of the ‘vegan’ package (Oksanen, 2009).