Skip to main content

Elk food habits using DNA metabarcoding for plant identification

Cite this dataset

Muller, Lisa (2024). Elk food habits using DNA metabarcoding for plant identification [Dataset]. Dryad.


North American elk (Cervus canadensis) inhabited portions of the Eastern United States until extirpation in the mid-1800s. From 2000 to 2008, 201 elk were reintroduced to the North Cumberland Wildlife Management Area (NCWMA), Tennessee. The stocking source was Elk Island National Park, Alberta Canada where there are two distinct genetic populations isolated from the north and south. This genetic structure has largely persisted in the population after translocation. Food habits were evaluated in the early stages of restoration, but the population has had approximately 20 years to adapt to the landscape, and current food habits are unknown. To assess diet composition using DNA metabarcoding, we collected fecal pellets of elk from 65 openings within the 79,318-ha NCWMA weekly from February to April of 2019. We targeted the ITS2 region of the nuclear ribosomal DNA to amplify vegetation sequences found in the internal portion of the elk feces. DNA metabarcoding of feces was linked to results from an accompanying elk population genetics study to investigate food habits between sexes from the two different genetic groups. The majority (80.298%) of sequences matched plants from 22 genera. The top genera (>5.000%) represented were Vaccinium (15.216%), Festuca (8.446%), Rosa (6.358%), Robinia (5.793%), and Eleagnus (5.186%). Elk heavily used woody plants before and after spring green-up (>50% of diet). However, the quantity of forbs in their diet more than doubled after emergence in the spring.  The sex-genetic groups  consumed similar vegetation in approximately proportionate amounts. Diversity analyses revealed a significant difference in plant genera sequence detection between males from the two genetic groups, although this finding is likely explained by limited sample size. NCWMA elk used a variety of forage in the winter and DNA metabarcoding analysis allows for a comprehensive analysis of food habits useful for monitoring how elk respond dietarily to habitat management.

README: Elk food habits using DNA metabarcoding

Percentages of food items were from elk fecal samples collected in East Tennessee. We targeted the ITS2 region of the nuclear ribosomal DNA to amplify vegetation sequences found in the internal portion of the elk feces. Percentages of food items by sex (M or F) and Genetic Group (Elk Island National Park North (EINPN) and Elk Island National Park South (EINPS)) of source population.

Description of the data and file structure

Basic excel file of plant genera by percentage in the diet. Each food item is listed by plant genus and includes percentage of occurrence for each individual elk. The elk are identified by male or female and what genetic group their ancestors came from the original stocking source. These samples were collected From February to April 2019 in the North Cumberland Wildlife Management Area of Tennessee. Percentages are listed for each individual elk (with information about sex (M or F) and genetic group (EINPN or EINPS)) for each plant genera detected. Sex-Grouping is a combination of sex and genetic group. A value of 0 means there was no detection for that genus. Otherwise, the general percentages are calculated by the percentage of sequences found in that individual sample. Sample identification is listed in Column A. Any value starting with a "C" were from captured elk and fecal samples collected directly from the anus. Samples starting with an "S" were from fecal pellets collected in the field.

Sharing/Access information

Please contact authors for further information regarding use. We can provide GPS coordinates for all fecal samples collected.


We conducted this study within the NCWMA (79,318-ha) in the 271,145-ha Elk Restoration Zone that spans Scott, Campbell, Morgan, Claiborne, and Anderson counties (Figure 1; TWRA 2018). The NCWMA is made up of 86% deciduous forest, 12% openings from reclaimed coal strip mines and fields, and 1% cropland (TWRA 2000). Cabrera (1969) described the NCMWA as a mixed-mesophytic forest which included sugar maple (Acer saccharum), yellow-poplar (Liriodendron tulipifera), basswood (Tilia americana), buckeye (Aesculus flava), northern red oak (Quercus rubra), and chestnut oak (Quercus montana). Once an area used for strip, bench, and deep coal mining, the NCWMA has been left with shelves and benches, some of which (approximately 300 ha) have been converted to wildlife openings containing tall fescue (Festuca arundinacea) and Lespedeza (TWRA 2018). Reclaimed fields were often planted with cool season vegetation such as wheat (Triticum spp.), clovers (Trifolium spp.), turnips (Brassica spp.), and alfalfa (Medicago spp.; TWRA 2018). Annual warm season vegetation planted included soybeans (Glycine spp.), cowpeas (Vigna spp.), sunflowers (Helianthus spp.), and corn (Zea L. spp.; TWRA 2018). To ensure prime elk foraging, TWRA uses prescribed burning, herbicide treatment, mowing, and replanting on 2–3-year and 3–5-year cycles (TWRA 2018).

Fecal Collection

We collected feces from elk collared during a companion study and non-invasively collected pellets deposited in collection areas in 7 different zones on the NCWMA. We selected 65 openings for sampling based on their history of elk use and geographical representation of the majority of the NCWMA (Figure 1). Fecal collection sites varied in size, vegetative make-up, and elevation. We collected feces every week during the late winter and early spring months of 2019 (February through April) as this time period corresponded to an ongoing elk population genetics study that required fresh, well-preserved feces often associated with winter. We collected between 10–15 pellets from fresh fecal groups. We determined fecal freshness on factors such as color, moisture, smell, and luster (Kirchhoff and Larsen 1998, Murrow 2007, Lupardus et al. 2011). We only collected feces during dry periods, as precipitation has been shown to destroy genetic material in feces (Brinkman et al. 2010). After a rain event, we allotted a waiting period of at least 1 day before further collection to allow the elk to re-enter, feed from, and defecate in collection areas. We completed transects of up to 3.4 km (measured with GPS unit as 2.1 miles) using all-terrain vehicles (ATV) to move through fecal collection openings, and collected suitable samples found within 1 meter in any direction of the ATV. Using a new pair of latex gloves for every sample,, we picked up pellets using an inside-out labeled plastic bag and recorded a GPS point at every suitable sample using a Garmin eTrex 20x unit (Garmin Ltd., Kansas, USA). We recorded the general weather  conditions the night before collection as well as the temperature range, relative humidity, and dew point of the day of collection. We also recorded a description of the feces and its surroundings. We immediately placed the collected pellets in a cooler with plastic ice packs for transportation back to the laboratory where samples were placed in a -20°C freezer to preserve the integrity of the DNA in the pellets until the time of genetic analysis. The number of samples processed throughout the sequential steps of the workflow are listed in Table S1.

Feces were also collected during live elk captures that took place during the winters of 2019 and 2020, as described by Watson (2022). All research was approved by the University of Tennessee Institutional Animal Care and Use Committee (UT-IACUC 2671-0219). Throughout the study period we captured 29 elk (21 females and 8 males) at the NCWMA using corral traps and free darting with chemical immobilization. Elk were anesthetized with an intramuscular injection of BAM (Each 1 mL of BAM is comprised of 27.3 mg of butorphanol, 9.1 mg of azaperone, and 10.9 mg of medetomidine; ZooPharm, Windsor, Colorado, USA) administered by darts (PneuDart Inc., Williamsport, Pennsylvania) and a Dan-Inject CO2 Rifle (Dan-Inject, Kolding, Denmark). Elk were administered GPS collars and blood, hair, and feces were collected from captured elk. We reversed immobilization with an additional intramuscular injection of atipamezole (25 mg/mL; ZooPharm) and 0.5 mL of naltrexone (50 mg/mL; ZooPharm) and all elk were monitored post-reversal until they left the area.

Elk Genetics Microsatellite Analysis

For every sample we used a new flat toothpick to rim the outside of the pellet for elk DNA during a brief thaw from the freezer. We placed toothpicks in coin envelopes and sent samples to Wildlife Genetics International (Nelson, British Columbia, Canada), where they performed a DNA analysis of 16 microsatellites and an amelogenin sex marker used in previous elk work to confirm sex and genetic cluster (Muller et al. 2018, Watson 2022).The elk genetic samples were put through extractions and DNA purification following the methods used in Wood et al. (1999) and Paetkau (2003). PCR and PCR product visualization were completed using the methods used by Paetkau (1998).. Following lab protocols, results were put through the software programs CERVUS (version 3.0.3; Kalinowski et al. 2007) to evaluate heterozygosity and STRUCTURE (version 2.3.4; Pritchard et al. 2000, 2010) to determine if elk sampled originated from distinct populations (> 80% assignment to EINP-North or EINP-South populations). The individual methods and specifications used for these two programs are described in Muller et al. 2018.

Fecal Metabarcoding Analysis

The techniques we used including fecal pellet preparation, DNA extraction, two-step PCR, and final sequencing were chosen under the advisement of or performed by the University of Tennessee Genomics Core at 2641 Osprey Vista Way, Knoxville, TN, who provide high-throughput next-generation sequencing services and training. We dried feces by placing 5 frozen fecal pellets per sample for each individual elk into a 50 mL tube with silica gel beads. Upon placement into the silica, the tube was returned to the freezer until the sample was completely dry. We cut 4 of the 5 dried fecal pellets from each sample for extraction, the fifth pellet was returned to the freezer for any necessary future re-extractions. We used high precision single edge carbon steel razor blades to cut into the pellets, exposing the inner portion of the feces. The inner portion of the fecal pellet was excised weighed and used for analysis. The starting weight of each dried sample for DNA extractions was between 0.15–0.18 g. We completed DNA extractions using the Qiagen QIAmp PowerFecal DNA Kit (Qiagen, Hilden, Germany) following the manufacturer’s protocol. We stored final products in a freezer at -20°C. For the first 61 samples, each pellet from the group of 4 was treated as an individual sample throughout the entire protocol. However, after the final elution step, we pooled 25 mL of final product from each of the 4 pellets from one group into 1 tube for sequencing. For the remaining 296 collection samples, approximately 0.04 g of dried material was taken from each of the 4 pellets individually and placed into one tube to be treated as a single sample during the extraction process. The 21 samples obtained from  live elk capture that we extracted were put through the latter protocol of combining a smaller amount of material from each individual pellet to represent the singular sample, totaling 0.15–0.20 g of feces per sample at the start of the extraction process. We put extraction samples through an initial PCR following Illumina’s 16S Metagenomic Sequencing Library Preparation protocol, Part # 15044223 Rev. B [hereto after cited as (Illumina)]. In the initial PCR, we used Kapa Hifi master mix taq (Roche) with the forward primer, ITS2-2For (TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG; Chen et al. 2010), and the reverse primer, ITS2-3Rev (GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG; Chiou et al. 2007), to amplify the approximately 475 bases of the ITS2 region of the nuclear ribosomal DNA given its proven plant identification efficacy Illumina-specific adapters were added to both the forward and reverse primers [(ITS2-2For: -ATGCGATACTTGGTGTGAAT) (ITS2-3Rev: -GACGCTTCTCCAGACTACAAT): (Illumina)]. Initial primers were purchased with the additional bases necessary for the Illumina two-step PCR process. The initial PCR consisted of 2 × KAPA HiFi HotStart ReadyMix Taq (Roche, Indianapolis, Indiana, United States) and 7.5 uM each primer. The reaction consisted of 3 min at 95°C, followed by 35 cycles of 95°C for 30 s, 62.5°C for 30 s, and 72°C for 30 s, with a final extension at 72°C for 5 min. After amplification was verified using a 2% agarose gel, we purified the PCR product using Agencourt Ampure XP beads (Beckman Coulter) to eliminate any remaining primer and primer-dimers. Following bead clean-up, we completed an index PCR using Nextera XT Version 2 indexes (Illumina), As there were too many samples to specify individual primer tags, we used the Nextera indexes. The conditions for the index PCR were the same as the initial PCR, except with 8 cycles instead of 35. This step was again followed by purification with Agencourt Ampure XP beads. We measured the 260/280 and 260/230 ratios and concentrations of amplified DNA (ng/µL) in samples using a Nanodrop spectrophotometer. We then pooled an approximately equal molar amount of each sample and normalized the samples before visualizing the products for a final quality and quantity check on an Agilent Bioanalzyer to confirm the expected product size of approximately 475 bases and to confirm the final molarity of the pooled samples. We diluted the final amplicon product to a final concentration of 4 pM following Illumina’s specifications, combined the sample with 20% PhiX control (Illumina), and loaded it onto a Version 3 flow cell reading 275 bases paired-end on the Illumina MiSeq at the University of Tennessee’s Genomics Core Facility. The specifications of the flow cell were selected to obtain sequences long enough for bioinformatical overlap while still maintaining a high quality read.

To determine plants identified from sequencing, we sent our sequence data result files from the Illumina MiSeq read to the MRDNA Molecular Research lab (Shallowater, Texas, USA) who performed bioinformatics using a custom pipeline. The MRDNA group has provided bioinformatic and DNA metabarcoding support and services in similar studies (Kerr et al. 2013, Noriega et al. 2016, Nelms et al. 2018, Dixon, 2021, Olin et al. 2023). Primer sequences and those sequences with a length of <150 bp (base pairs) were removed. Sequences of this length were defined as “low-read” (this number was calculated by multiplying our average sequence read per sample (~15,000 bp) by 0.01, defining 1% as 150 bp). Remaining sequences were quality filtered with a maximum expected error threshold of 1.0 and dereplicated. Though the expected number of errors for a filtered read is zero, errors may occur. Thus, a maximum error threshold of 1.0 was selected to account for these errors. During dereplication unique sequences were identified and only one copy of every sequence was included in the taxonomic identification analysis. Following dereplication, the denoising algorithm, UNOISE2, was implemented to identify probable sequencing errors, and remove unique sequences, PCR errors and chimeras which produced denoised sequences referred to as zero-radius operational taxonomic units (zOTUs). To classify these final zOTUs taxonomically, BLASTn was used against a curated sequence database derived from NCBI (NCBI Resource Coordinators 2018). BLASTn was implemented with default settings and the curated database was created by the MRDNA group. Final taxonomically-identified zOTUs underwent proportional calculations to determine the abundance of plants identified from each individual sample.

Statistical Analyses

The MRDNA group performed accompanying statistical analyses using XLstat, NCSS 2007, “R”, and NCSS 2010 (Addinsoft 2019, Hintze 2007, R Core Team 2017). Analyses performed by the MRDNA group were conducted on the four assigned sex-genetic groups (F-EINPN, F-EINPS, M-EINPN, and M-EINPS). For analyses we focus on descriptive analyses due to large variation in sample size among the number of individuals successfully assigned to each of our four sex-genetic groups (F-EINPN n=14; F-EINPS n=44; M-EINPN n=4; M-EINPS n=14). To determine if specific plant genera used by elk differed among sex-genetic groups, comparisons were made using an ANOVA and post hoc pairwise comparison using Tukey’s test for rarified genera data whose relative abundance was > 0.1000% (n= 139 plant genera). For these analyses, individual genera acte as the dependent varibalbe and sex-genetic group as the independent variable.  To compare richness and evenness of plant genera found in each sample, alpha diversity was estimated for each sex-genetic group and compared against each other. The alpha diversity analysis used a Shannon Diversity index analysis with zOTUs using Kruskal-Wallis pairwise comparisons which assesses not only richness, but also the evenness of the genera for the sample group from zOTUs. For this study the analysis was conducted for each sex-genetic sample group; the pairwise comparison evaluates the diversity measurement of one sample group to another, producing an H test statistic and a p-value. A high H test statistic and p- <0.0125 (Bonferroni p-value calculated determined by dividing 0.05/4 = 0.0125 (number of sex-genetic groups)) indicates that the alpha diversity of the genera found in the samples of the two groups being compared are significantly different. The alpha diversity analysis was conducted as described in previous studies (Dowd et al. 2008a, Dowd et al. 2008b, Edgar 2010, Eren et al. 2011, Swanson et al. 2011) using Qiime 2 (Bolyen et al. 2018) wherein samples were rarefied to 1,000 sequences and significance was assigned for those analyses with a p-value less than 0.0125. To determine community diversity of plant genera between sex-genetic groups, beta diversity was measured using weighted UniFrac distance matrices. Weighted UniFrac analyses sum the phylogenetic branch lengths of sequences from the studied communities and account for abundance of OTUs (Chang et al. 2011). From these matrices, a principal coordinate analysis (PCoA) plot was utilized to visualize the data, and a pairwise analysis of similarities (ANOSIM) was used to detect community plant genera differences. The ANOSIM calculates the ratio statistic “R”; an R calculation close to 1.0 implies that the groups being compared are dissimilar, while an R value closer to 0.0 indicates a similar diversity of samples between the compared groups.

To assess forage class consumption by elk, we performed statistical analyses using XLStat (Addinsoft 2019). Similar to Lupardus et al. (2011), we classified sequences identified at the genus-level within the Plantae kingdom into 1 of 4 forage categories: woody plant, graminoid, forb, and legume. We then calculated the proportion of all plant sequences found for each forage class. To determine if forage class consumption by the sampled elk changed through the sampling period, we calculated the proportions of these forage classes before and after spring green-up (SGU) for all samples put through sequencing and bioinformatics collected during the field season. Based on field observations of increased forb and grass growth and the emergence of spring blooms on trees we determined SGU to occur in mid-March, and thus classified samples as before-SGU if they were collected between the start of sampling (2 February 2019) up until 14 March 2019. We classified all samples as after-SGU if they were collected between 18 March 2019 to the end of the sampling period (25 April 2019). The proportion of each genus was calculated by adding all sequences found for that genus from all samples and dividing it by the total number of samples. To investigate whether forage class consumption differed before and after SGU, we performed a Fisher’s exact test (Proc FREQ; SAS Institute Inc. 2018). This analysis was completed on the proportions of the forage classes for all sampled elk when sequence detection was >1.000% (sequences for the genus represented more than 1.000% of all sequences found following abundance calculations). We categorized genera both before and after SGU (forb, woody plant, graminoid, and legume) and compared them against the total proportion of all forage classes combined. The p-values calculated for the forage classes were two-sided and compared against a Bonferroni corrected p-value. The Bonferroni corrected p-value was calculated by dividing 0.05 by the number of categories and this adjusted value was used for significance comparison in all 4 analyses (Bonferroni-corrected P=0.05/4=0.0125). We used the Shannon-Weiner Function (Krebs 1989) to evaluate diversity of plant genera before and after SGU. Using

H = (sum of i=1 to g)(pi)(ln[pi])

Where H can be 0 if only 1 genus is collected. The value for g is the total number of detected genera and pi is the proportion of genera within a sample which is calculated based on the proportion for each genus.


Tennessee Wildlife Resources Agency

USDA National Institute of Food and Agriculture, Award: TEN00MS-113

University of Tennessee SNR, School of Natural Resources