Data from: Impacts of proactive health management on cattle and horse diets and dung biodiversity in Danish rewilding areas
Data files
May 17, 2025 version files 291.64 GB
- README.md (8.93 KB)
- RWDK_COI_classified_corrected.txt (16.10 MB)
- RWDK_COI_classified.txt (14.57 MB)
- RWDK_COI_DADA2_nochim.otus (216.61 MB)
- RWDK_COI_DADA2_nochim.table (3.78 GB)
- RWDK_COI_example_batchfileDADA2.list (64 B)
- RWDK_COI_motus_RWEU_campaign.csv (29.29 KB)
- RWDK_COI_raw_data.tar.gz (140.45 GB)
- RWDK_COI_read_table_RWEU_campaign.csv (11.46 KB)
- RWDK_COI_samples_RWEU_campaign.csv (2.82 KB)
- RWDK_ITS_classified_corrected.txt (106.48 MB)
- RWDK_ITS_classified.txt (103.72 MB)
- RWDK_ITS_DADA2_nochim.otus (40.89 MB)
- RWDK_ITS_DADA2_nochim.table (322.99 MB)
- RWDK_ITS_example_batchfileDADA2.list (64 B)
- RWDK_ITS_motus_RWEU_campaign.csv (18.03 KB)
- RWDK_ITS_raw_data.tar.gz (146.58 GB)
- RWDK_ITS_read_table_RWEU_campaign.csv (7.34 KB)
- RWDK_ITS_samples_RWEU_campaign.csv (3.79 KB)
- tagfiles.tar.gz (9.84 KB)
Abstract
Reintroducing megafauna to reinstate missing top-down trophic interactions (trophic rewilding) is increasingly being applied as a tool to promote self-regulating, biodiverse ecosystems. Even though the theoretical background is clear, and megafauna effects are documented from prehistoric ecosystems, the effects of reintroduced herbivores in contemporary ecosystems remain understudied. This includes how reintroduced megafauna interact with each other and the ecosystem, but also how current management practices affect the processes they provide. In this study, we investigated the effects of proactive health management, i.e., winter feeding and anti-parasitic treatments, on the ecosystem by examining diets of large herbivores and dung-associated invertebrate communities. We used environmental DNA metabarcoding to yield community compositions of plants and invertebrates in dung from cattle and horses from five comparable nature sites in Denmark, which differed in management, and site/population-specific properties such as availability of woody plant species, herbivore densities, and provision of winter feeding and anti-parasitic treatments. We found different diet compositions between cattle and horses, highlighting their functional differences. For example, horse samples had higher relative read abundances of graminoid and tree DNA. Supplementary feeding affected diets, by decreasing consumption of graminoids and tree species relative to forbs and legumes, probably originating from fodder, and intense feeding seemed to almost eliminate consumption of local vegetation. However, more studies are needed to generalize these findings. Several invertebrate families were associated with either cattle or horse dung, suggesting complementary effects on dung-associated invertebrate biodiversity by these large grazers. 
The taxa that responded negatively to anti-parasitic treatments were mainly parasitic nematodes (e.g., the families Ancylostomatidae, Cooperidae, and Strongylidae), suggesting that the applied treatments work as intended, but these results should be interpreted with caution due to methodological limitations.
Synthesis and application. Our findings demonstrate functional differences between cattle and horses, suggesting complementary effects on vegetation development and, consequently, biodiversity. Our results also indicate that this functionality is impacted by proactive health management actions. We suggest that potential effects on herbivory and biodiversity are carefully considered before supplementary feeding or anti-parasitic treatments are provided in year-round grazing systems, and that such interventions are avoided if possible.
Content of this data repository:
- A compressed archive (.tar.gz) is uploaded for each dataset (ITS & COI). Each archive includes a directory per library, containing two raw sequencing output files (ending in .fq.gz).
- A compressed archive (tagfiles.tar.gz) including all the files for demultiplexing (one per library, ending in tags.txt).
- An example file for each dataset to use in MetaBarFlow (*example_batchfileDADA2.list).
- The following output files from MetaBarFlow for each dataset (file names start with RWDK_COI_ and RWDK_ITS_, respectively):
  - *classified.txt (the original taxonomic classification file of ASVs from MetaBarFlow)
  - *classified_corrected.txt (the manually corrected taxonomy file)
  - *DADA2_nochim.otus (list of all ASVs defined by MetaBarFlow)
  - *DADA2_nochim.table (ASV/sample read count matrix)
- Files containing data from samples added from another sampling campaign, which covered two of the sites and was carried out in August 2022. There are three files for each dataset: one with the samples (*samples_RWEU_campaign.csv), one with the MOTUs (*motus_RWEU_campaign.csv), and one with the read count table (*read_table_RWEU_campaign.csv). These can be added to the remaining data following the scripts at https://github.com/emilthomassen/RWDK_public.
All files begin with the prefix “RWDK” which is a project ID, followed by either “ITS” or “COI” corresponding to the two different datasets. The COI dataset consists of amplicon reads generated by the BF1/BR1 COI primers (Elbrecht & Leese, 2017, https://doi.org/10.3389/fenvs.2017.00011), and the ITS dataset consists of amplicon reads generated by the ITS2-S2F/ITS4 primers (Fahner et al., 2016, https://doi.org/10.1371/journal.pone.0157505).
Bioinformatic pipeline:
Start by uncompressing the .tar.gz archives containing the tag files and the raw sequencing data for the ITS and COI datasets:
tar -xzf tagfiles.tar.gz
tar -xzf RWDK_COI_raw_data.tar.gz
tar -xzf RWDK_ITS_raw_data.tar.gz
Setting up the directory structure on an HPC cluster
mkdir root_dir
cd root_dir
mkdir -p COI/backup/data/raw_data
mkdir -p COI/tmp
mkdir -p COI/results
mkdir -p ITS/backup/data/raw_data
mkdir -p ITS/tmp
mkdir -p ITS/results
Place all downloaded files in the root directory (root_dir)
Set up directories for each library
cd COI/backup/data/raw_data
mkdir L11 L12 L13 L14 L21 L22 L23 L24 L31 L32 L33 L34 L41 L42 L43 L44 L51 L52 L53 L54 L61 L62 L63 L64 L71 L72 L73 L74
cd ../../../../ITS/backup/data/raw_data
mkdir L081 L082 L083 L084 L091 L092A L092B L093 L094 L101 L102 L103 L104 L111 L112 L113 L114 L121 L122 L123 L124 L131 L132 L133 L134 L141 L142 L143 L144
cd ../../../..
Putting the files in the right places
mv RWDK_1_1* COI/backup/data/raw_data/L11/.
mv RWDK_1_2* COI/backup/data/raw_data/L12/.
mv RWDK_1_3* COI/backup/data/raw_data/L13/.
mv RWDK_1_4* COI/backup/data/raw_data/L14/.
mv RWDK_2_1* COI/backup/data/raw_data/L21/.
mv RWDK_2_2* COI/backup/data/raw_data/L22/.
mv RWDK_2_3* COI/backup/data/raw_data/L23/.
mv RWDK_2_4* COI/backup/data/raw_data/L24/.
mv RWDK_3_1* COI/backup/data/raw_data/L31/.
mv RWDK_3_2* COI/backup/data/raw_data/L32/.
mv RWDK_3_3* COI/backup/data/raw_data/L33/.
mv RWDK_3_4* COI/backup/data/raw_data/L34/.
mv RWDK_4_1* COI/backup/data/raw_data/L41/.
mv RWDK_4_2* COI/backup/data/raw_data/L42/.
mv RWDK_4_3* COI/backup/data/raw_data/L43/.
mv RWDK_4_4* COI/backup/data/raw_data/L44/.
mv RWDK_5_1* COI/backup/data/raw_data/L51/.
mv RWDK_5_2* COI/backup/data/raw_data/L52/.
mv RWDK_5_3* COI/backup/data/raw_data/L53/.
mv RWDK_5_4* COI/backup/data/raw_data/L54/.
mv RWDK_6_1* COI/backup/data/raw_data/L61/.
mv RWDK_6_2* COI/backup/data/raw_data/L62/.
mv RWDK_6_3* COI/backup/data/raw_data/L63/.
mv RWDK_6_4* COI/backup/data/raw_data/L64/.
mv RWDK_7_1* COI/backup/data/raw_data/L71/.
mv RWDK_7_2* COI/backup/data/raw_data/L72/.
mv RWDK_7_3* COI/backup/data/raw_data/L73/.
mv RWDK_7_4* COI/backup/data/raw_data/L74/.
mv RWDK_8_1* ITS/backup/data/raw_data/L081/.
mv RWDK_8_2* ITS/backup/data/raw_data/L082/.
mv RWDK_8_3* ITS/backup/data/raw_data/L083/.
mv RWDK_8_4* ITS/backup/data/raw_data/L084/.
mv RWDK_9_1* ITS/backup/data/raw_data/L091/.
mv RWDK_9_2* ITS/backup/data/raw_data/L092A/.
mv RWDK_9_3* ITS/backup/data/raw_data/L093/.
mv RWDK_9_4* ITS/backup/data/raw_data/L094/.
mv RWDK_10_1* ITS/backup/data/raw_data/L101/.
mv RWDK_10_2* ITS/backup/data/raw_data/L102/.
mv RWDK_10_3* ITS/backup/data/raw_data/L103/.
mv RWDK_10_4* ITS/backup/data/raw_data/L104/.
mv RWDK_11_1* ITS/backup/data/raw_data/L111/.
mv RWDK_11_2* ITS/backup/data/raw_data/L112/.
mv RWDK_11_3* ITS/backup/data/raw_data/L113/.
mv RWDK_11_4* ITS/backup/data/raw_data/L114/.
mv RWDK_12_1* ITS/backup/data/raw_data/L121/.
mv RWDK_12_2* ITS/backup/data/raw_data/L122/.
mv RWDK_12_3* ITS/backup/data/raw_data/L123/.
mv RWDK_12_4* ITS/backup/data/raw_data/L124/.
mv RWDK_13_1* ITS/backup/data/raw_data/L131/.
mv RWDK_13_2* ITS/backup/data/raw_data/L132/.
mv RWDK_13_3* ITS/backup/data/raw_data/L133/.
mv RWDK_13_4* ITS/backup/data/raw_data/L134/.
mv RWDK_14_1* ITS/backup/data/raw_data/L141/.
mv RWDK_14_2* ITS/backup/data/raw_data/L142/.
mv RWDK_14_3* ITS/backup/data/raw_data/L143/.
mv RWDK_14_4* ITS/backup/data/raw_data/L144/.
Move data for the library where additional sequencing was performed (L092) to a separate directory:
cd ITS/backup/data/raw_data/L092A
mv HFVJCDRX3 ../L092B/.
cp RWDK_9_2_tags.txt ../L092B/.
cd ../../../../..
Unzip the “fq.gz” files within each library (example run):
cd COI/backup/data/raw_data/L11/
gunzip *.gz
cd ../../../../..
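Repeating this for every library is tedious; the loop below (a sketch, assuming the directory layout created above) decompresses all libraries in one pass. Files nested in run-ID subdirectories (as in L092A/L092B) may need an extra directory level:

```shell
# unzip_libraries: decompress all .fq.gz files in every library directory.
unzip_libraries() {
  local dir f
  for dir in COI/backup/data/raw_data/L*/ ITS/backup/data/raw_data/L*/; do
    for f in "$dir"*.fq.gz; do
      [ -e "$f" ] && gunzip "$f"   # skip directories with nothing to unzip
    done
  done
}
unzip_libraries   # run from root_dir
```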
- Create a file named “batchfileDADA2.list” in each sequencing directory.
- Replace “File1.fq” & “File2.fq” in “RWDK_COI_example_batchfileDADA2.list” and “RWDK_ITS_example_batchfileDADA2.list” with the names of the raw data files (forward and reverse) for each library. Example:
RWDK_1_1_FKDN230260638-1A_HW7HWDSX5_L1_1.fq RWDK_1_1_FKDN230260638-1A_HW7HWDSX5_L1_2.fq ACWGGWTGRACWGTNTAYCC ARYATDGTRATDGCHCCDGC 100
- Make sure the file uses UNIX line breaks and that the line, as exemplified, is followed by an empty line.
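Windows-style line endings are a common cause of failures at this step. A small helper (a sketch using standard tools; the function name is ours, not part of MetaBarFlow) converts a batchfile to UNIX line breaks and appends the required trailing empty line if it is missing:

```shell
# fix_batchfile: convert CRLF to LF and make sure the file ends with an
# empty line, as required for batchfileDADA2.list.
fix_batchfile() {
  local f="$1"
  tr -d '\r' < "$f" > "$f.tmp" && mv "$f.tmp" "$f"   # CRLF -> LF
  # tail -c 2 is empty (after command substitution strips trailing
  # newlines) only when the file already ends with a blank line.
  [ -z "$(tail -c 2 "$f")" ] || printf '\n' >> "$f"
}
```

Usage, once the batchfile is in place: `fix_batchfile COI/backup/data/raw_data/L11/batchfileDADA2.list`.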
Rename all batchfiles and tag files in each library (example run):
mv COI/backup/data/raw_data/L11/RWDK_1_1_tags.txt COI/backup/data/raw_data/L11/tags.txt
mv COI/backup/data/raw_data/L11/RWDK_COI_example_batchfileDADA2.list COI/backup/data/raw_data/L11/batchfileDADA2.list
- Continue for all COI libraries (L12, L13, etc.) by changing the paths and file names in the lines above accordingly (remember the directories for additional sequencing).
- Then do the same for the ITS libraries (L082, L083, etc.).
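The per-library renaming can likewise be scripted. A sketch, assuming the example batchfile has already been copied into each library directory and edited, and that the tag files follow the RWDK_&lt;batch&gt;_&lt;replicate&gt;_tags.txt naming used above:

```shell
# rename_library_files: give the tag file and batchfile in every library
# directory the fixed names (tags.txt, batchfileDADA2.list) that
# MetaBarFlow expects.
rename_library_files() {
  local dir f
  for dir in COI/backup/data/raw_data/L*/ ITS/backup/data/raw_data/L*/; do
    for f in "$dir"RWDK_*_tags.txt; do
      [ -e "$f" ] && mv "$f" "${dir}tags.txt"
    done
    for f in "$dir"RWDK_*batchfileDADA2.list; do
      [ -e "$f" ] && mv "$f" "${dir}batchfileDADA2.list"
    done
  done
}
rename_library_files   # run from root_dir
```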
You should now have two directories, COI and ITS, in the root directory, each containing a directory called backup/data/raw_data with 28 subdirectories (29 in the ITS directory) corresponding to the sequencing libraries for each dataset. Each library directory should include two raw data files (forward/reverse reads), a file for demultiplexing (tags.txt), and a batchfile specifying the input files, the primer sequences, and the minimum read length to use.
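A quick way to verify this layout before starting the pipeline (a sketch; the helper name is ours, not part of MetaBarFlow):

```shell
# check_layout: report library directories that are missing the
# demultiplexing tag file or the DADA2 batchfile.
check_layout() {
  local dir f missing=0
  for dir in COI/backup/data/raw_data/L*/ ITS/backup/data/raw_data/L*/; do
    [ -d "$dir" ] || continue
    for f in tags.txt batchfileDADA2.list; do
      [ -f "$dir$f" ] || { echo "MISSING: $dir$f"; missing=1; }
    done
  done
  return "$missing"
}
check_layout   # run from root_dir; prints nothing when everything is in place
```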
Now the raw sequencing data, batchfiles, and tag files are in the right places for running the MetaBarFlow pipeline found at https://github.com/evaegelyng/MetaBarFlow.
The exact scripts used in the MetaBarFlow pipeline can be found here: https://doi.org/10.5281/zenodo.15296452
Files generated by MetaBarFlow
- The remaining files in the root_dir folder are the files generated by the MetaBarFlow pipeline. These can be imported into R, where any additional analyses can be done; there is one of each file for each dataset (COI/ITS).
*classified.txt - The list of taxonomic assignments of each of the ASVs found in the corresponding dataset (note that this file has not undergone manual taxonomic edits and has not been through subsequent filtering steps listed in the manuscript)
(see https://github.com/evaegelyng/MetaBarFlow for further description)
*classified_corrected.txt - The manually corrected version of the classified list
*DADA2_nochim.otus - The list of ASVs found across all samples, which survived DADA2 and chimera filtering
*DADA2_nochim.table - The overview of which ASVs were found in which samples
Also, there are three files for each dataset (COI/ITS) with the additional data collected in another sampling campaign: the read count table (*read_table_RWEU_campaign.csv), the samples table (*samples_RWEU_campaign.csv), and the MOTU table (*motus_RWEU_campaign.csv).
The dataset consists of DNA reads from high-throughput sequencing of 295 dung samples from cattle and horses collected at five sites in Denmark in 2022. At each of the five sites, 7 samples from cattle and 7 samples from horses (except the NM site, where no horses were present) were collected in February, March, April, June, and August. At each sampling event, a field blank was collected and sequenced alongside the other samples (see the associated manuscript for details). Twenty-two of these samples (10 from each of the ML and SL sites, and two field blanks collected in August) were sequenced alongside (and will also be used in) another project and are thus not part of this raw data; the filtered data from these samples are instead provided as separate files (see the description further below) and should be appended to the main dataset for the subsequent analysis.
DNA was extracted from the samples with the Fast DNA Stool Mini Kit from Qiagen, and amplified by PCR reactions with two primer sets (see PCR reagents and thermal settings, etc. in the associated manuscript). One with the BF-1/BR-1 primers (Elbrecht & Leese, 2017, https://doi.org/10.3389/fenvs.2017.00011), targeting a 217 bp fragment of COI optimized for invertebrates, and one with the ITS2-S2F/ITS4 primers (Fahner et al., 2016, https://doi.org/10.1371/journal.pone.0157505), targeting the nuclear ITS region, and optimized for plants.
During the laboratory pipeline, 32 extraction blanks (sample names including CNE) and four PCR blanks for each amplicon pool (sample names including NTC, 28 in total) were included and sequenced alongside the rest of the samples. Hence, in total, the uploaded raw sequencing data include DNA reads obtained from 373 samples, which were separated into 7 batches. Each batch was used as a template for both primer sets and run through 4 replicate PCR reactions: libraries L11-L74 for COI and L081-L144 for ITS. For one library (L092), additional sequencing was performed, and thus two separate sets of raw data files exist for this library. See README.md for a description of how to treat these in the bioinformatic pipeline.
The raw sequencing data were run through the MetaBarFlow pipeline (https://github.com/evaegelyng/MetaBarFlow) with parameters following Thomassen et al. (2024) (https://doi.org/10.1111/mec.16847); the exact scripts are located here: https://doi.org/10.5281/zenodo.15296452. The pipeline produces an ASV list (*DADA2_nochim.otus), a matrix with read counts of each ASV in each sample (*DADA2_nochim.table), and a list with the taxonomic assignment of all ASVs (*classified.txt) for each dataset. The taxonomic identification of DNA sequencing reads for the ITS dataset was made by blasting (blastn) against a locally downloaded copy of the complete NCBI GenBank nt database (https://www.ncbi.nlm.nih.gov/), and for the COI dataset, blasting was performed against a custom-built COI database containing all COI sequences from BOLD (www.boldsystems.org) and NCBI GenBank (https://www.ncbi.nlm.nih.gov/). See Klepke et al. (2022) (https://doi.org/10.1002/edn3.340) for a description of how the database was built, and the associated publication for details of the BLAST parameters.
The final taxonomic assignment (the score_ID column in "*classified.txt") was defined as the last common ancestor of all BLAST hits within the range of sequence similarity of the best match, including hits within a 2% margin of the best ID; a species-level ID was only assigned if the best match was >98% similar.
The list of taxonomic assignments was manually checked for errors resulting from spurious reference database sequences or similar issues, and when such errors were spotted, the taxonomic assignment of the given ASV was corrected manually. In addition, for COI, ASVs identified at levels higher than species were assigned to "putative species", i.e., units sharing the same set of possible IDs. For ITS, aggregations were made at the genus level. These final, manually edited IDs are found in the "final_ID" columns of the "*classified_corrected.txt" files.
See the manuscript and associated GitHub (https://github.com/emilthomassen/RWDK_public) for further details about subsequent analysis.