Data from: Ghost species form an important component of the epiphytic biota in temperate forests
Data files
Sep 27, 2024 version files 22.76 GB
- I2_A_R1.fastq.gz (1.81 GB)
- I2_A_R2.fastq.gz (1.83 GB)
- I2_B_R1.fastq.gz (2.44 GB)
- I2_B_R2.fastq.gz (2.42 GB)
- I2_C_R1.fastq.gz (877.06 MB)
- I2_C_R2.fastq.gz (1 GB)
- I2_demultiplexing_file.txt (3.16 KB)
- I3_A_demultiplexing_file.txt (1.20 KB)
- I3_A_R1.fastq (1.78 GB)
- I3_A_R2.fastq (1.78 GB)
- I3_B_demultiplexing_file.txt (1.41 KB)
- I3_B_R1.fastq (2.15 GB)
- I3_B_R2.fastq (2.15 GB)
- I3_C_demultiplexing_file.txt (1.32 KB)
- I3_C_R1.fastq (2.25 GB)
- I3_C_R2.fastq (2.25 GB)
- List_of_samples.txt (5.45 KB)
- README.md (3.85 KB)
- SI_Data.xlsx (2.31 MB)
Abstract
Sequencing of environmental samples has great potential for biodiversity research, but its application is limited by the lack of reliable DNA barcode databases for species identification. Such a database has been created for epiphytic lichens of Europe, allowing us to compare the results of environmental sequencing with standard taxonomic surveys. The species undetected by taxonomic surveys (what we term the ghost component) amount to about half of the species actually present in hectare plots of Central European forests. Some of these, which currently occur only as diaspores or weakly developed thalli, are likely to be favoured in the course of global change. The ghost component usually represents a larger fraction in managed forests than in old-growth unmanaged forests. The total species composition of different plots is much more similar than suggested by taxonomic surveys alone. On a regional scale, this supports the well-known statement that "everything is everywhere, but the environment selects".
README: Ghost species form an important component of the epiphytic biota in temperate forests
https://doi.org/10.5061/dryad.fqz612k0g
The dataset contains Illumina amplicon sequencing data for three barcode loci: nuclear ribosomal region ITS1, nuclear ribosomal region ITS2, and mitochondrial ribosomal SSU (mtSSU). The three barcoding loci were PCR-amplified from environmental samples of epiphytic lichen biomass. Sample pools of the individual barcoding loci were sent for Illumina library preparation and sequencing (250 bp paired-end, two independent runs). The resulting reads were identified by BLAST against the sequences of lichen taxa in the Martin7 database (Vondrák J. et al., 2023: Martin7: a reference database of DNA barcodes for European epiphytic lichens and its taxonomic implications. Preslia 95: 311-345).
Description of the data and file structure
The two independent Illumina sequencing runs (I2 and I3) yielded the following fastq data:
I2_A: ITS2 barcode amplified with primers 5.8S-Fun and ITS4-Fun
I2_B: ITS1 barcode amplified with primers ITS1F and ITS2
I2_C: mtSSU barcode amplified with primers mrSSU1 and mrSSU3R
I3_A: ITS2 barcode amplified with primers 5.8S-Fun and ITS4-Fun
I3_B: ITS1 barcode amplified with primers ITS1F and ITS2
I3_C: mtSSU barcode amplified with primers mrSSU1 and mrSSU3R
Each sequencing read starts with a tag sequence followed by the specific primer sequence; forward and reverse reads are not merged. The "R1" and "R2" in the fastq file names indicate forward and reverse reads, respectively. The list of sequenced samples is given in List_of_samples.txt. Unique dual tagging was used to distinguish individual samples, i.e. the same unique tag was added at the 5′ end of both specific primers. The samples included in the individual fastq files, with their corresponding tag sequences, are listed in the following demultiplexing files:
I2_demultiplexing_file.txt (universal for the I2_A, I2_B, and I2_C fastq files; 142 samples were sequenced for all three barcodes, and the same tag was used for all three barcodes)
I3_A_demultiplexing_file.txt (for I3_A fastq files)
I3_B_demultiplexing_file.txt (for I3_B fastq files)
I3_C_demultiplexing_file.txt (for I3_C fastq files)
The I3 data include two additional samples (PO1/5 and BO1/5) sequenced for all three barcodes, together with resequencing of selected samples that had insufficient sequence coverage in the I2 data.
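As a rough illustration of how the dual tagging described above could be used, the following Python sketch assigns a read pair to a sample only when both reads begin with the same tag. The tag sequences, sample names, and the `assign_sample` helper are hypothetical. The layout of the demultiplexing files is not described here, so the sketch takes a plain dictionary instead of parsing them, and it assumes exact tag matches at the very start of each read.

```python
from typing import Dict, Optional


def assign_sample(r1_seq: str, r2_seq: str, tags: Dict[str, str]) -> Optional[str]:
    """Return the sample whose tag starts both reads, or None if no tag fits.

    `tags` maps a tag sequence to a sample name (hypothetical values below).
    Only exact matches at position 0 are accepted; mismatches and shifted
    tags are not handled in this sketch.
    """
    r1_seq, r2_seq = r1_seq.upper(), r2_seq.upper()
    for tag, sample in tags.items():
        if r1_seq.startswith(tag) and r2_seq.startswith(tag):
            return sample
    return None


# Hypothetical tags and sample names, for illustration only.
example_tags = {"ACGTACGT": "sample_1", "TGCATGCA": "sample_2"}

# Forward read: tag + ITS1F primer; reverse read: same tag + ITS2 primer.
r1 = "ACGTACGT" + "CTTGGTCATTTAGAGGAAGTAA" + "NNNN"
r2 = "ACGTACGT" + "GCTGCGTTCTTCATCGATGC" + "NNNN"
print(assign_sample(r1, r2, example_tags))
```

Requiring the tag on both reads discards pairs whose forward and reverse tags disagree. If tags differ in length and one is a prefix of another, this simple loop would be ambiguous and would need a longest-match rule.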
Specific primer sequences:
5.8S-Fun AACTTTYRRCAAYGGATCWCT
ITS4-Fun AGCCTCCGCTTATTGATATGCTTAART
ITS1F CTTGGTCATTTAGAGGAAGTAA
ITS2 GCTGCGTTCTTCATCGATGC
mrSSU1 AGCAGTGAGGAATATTGGTC
mrSSU3R ATGTGGCACGTCTATAGCCC
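Two of the primers above (5.8S-Fun and ITS4-Fun) contain degenerate bases (Y, R, W), so a literal string comparison would miss valid reads. The sketch below, which is only one possible approach and not necessarily the one used to produce this dataset, expands the standard IUPAC ambiguity codes into a regular expression and reports where a primer starts in a read. The helper names `primer_regex` and `find_primer` and the example read are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Primer sequences as listed in this README.
PRIMERS = {
    "5.8S-Fun": "AACTTTYRRCAAYGGATCWCT",
    "ITS4-Fun": "AGCCTCCGCTTATTGATATGCTTAART",
    "ITS1F": "CTTGGTCATTTAGAGGAAGTAA",
    "ITS2": "GCTGCGTTCTTCATCGATGC",
    "mrSSU1": "AGCAGTGAGGAATATTGGTC",
    "mrSSU3R": "ATGTGGCACGTCTATAGCCC",
}


def primer_regex(primer: str) -> "re.Pattern[str]":
    """Compile a primer with IUPAC codes into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def find_primer(read: str, primer_name: str) -> int:
    """Return the 0-based start of the primer in the read, or -1 if absent."""
    match = primer_regex(PRIMERS[primer_name]).search(read.upper())
    return match.start() if match else -1


# Hypothetical read: an 8-base tag followed by one concrete 5.8S-Fun variant.
example_read = "ACGTACGT" + "AACTTTCAGCAACGGATCTCT" + "GGGG"
print(find_primer(example_read, "5.8S-Fun"))
```

The returned position equals the tag length when the read has the expected tag-then-primer layout, which gives a simple consistency check before trimming. Sequencing errors inside the primer are not tolerated by this exact-match pattern.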
The file SI_Data.xlsx is the final output of the analysis and contains the lists of lichen taxa detected by BLAST identification of Illumina reads and by the taxonomic survey:
Analysis of ITS1 and ITS2 barcodes (columns C-LH): Orange columns summarize data from all environmental samples of a given plot and barcode. Red columns summarize data from all plots for a given barcode. Yellow columns summarize data from all environmental samples of a given plot using merged ITS1 and ITS2 data.
Analysis of mtSSU barcode (columns LI-RA): Orange columns summarize data from all environmental samples of a given plot. The red column summarizes data from all plots.
Analysis using merged data of all three barcodes (ITS1, ITS2, and mtSSU; columns RB-WU): Red columns summarize data from all trees of a given plot. The blue column summarizes data from all sampled trees. Green columns summarize data for individual plots.
Analysis of taxonomic survey (columns WV-YO): Blue columns contain the list of taxa. Grey columns summarize the abundance of individual taxa in given plots or trees, respectively. Yellow columns summarize the abundance of individual taxa in all trees of a given plot. The orange column summarizes the abundance of individual taxa from all sampled trees.
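Because the four analysis blocks are identified only by spreadsheet column letters, it can help to translate those letters into numeric positions before slicing the table programmatically. The following Python sketch does that conversion for the ranges given above. The `column_index` and `block_width` helpers and the block labels are hypothetical names chosen for illustration.

```python
def column_index(letters: str) -> int:
    """Convert a spreadsheet column label (e.g. 'C', 'LH') to a 1-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def block_width(first: str, last: str) -> int:
    """Number of columns in the inclusive range first..last."""
    return column_index(last) - column_index(first) + 1


# Column ranges as stated in this README; the labels are informal.
BLOCKS = {
    "ITS1 and ITS2": ("C", "LH"),
    "mtSSU": ("LI", "RA"),
    "all three barcodes merged": ("RB", "WU"),
    "taxonomic survey": ("WV", "YO"),
}

for name, (first, last) in BLOCKS.items():
    print(name, column_index(first), column_index(last), block_width(first, last))
```

If pandas is used to load the file, the same letter ranges can be passed directly, for example `usecols="C:LH"` in `pandas.read_excel`. The sheet name and header rows of SI_Data.xlsx are not documented here and would need to be checked in the file itself.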