Files:
sequences_dataset1_embl_identification.txt
sequences_dataset1_afro-alpine_identification.txt
sequences_dataset2_embl_identification.txt
sequences_dataset2_afro-alpine_identification.txt

These files contain the unique sequences with their taxonomic identification inferred using the program ecotag (OBITools, http://www.grenoble.prabi.fr/trac/OBITools). Sequences were produced by amplification of DNA preserved in sediments from the crater swamp on Mt. Gahinga and the crater lake on Mt. Muhavura, Virunga Volcanoes, Albertine Rift, eastern Africa. DNA was amplified using the trnL-g and trnL-h primers to obtain sequences of the P6 loop of the trnL intron, and amplicons were sequenced using Roche 454 GS FLX Titanium.

All sequences output by the program ecotag are presented in these files. For the final dataset used in further analyses, only sequences identified to at least family level with an identity > 0.95, and which were not identified as possible contaminants, were kept. Taxonomic identification was inferred using two references: the Afro-alpine taxonomic reference library v.1 and a reference database formatted from EMBL.

Details on sample collection, DNA extraction, PCR conditions, and the tools and settings used for filtering and taxonomic identification of the raw sequence data can be found in the associated publication.

Contact author: Sanne Boessenkool (sanneboessenkool@gmail.com)

Column headings:
order_name: name of the identified order
order_taxid: taxonomic identifier (taxid) of the identified order
family_name: name of the identified family
family_taxid: taxonomic identifier (taxid) of the identified family
genus_name: name of the identified genus
genus_taxid: taxonomic identifier (taxid) of the identified genus
species_name: name of the identified species
species_taxid: taxonomic identifier (taxid) of the identified species
scientific_name: final identification based on GenBank (can be an order, family, tribe, genus, species, etc.)
best_identity: best match with the closest sequence in the reference database (GenBank)
taxid: taxonomic identifier of the scientific name
rank: level of identification (order, family, etc.)
count: total number of reads of the sequence
sequence: DNA sequence
sample:xxxxx_S1_gh: number of reads of the relevant sequence in extraction xxxxx

Missing data: NA

Sample information:
Extraction_number  Sample_name          Location      Sample_age
SL001              GM_MUH4_10-11cm      Mt. Muhavura  1980 AD
SL002              GM_MUH4_18-19cm      Mt. Muhavura  1960 AD
SL003              GM_MUH4_26-27cm      Mt. Muhavura  1940 AD
SL004              GM_MUH2-1_75-79cm    Mt. Muhavura  1190 AD
SL005              GM_MUH2-1_130-134cm  Mt. Muhavura  300 AD
SL006              GM_MUH2-2_160-163cm  Mt. Muhavura  510 AD
SL007              GM_MUH2-3_220-223cm  Mt. Muhavura  170 BC
SL008              GM_MUH2-4_245-248cm  Mt. Muhavura  350 BC
SL009              GM_GAH1_11-15cm      Mt. Gahinga   1920 AD
SL010              GM_GAH1_35-38cm      Mt. Gahinga   1810 AD
SL011              GM_GAH2_110-113cm    Mt. Gahinga   1700 AD
SL012              GM_GAH2_158-161cm    Mt. Gahinga   1480 AD
SL013              GM_GAH3_210-212cm    Mt. Gahinga   1250 AD
SL014              GM_GAH3_280-282cm    Mt. Gahinga   950 AD
SL015              GM_GAH7_640-642cm    Mt. Gahinga   2790 BC
SL022              Extraction blank
NTC_SB092P         PCR blank
NTC_SB094P         PCR blank
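
Example (not part of the original analysis):
The filter described above (family-level identification, identity > 0.95) could be applied to one of the identification files roughly as sketched below. This is only an illustration under several assumptions: that the files are tab-separated with a header row, that "identified to at least family level" can be read as family_name not being NA, and that the per-extraction columns follow the sample:xxxxx_S1_gh pattern. The function names and the demo rows are invented for this example, and the contaminant screening is not reproduced because the files carry no contaminant flag; see the associated publication for the actual procedure.

```python
import csv
import io

MISSING = (None, "", "NA")  # "NA" is the documented missing-data code


def keep_row(row, min_identity=0.95):
    """True if the row has a family assignment and best_identity > min_identity.

    Assumption: a non-missing family_name means 'identified to at least
    family level'. Possible contaminants are NOT handled here.
    """
    family = row.get("family_name")
    identity = row.get("best_identity")
    if family in MISSING or identity in MISSING:
        return False
    return float(identity) > min_identity


def filter_identifications(handle, delimiter="\t"):
    """Read an identification table and return the rows that pass keep_row.

    Assumption: the file is tab-separated with one header line.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [row for row in reader if keep_row(row)]


def sample_reads(row):
    """Map extraction label -> read count for columns named 'sample:...'."""
    return {
        key.split(":", 1)[1]: int(value)
        for key, value in row.items()
        if key and key.startswith("sample:") and value not in MISSING
    }


# Synthetic demo rows (invented values, not taken from the dataset).
demo = (
    "family_name\tbest_identity\tcount\tsample:SL001_S1_gh\tsample:SL002_S1_gh\n"
    "FamilyA\t0.98\t12\t7\t5\n"
    "FamilyA\t0.95\t4\t4\t0\n"
    "NA\t1.0\t9\t9\t0\n"
)
kept = filter_identifications(io.StringIO(demo))
for row in kept:
    print(row["family_name"], row["best_identity"], sample_reads(row))
```

To run this on a real file, open it with open(filename, newline="") and pass the handle to filter_identifications; adjust the delimiter if the files turn out not to be tab-separated.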