Metagenome-assembled genomes provide new insight into the microbial diversity of two thermal pools in Kamchatka, Russia

Cite this dataset

Ettinger, Cassandra; Wilkins, Laetitia; Jospin, Guillaume; Eisen, Jonathan (2019). Metagenome-assembled genomes provide new insight into the microbial diversity of two thermal pools in Kamchatka, Russia [Dataset]. Dryad.


Culture-independent methods have contributed substantially to our understanding of global microbial diversity. Recently developed algorithms to construct whole genomes from environmental samples have further refined, corrected and revolutionized understanding of the tree of life. Here, we assembled draft metagenome-assembled genomes (MAGs) from environmental DNA extracted from two hot springs within an active volcanic ecosystem on the Kamchatka peninsula, Russia. This hydrothermal system has been intensively studied previously with regard to geochemistry, chemoautotrophy, microbial isolation, and microbial diversity. We assembled genomes of bacteria and archaea using DNA that had previously been characterized via 16S rRNA gene clone libraries. We recovered 36 MAGs, 29 of medium to high quality, and inferred their placement in a phylogenetic tree consisting of 3,240 publicly available microbial genomes. We highlight MAGs that were taxonomically assigned to groups previously underrepresented in available genome data. This includes several archaea (Korarchaeota, Bathyarchaeota and Aciduliprofundum) and one potentially new species within the bacterial genus Sulfurihydrogenibium. Putative functions in both pools were compared and are discussed in the context of their diverging geochemistry. This study adds comprehensive information about phylogenetic diversity and functional potential within two hot springs in the caldera of Kamchatka.


In this study we focus on two hydrothermal pools, Arkashin Schurf and Zavarzin Spring, in Uzon Caldera, Kamchatka, Russia, which were previously characterized using 16S ribosomal RNA sequencing and geochemical analysis in Burgess et al. 2012 (DOI: 10.1007/s00248-011-9979-4). First, we produced metagenome-assembled genomes (MAGs); we then characterized them and compared them between the two pools. Raw metagenomic reads were deposited in NCBI’s GenBank under BioProject ID PRJNA419931 and BioSample IDs SAMN08105301 and SAMN08105287; i.e., SRA IDs SRS2733204 (SRX3442520) and SRS2733205 (SRX3442521). Draft MAGs were deposited in GenBank under accession numbers SAMN08107294 - SAMN08107329 (BioProject ID PRJNA419931). Here we have included fasta files for all binned MAGs and the associated anvi'o files used to generate them, as well as all supplemental tables associated with the analysis.

The environmental DNA used here is the same DNA that was extracted in Burgess et al. 2012 (DOI: 10.1007/s00248-011-9979-4). In short, Burgess et al. extracted DNA from sediment from Arkashin Schurf, collected in the field in 2004, and from sediment from Zavarzin Spring, collected in 2005. Sequencing library preparations for Solexa3 84 bp paired-end sequencing were performed by the UC Davis Genome Center DNA Technologies Core Facility, where the samples were sequenced on two lanes. Demultiplexed data were quality filtered using bbMap v. 36.99 with the following parameters: qtrim = rl, trimq = 10, minlength = 70. Adaptors were removed and reads were assembled into two metagenomes (one for Arkashin Schurf and one for Zavarzin Spring) using SPAdes v. 3.9.0 with default parameters (DOI: 10.1101/gr.213959.116). Metagenomic data were then binned into metagenome-assembled genomes (MAGs) using anvi’o v. 2.4.0 (DOI: 10.7717/peerj.1319). Here we have included fasta files for all generated MAGs and associated anvi'o files.
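
To make the quality-filtering parameters above concrete, the following is a minimal Python sketch of what settings such as qtrim = rl, trimq = 10 and minlength = 70 mean for a single read. It is a deliberately simplified illustration, not the algorithm implemented in the BBTools software that was actually used, and the function name trim_and_filter is hypothetical. It assumes base qualities are already decoded into integer Phred scores.

```python
def trim_and_filter(seq, quals, trimq=10, minlength=70):
    """Simplified end trimming followed by a length cutoff.

    Assumption: this naive per-base rule only illustrates the meaning of
    the parameters; the real tool uses its own trimming algorithm.
    """
    start, end = 0, len(seq)
    # "l" in qtrim=rl: drop bases below trimq from the left end
    while start < end and quals[start] < trimq:
        start += 1
    # "r" in qtrim=rl: drop bases below trimq from the right end
    while end > start and quals[end - 1] < trimq:
        end -= 1
    # minlength: discard reads that are too short after trimming
    if end - start < minlength:
        return None
    return seq[start:end], quals[start:end]
```

Under this sketch, an 84 bp read that loses five low-quality bases at each end is kept at 74 bp, whereas one that loses ten bases at each end falls below the 70 bp cutoff and is discarded.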

Usage notes

Please see the included CSV file, which provides a key to identify the draft MAGs as well as a summary of individual MAG assembly statistics, quality measurements and taxonomic inferences.
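
For readers who want to recompute basic assembly statistics directly from one of the included MAG fasta files, a minimal Python sketch follows. It is an illustration only and is not the procedure used to produce the values in the CSV. The function names read_fasta and assembly_stats are hypothetical, and N50 is computed here as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def assembly_stats(lines):
    """Return contig count, total length, N50 and GC fraction."""
    lengths, gc = [], 0
    for _, seq in read_fasta(lines):
        seq = seq.upper()
        lengths.append(len(seq))
        gc += seq.count("G") + seq.count("C")
    total = sum(lengths)
    # Walk contigs from longest to shortest until half the assembly is covered
    running, n50 = 0, 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "contigs": len(lengths),
        "total_length": total,
        "n50": n50,
        "gc_fraction": gc / total if total else 0.0,
    }
```

Because assembly_stats accepts any iterable of lines, an open file handle for a MAG fasta file can be passed to it directly; the result can then be compared against the corresponding row of the CSV.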