Unraveling community adaption and survival strategy of soil microbiome under vanadium stress in nationwide mining environments
Data files
Oct 20, 2023 version files (4.77 MB)
- All_Data.zip
- annotation_table_for_keystone_taxa_function.zip
- Carbon_fixation_marker_genes_0928.xlsx
- KO_summary.CSV
- MAG_quality.CSV
- R_Script.zip
- README.md
- Sequence_data.txt
Feb 01, 2024 version files (107.40 GB)
- All_Data.zip
- annotation_table_bin11_Rubrobacter.csv
- annotation_table_bin136_nocardioide.csv
- annotation_table_bin22_nocardioides.csv
- annotation_table_bin267_micromonospora.csv
- annotation_table_bin302_mycobacterium.csv
- annotation_table_bin305_nocardioide.csv
- annotation_table_bin310_mycobacterium.csv
- annotation_table_bin431_steroidobacter.csv
- annotation_table_bin54_FEN1250.csv
- annotation_table_bin6_FEN1250.csv
- annotation_table_bin70_FEN1250.csv
- annotation_table_for_keystone_taxa_function.zip
- Carbon_fixation_marker_genes_0928.xlsx
- fastq_file_name.CSV
- KO_summary.CSV
- L1EFI270346--HSW19.R1.raw.fastq.gz
- L1EFI270346--HSW19.R2.raw.fastq.gz
- L1HFI240421--CC15.R1.raw.fastq.gz
- L1HFI240421--CC15.R2.raw.fastq.gz
- L1HFI240422--CC27.R1.raw.fastq.gz
- L1HFI240422--CC27.R2.raw.fastq.gz
- L1HFI240423--CC34.R1.raw.fastq.gz
- L1HFI240423--CC34.R2.raw.fastq.gz
- L1HFI240424--CC35.R1.raw.fastq.gz
- L1HFI240424--CC35.R2.raw.fastq.gz
- L1HFI240425--CC7.R1.raw.fastq.gz
- L1HFI240425--CC7.R2.raw.fastq.gz
- L1HFI240426--EC14.R1.raw.fastq.gz
- L1HFI240426--EC14.R2.raw.fastq.gz
- L1HFI240427--EC9.R1.raw.fastq.gz
- L1HFI240427--EC9.R2.raw.fastq.gz
- L1HFI240428--NE9.R1.raw.fastq.gz
- L1HFI240428--NE9.R2.raw.fastq.gz
- L1HFI240429--NW18.R1.raw.fastq.gz
- L1HFI240429--NW18.R2.raw.fastq.gz
- L1HFI240430--SC1.R1.raw.fastq.gz
- L1HFI240430--SC1.R2.raw.fastq.gz
- L1HFI240431_L1HFI240431_L1HFI240431_L1HFI240431--SW10.R1.raw.fastq.gz
- L1HFI240431_L1HFI240431_L1HFI240431_L1HFI240431--SW10.R2.raw.fastq.gz
- L1HFI240433--SW8.R1.raw.fastq.gz
- L1HFI240433--SW8.R2.raw.fastq.gz
- MAG_quality.CSV
- R_Script.zip
- README.md
- Sequence_data.txt
Abstract
Soils at vanadium (V) smelters harbor a wide range of microorganisms, whose survival relies on their metabolic activities under stress. Nonetheless, the characteristics and functions of the soil microbiome in V mining environments have not been characterized at a continental scale. This study investigates the microbial diversity, community assembly and metabolic traits of the soil microbiome across 90 V smelters in China. A decrease in alpha diversity is observed, along with community variation, which is also jointly explained by other environmental, climatic and geographic factors. A null model shows that V promotes homogeneous selection. V also mediates co-occurrence patterns, with increased positive interspecific associations under higher V concentrations (>559.6 mg/kg), e.g., among f_Gemmatimonadaceae, Nocardioides, Micromonospora and Rubrobacter. In addition, 67 metagenome-assembled genomes are retrieved via metagenomic analysis. The metabolic pathways of keystone taxa are disentangled to reveal their putative involvement in the V(V) reduction process. Nitrate and nitrite reductase genes (narG, nirK) and mtrABC are found to be taxonomically affiliated with Micromonospora sp., FEN-1250 sp., Nocardioides sp., etc. Additionally, the reverse citric acid cycle (rTCA) serves as the main carbon fixation pathway, synthesizing alternative energy for putative V reducers and highlighting a synergistic relationship between autotrophic and heterotrophic processes that supports microbial survival. Our findings comprehensively reveal the driving forces of soil community variation under V stress and suggest that indigenous microorganisms adopt robust strategies to alleviate V impact, which can be exploited for bioremediation.
README: Unraveling community adaption and survival strategy of soil microbiome under vanadium stress in nationwide mining environments
https://doi.org/10.5061/dryad.6wwpzgn52
Five types of datasets are included herein:
1. Site information (physicochemical measurements)
2. 16S rRNA (community profile at OTU and genus level)
3. Metagenome (functional gene profile, MAG results, annotation tables for keystone taxa, carbon fixation marker genes, naming convention)
4. R scripts
5. FASTQ files of the metagenome sequences used for binning
Description of the data and file structure
The datasets herein serve as the basis for the data analysis in this article. Physicochemical data were obtained from actual measurements and used to uncover the key factors shaping community structure, through models such as variation partitioning analysis (VPA), Mantel’s test, and random forest (RF). The community assembly process was characterized using neutral community modeling (NCM) and a null model (NM) based on 16S rRNA data. All of these models were run with the R scripts, which are also provided here. Functional genes were obtained from metagenomic investigations. 67 qualified metagenome-assembled genomes (MAGs) were also retrieved by metagenomic binning; among these, the keystone taxa were the focus, and their functional potential was annotated.
A detailed description of each uploaded dataset and piece of software is provided below:
1. Physicochemical parameter
Physicochemical parameter measurements of soil samples collected from smelter sites.
2. Sequence data
A link to the 16S rRNA sequence data in the NCBI Sequence Read Archive (SRA) database (http://www.ncbi.nlm.nih.gov/sra).
3. Metagenome data, including:
- KO summary: the functional profile from the metagenomic investigation.
- MAG quality: the results of the binning process, including all qualified metagenome-assembled genomes.
- Annotation tables: the full annotation of functional genes for all keystone genera identified (bin6/11/22/54/70/136/267/302/305/310/431).
- Carbon fixation marker genes: all genes related to carbon fixation and central metabolism for all keystone genera, with the marker genes highlighted.
- All FASTQ data files for the 13 metagenome samples used in the binning process (2 files per sample). The FASTQ files keep their old file names, so a separate CSV file (fastq_file_name.CSV) is attached to provide the new sample names:
- SML-1 (L1HFI240433)
- SML-2 (L1HFI240431)
- SML-3 (L1HFI240425)
- SMM-1 (L1HFI240428)
- SMM-2 (L1HFI240427)
- SMM-3 (L1HFI240429)
- SMM-4 (L1EFI270346)
- SMM-5 (L1HFI240426)
- SMH-1 (L1HFI240422)
- SMH-2 (L1HFI240430)
- SMH-3 (L1HFI240421)
- SMH-4 (L1HFI240423)
- SMH-5 (L1HFI240424)
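The renaming listed above can be applied programmatically. The following is a minimal Python sketch, not part of the deposited R scripts: the mapping is copied from the list above, while the file-name pattern (`<old ID>--<site code>.R1/R2.raw.fastq.gz`) and the helper name `describe_fastq` are assumptions inferred from the uploaded file names.

```python
import re

# Old sequencing IDs -> new sample names, copied from the list above.
OLD_TO_NEW = {
    "L1HFI240433": "SML-1", "L1HFI240431": "SML-2", "L1HFI240425": "SML-3",
    "L1HFI240428": "SMM-1", "L1HFI240427": "SMM-2", "L1HFI240429": "SMM-3",
    "L1EFI270346": "SMM-4", "L1HFI240426": "SMM-5", "L1HFI240422": "SMH-1",
    "L1HFI240430": "SMH-2", "L1HFI240421": "SMH-3", "L1HFI240423": "SMH-4",
    "L1HFI240424": "SMH-5",
}

# Assumed pattern, inferred from the uploaded file names:
# <old ID, possibly repeated with "_">--<site code>.R<1|2>.raw.fastq.gz
_NAME_RE = re.compile(
    r"^(?P<ids>[A-Za-z0-9_]+)--(?P<site>[A-Za-z0-9]+)\.R(?P<read>[12])\.raw\.fastq\.gz$"
)


def describe_fastq(file_name):
    """Return the new sample name, old ID, site code and read number (hypothetical helper)."""
    match = _NAME_RE.match(file_name)
    if match is None:
        raise ValueError(f"unexpected FASTQ file name: {file_name}")
    # One file name repeats the same ID four times, joined by "_"; keep the first.
    old_id = match.group("ids").split("_")[0]
    return {
        "sample": OLD_TO_NEW[old_id],
        "old_id": old_id,
        "site": match.group("site"),
        "read": int(match.group("read")),
    }


if __name__ == "__main__":
    print(describe_fastq("L1EFI270346--HSW19.R1.raw.fastq.gz"))
```

The authoritative mapping remains fastq_file_name.CSV; the dictionary here only mirrors the README list so the example is self-contained.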
4. R scripts (software), including:
- RF_0130: random forest analysis using the "randomForest" function to support the creation of sample groups by screening the important driving factors of community structure, based on the percentage increase in MSE (mean squared error).
- VPA1: variation partitioning analysis using the "vegan" package to quantify the relative contribution of explanatory variables to community variation.
- Mantel7: Mantel's test using the "vegan" package to associate influencing variables with community structure and disentangle the interactive patterns between individual determinants.
- NCM_0111: uses the R packages “Hmisc”, “stats4” and “minpack.lm” to predict the relationship between OTU detection frequency and relative abundance.
- bNTI_Rscript_0116: uses “picante” to calculate the standardized effect size of the mean nearest taxon distance (SES.MNTD) and the β mean nearest taxon distance (βMNTD).
- Beta_RC: uses the “raup_crick.dist” function to partition the relative influence of stochastic processes (dispersal limitation, undominated, homogenizing dispersal).
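To show how the outputs of the last two scripts are typically combined, here is a minimal Python sketch of the pairwise classification step. It is not the authors' R code. The cutoffs (|βNTI| > 2 for selection, |RC| > 0.95 for dispersal-related processes) are the conventional values of the widely used null-model framework and are an assumption here, since this README does not state which cutoffs the scripts apply. The function names are hypothetical.

```python
from collections import Counter


def classify_assembly(beta_nti, rc, nti_cutoff=2.0, rc_cutoff=0.95):
    """Assign one pairwise comparison to an assembly process.

    Cutoffs are the conventional defaults (an assumption, not taken
    from the deposited scripts).
    """
    # Deterministic processes are decided by beta-NTI alone.
    if beta_nti < -nti_cutoff:
        return "homogeneous selection"
    if beta_nti > nti_cutoff:
        return "heterogeneous selection"
    # Otherwise (|beta-NTI| <= cutoff) the Raup-Crick value separates
    # the stochastic processes.
    if rc > rc_cutoff:
        return "dispersal limitation"
    if rc < -rc_cutoff:
        return "homogenizing dispersal"
    return "undominated"


def summarize(pairs):
    """Return the fraction of pairwise comparisons assigned to each process."""
    counts = Counter(classify_assembly(b, r) for b, r in pairs)
    total = sum(counts.values())
    return {process: n / total for process, n in counts.items()}


if __name__ == "__main__":
    # Made-up (beta-NTI, RC) values, for illustration only.
    demo = [(-2.6, 0.10), (0.4, 0.98), (1.1, -0.97), (-0.3, 0.20)]
    print(summarize(demo))
```

The values in `demo` are invented for illustration and are not results from this dataset.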
Sharing/Access information
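Links to other publicly accessible locations of the data:
- NCBI Sequence Read Archive (SRA), for the 16S rRNA sequence data: http://www.ncbi.nlm.nih.gov/sra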
Code/Software
List of software and R script used for data analysis:
R scripts:
- The R (version 4.2.1) package “randomForest” was used to perform the random forest analysis.
- The R (version 4.2.1) package “vegan” was used to conduct Mantel’s test and the variation partitioning analysis.
- The R (version 4.2.1) packages “Hmisc”, “stats4” and “minpack.lm” were used to examine the influence of neutral processes on community assembly.
- The R (version 4.2.1) package “picante” and the “raup_crick.dist” function were used to characterize the community assembly process.
All of the abovementioned scripts are provided herein.
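For readers who want to see what the Mantel test listed above computes, the following is a minimal, standard-library Python sketch of a permutation Mantel test with Pearson correlation. It is an illustration under stated assumptions, not the deposited Mantel7 script: the function name `mantel`, the default of 999 permutations, and the one-sided test are choices made for this example, and the toy matrices are invented.

```python
import random
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle values of a square matrix after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p) for two symmetric distance matrices given as nested lists.

    p is the one-sided permutation p-value: the share of permutations whose
    correlation is at least as large as the observed one.
    """
    identity = list(range(len(d1)))
    x = _upper(d1, identity)
    r_obs = _pearson(x, _upper(d2, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of d2 together
        if _pearson(x, _upper(d2, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Toy data: distances between five points on a line, and the same doubled.
    pts = [0.0, 1.0, 3.0, 7.0, 12.0]
    d_env = [[abs(a - b) for b in pts] for a in pts]
    d_comm = [[2 * v for v in row] for row in d_env]
    print(mantel(d_env, d_comm))
```

In practice the first matrix would hold community dissimilarities and the second environmental or geographic distances between the same samples.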