Genome sequence and characterization of a freshwater photoarsenotroph, Cereibacter azotoformans strain ORIO, isolated from sediments capable of cyclic light-dark arsenic oxidation and reduction
Data files
Oct 31, 2023 version files 8.16 KB
- fig2AB_data_dryad.csv
- fig2C_data_dryad.csv
- fig4_data_dryad.csv
- README.md
Abstract
A freshwater photosynthetic arsenite-oxidizing bacterium, Cereibacter azotoformans strain ORIO, was isolated from Owens River, CA, USA. The waters of Owens River are elevated in arsenic and serve as the headwaters to the Los Angeles Aqueduct. The complete genome of strain ORIO is 4.8 Mb (68% G+C content) and comprises 2 chromosomes and 6 plasmids. Taxonomic analysis placed ORIO within the Cereibacter genus (formerly Rhodobacter). The ORIO genome contains arxB2AB1CD (encoding an arsenite oxidase), arxXSR (regulators), and several ars arsenic resistance genes, all co-localized on a 136 kb plasmid named pORIO3. Phylogenetic analysis of ArxA, the molybdenum-containing arsenite oxidase catalytic subunit, demonstrated that photoarsenotrophy is likely to occur within members of the Alphaproteobacteria. ORIO is a mixotroph, oxidizes arsenite to arsenate photoheterotrophically, and expresses arxA in cultures grown with arsenite. Further ecophysiology studies with Owens River sediment demonstrated that the interconversion of arsenite and arsenate was dependent on light-dark cycling. arxA and arrA (arsenate respiratory reductase) genes were detected in the light-dark cycled sediment metagenomes, suggesting syntrophic interactions among arsenotrophs. This work establishes Cereibacter azotoformans str. ORIO as a new model organism for studying photoarsenotrophy and light-dark arsenic biogeochemical cycling.
README: Genome sequence and characterization of a freshwater photoarsenotroph, Cereibacter azotoformans strain ORIO
Dataset contents include:
Sediment metagenomes from Owens River, an arsenic-rich freshwater environment in California (USA):
- https://www.ncbi.nlm.nih.gov/bioproject/814312: genome sequencing and assembly for 12 MAGs, plus 3 SRA records of the Oxford Nanopore MinION raw data.
The Cereibacter azotoformans str. ORIO genome sequencing assembly and raw data:
- https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_022227035.1/: all the information regarding the ORIO genome can be found through this link at the NCBI. The RefSeq accession is GCF_022227035.1. The BioSample ID is SAMN20253214. The SRA is SAMN20253214.
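Summary statistics such as replicon count, total length, and G+C content can be rechecked from a FASTA file downloaded through the link above. The following is a minimal sketch using only the Python standard library; the function names `read_fasta` and `gc_percent` are illustrative, and the toy sequences are placeholders, not ORIO sequence.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def gc_percent(sequences):
    """Return G+C as a percentage of unambiguous (A/C/G/T) bases."""
    gc = at = 0
    for seq in sequences:
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


# Toy two-record FASTA standing in for a downloaded assembly (placeholder data).
toy_fasta = StringIO(">replicon_1\nGGCCGGCA\nTTGC\n>replicon_2\nATGCNNGC\n")
records = list(read_fasta(toy_fasta))
genome_gc = gc_percent(seq for _, seq in records)
total_bases = sum(len(seq) for _, seq in records)
print(f"{len(records)} replicons, {total_bases} bp, {genome_gc:.1f}% G+C")
```

For a real assembly, the `StringIO` object would be replaced by an open file handle on the downloaded FASTA file. Ambiguous bases such as N are excluded from the denominator here, which is one of several possible conventions.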
Growth curve data:
- fig2AB_data_dryad.csv: contains the mean OD600nm values of a time course growth experiment conducted with Cereibacter azotoformans strain ORIO, testing the effects of either arsenate or arsenite on aerobic growth. The CSV file is read into the Jupyter notebook (fig_2_growth_curve_dryad.ipynb) that renders Figure 2AB of the manuscript.
- fig2C_data_dryad.csv: contains the OD600nm values of a time course growth experiment testing how increasing levels of arsenite affect the anaerobic growth of Cereibacter azotoformans strain ORIO. The CSV file is read into the Jupyter notebook (fig_2_growth_curve_dryad.ipynb) that renders Figure 2C of the manuscript.
- fig4_data_dryad.csv: contains the data used to produce Figure 4 of the manuscript. The microcosm experiment was run with three replicates containing Owens River sediments and Little Hot Creek water spiked with 50 µM arsenite. The three microcosms were first incubated in the dark, then shifted into the light, then shifted back to the dark, and finally shifted into the light again. Over the time course of light shifting, arsenate and arsenite were measured by HPLC-ICPMS in each of the three replicate microcosms. The Jupyter notebook fig_4_light_dark_cycling_dryad.ipynb can be used with fig4_data_dryad.csv to generate the plot shown in Figure 4 of the manuscript.
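As an illustration of how the triplicate microcosm measurements might be summarized before plotting, the sketch below averages arsenite and arsenate across replicates at each time point. It is not the code in the provided notebook. The column names (`time_h`, `phase`, `replicate`, `arsenite_uM`, `arsenate_uM`) are assumptions about the file layout, and the inline values are invented placeholders, not measurements.

```python
import csv
from collections import defaultdict
from io import StringIO
from statistics import mean, stdev

# Hypothetical layout and made-up values; the real header of
# fig4_data_dryad.csv may differ and should be checked first.
sample_csv = """time_h,phase,replicate,arsenite_uM,arsenate_uM
0,dark,1,50.0,0.0
0,dark,2,49.0,1.0
0,dark,3,51.0,0.5
24,light,1,20.0,30.0
24,light,2,22.0,28.0
24,light,3,18.0,32.0
"""


def summarize_replicates(handle, species=("arsenite_uM", "arsenate_uM")):
    """Return {time: {species: (mean, sample_sd)}} across replicate microcosms."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(handle):
        t = float(row["time_h"])
        for name in species:
            grouped[t][name].append(float(row[name]))
    summary = {}
    for t in sorted(grouped):
        summary[t] = {
            # stdev needs at least two values; fall back to 0.0 otherwise.
            name: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
            for name, vals in grouped[t].items()
        }
    return summary


summary = summarize_replicates(StringIO(sample_csv))
for t, stats in summary.items():
    for name, (avg, sd) in stats.items():
        print(f"t={t:g} h  {name}: {avg:.1f} +/- {sd:.1f}")
```

With the real file, `StringIO(sample_csv)` would be replaced by `open("fig4_data_dryad.csv", newline="")`, and the column names adjusted to match the actual header.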
Sharing/Access information
Links to publicly accessible locations of the various genome sequencing data (as described above):
- https://www.ncbi.nlm.nih.gov/bioproject/PRJNA814312
- https://www.ncbi.nlm.nih.gov/bioproject/808117
- https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_022227035.1/
Data was derived from the following sources:
- Laboratory experiments conducted in the Saltikov lab at UC Santa Cruz
Code/Software
Python Jupyter notebooks are provided for generating the plots in the figures:
- fig_2_growth_curve_dryad.ipynb
- fig_4_light_dark_cycling_dryad.ipynb
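For readers who want to inspect the growth-curve data outside the notebooks, the following is a minimal sketch of loading an OD600 time course into per-condition series. It does not reproduce the notebook code. The column names (`time_h`, `condition`, `od600`) and the inline values are hypothetical placeholders; the actual CSV headers should be checked first.

```python
import csv
from collections import defaultdict
from io import StringIO

# Hypothetical layout and made-up values standing in for a growth-curve CSV.
sample_csv = """time_h,condition,od600
0,control,0.05
0,arsenite,0.05
6,control,0.20
6,arsenite,0.12
12,control,0.60
12,arsenite,0.30
"""


def load_growth_curves(handle):
    """Return {condition: [(time_h, od600), ...]} with points sorted by time."""
    curves = defaultdict(list)
    for row in csv.DictReader(handle):
        point = (float(row["time_h"]), float(row["od600"]))
        curves[row["condition"]].append(point)
    return {cond: sorted(points) for cond, points in curves.items()}


curves = load_growth_curves(StringIO(sample_csv))
for cond, points in curves.items():
    times, ods = zip(*points)
    print(cond, times, ods)
```

Each `(times, ods)` pair can then be passed to a plotting library of choice, one line per condition.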
Methods
The published paper describes data generation and collection. Briefly, the methods include:
- Enrichment culturing of anoxygenic photosynthetic microbes
- Growth curve analyses
- Arsenic analyses by HPLC-ICPMS (Thermo X-Series2 ICP-MS and Hamilton PRP-X-100 HPLC column for arsenic speciation)
- Genomic DNA sequencing by MiSeq and Oxford Nanopore
- Metagenomic DNA sequencing by Oxford Nanopore
Usage notes
The manuscript describes specific workflows and pipelines. Briefly, the software includes:
- Oxford Nanopore Guppy v5.0.17
- Assembly using PATRIC, now called the Bacterial and Viral Bioinformatics Resource Center (BV-BRC), https://www.bv-brc.org/
- ToulligQC v2.1.1
- Racon
- Phylogeny using a custom R script available at https://github.com/csaltikov/arsenotrophy_phylogeny
- Various Python scripts for producing the data visualizations in the manuscript