CRISPR-Cas system of a prevalent human gut bacterium reveals hyper-targeting against phages in a human virome catalog
Data files
Feb 26, 2021 version files (6.46 GB total)
- HuVirDB-1.0.fasta.gz (6.46 GB)
- README (929 B)
Abstract
Bacteriophages are abundant within the human gastrointestinal tract, yet their interactions with gut bacteria remain poorly understood, particularly with respect to CRISPR-Cas immunity. Here, we show that the type I-C CRISPR-Cas system in the prevalent gut Actinobacterium Eggerthella lenta is transcribed and sufficient for specific targeting of foreign and chromosomal DNA. Comparative analyses of E. lenta CRISPR-Cas systems across (meta)genomes revealed 2 distinct clades according to cas sequence similarity and spacer content. We assembled a human virome database (HuVirDB), encompassing 1,831 samples enriched for viral DNA, to identify protospacers. This revealed matches for a majority of spacers, a marked increase over other databases, and uncovered “hyper-targeted” phage sequences containing multiple protospacers targeted by several E. lenta strains. Finally, we determined the positional mismatch tolerance of observed spacer-protospacer pairs. This work emphasizes the utility of merging computational and experimental approaches for determining the function and targets of CRISPR-Cas systems.
Methods
This dataset is the result of collecting and assembling human metagenomic virome sequencing data for the purposes of meta-analysis.
Usage notes
All scaffolds are identified by their SampleID, with metadata available at the associated GitHub repository: https://github.com/jbisanz/HuVirDB.
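As a rough illustration of working with the file, the sketch below streams HuVirDB-1.0.fasta.gz with the Python standard library and counts scaffolds per sample. The exact header layout is not documented here, so the code assumes, purely for illustration, that each FASTA header starts with the SampleID followed by an underscore and a scaffold suffix. Check the README and the repository metadata and adjust the hypothetical `sep` parameter before relying on the counts. The function names are likewise illustrative and not part of the dataset.

```python
import gzip
from collections import Counter
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sample_id_of(header: str, sep: str = "_") -> str:
    """Extract the SampleID from a scaffold header.

    ASSUMPTION: the header's first whitespace-delimited token has the form
    <SampleID><sep><scaffold suffix>. This layout is a guess, not confirmed
    by the dataset description.
    """
    fields = header.split()
    if not fields:
        return ""
    return fields[0].split(sep, 1)[0]


def scaffolds_per_sample(path: str, sep: str = "_") -> Counter:
    """Count scaffolds per SampleID without loading the whole file in memory."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for header, _sequence in read_fasta(handle):
            counts[sample_id_of(header, sep)] += 1
    return counts
```

Calling `scaffolds_per_sample("HuVirDB-1.0.fasta.gz")` returns a Counter keyed by the inferred SampleID, which can then be joined to the metadata table in the GitHub repository. Streaming record by record keeps memory use low for a 6.46 GB archive.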