Dryad

Prophage-DB: A comprehensive database to explore diversity, distribution, and ecology of prophages

Data files

Jun 27, 2024 version files 1.49 GB
Jul 18, 2024 version files 2.49 GB

Abstract

Background:

Viruses that infect prokaryotes (phages) constitute the most abundant group of biological agents and play pivotal roles in microbial systems. They are known to impact microbial community dynamics, microbial ecology, and evolution. Efforts to document the diversity, host range, infection dynamics, and effects of phage infection on host cell metabolism remain preliminary. Some phages adopt the lysogenic mode of infection, in which the phage genome integrates into the host cell genome, forming a prophage. Prophages enable viral genome replication without host cell lysis and often contribute novel and beneficial traits to the host genome. Despite their importance, prophages remain understudied: current phage research focuses predominantly on lytic phages, leaving a significant gap in knowledge of prophage biology, diversity, and ecological roles.

Results:

To bridge this gap, we created Prophage-DB, a prophage database. To build it, we identified lysogenic viruses in genomes from three publicly available databases. Our pipeline applies several state-of-the-art tools to annotate these viruses, cluster them, classify them taxonomically, and detect their auxiliary metabolic genes (AMGs). With this approach, we identified over 350,000 prophages and 35,000 auxiliary metabolic genes.

Conclusion:

By summarizing the collected information, we have created a database with extensive metadata on phage and host taxonomy, host information, and auxiliary metabolic genes. We identified numerous phages from a wide variety of archaeal and bacterial hosts, spanning a wide environmental distribution. In addition, the identified auxiliary metabolic genes, placed in the context of this study, will improve our understanding of such genes. We expect this comprehensive prophage database to be a valuable resource for advancing prophage research, offering insights into viral taxonomy, host relationships, auxiliary metabolic genes, and environmental distribution. Its use promises to contribute towards understanding microbial ecosystems and unlocking the mysteries of microbial dark matter.