Data from: Genomic insights reveal zoonotic potential of Morganella morganii strains from diarrheal patients in Dhaka, Bangladesh
Data files
Jan 09, 2026 version files (138.71 KB)
- Morganella_Supplementary_Files.zip (129.85 KB)
- README.md (8.86 KB)
Abstract
Morganella morganii, a Gram-negative opportunistic pathogen, is increasingly recognized as a cause of nosocomial and community-acquired infections; however, comprehensive genomic studies from Bangladesh remain limited. In this study, seven M. morganii isolates from diarrheal stool samples and three from environmental sources in Dhaka were subjected to whole-genome sequencing to investigate their genomic diversity, antimicrobial resistance (AMR) profiles, virulence factors, and host-pathogen protein-protein interactions (HP-PPIs). Genome sizes ranged from 3.75 to 4.09 Mbp, with an average GC content of ~51%. AMR analysis revealed a diverse array of resistance genes, including β-lactamases (blaCTX-M-15, blaOXA-1), aminoglycoside-modifying enzymes (aadA5, aph(6)-Id, aac(6’)-Ib-cr), fluoroquinolone resistance genes (qnrB4), sulfonamide resistance genes (sul1, sul2), and tetracycline resistance determinants (tet(A), tet(D)). Interolog-based predictions identified 3,920 potential HP-PPIs, including bacterial transketolase interacting with human NF-κB (NFKB1), suggesting immunomodulatory capabilities. Importantly, phylogenomic analysis revealed clustering of clinical isolates with strains from animal and environmental sources, indicating potential zoonotic transmission of M. morganii. This study reveals key genomic and phenotypic features of M. morganii in Bangladesh and underscores the importance of surveillance and targeted control strategies for this emerging pathogen.
Dataset DOI: 10.5061/dryad.tht76hfc5
Overview
This repository contains datasets supporting the study “Genomic insights reveal zoonotic potential of Morganella morganii strains from diarrheal patients in Dhaka, Bangladesh.” We sequenced and analyzed M. morganii isolates collected in Dhaka, Bangladesh, from clinical diarrheal stool samples and environmental sources. The files include isolate metadata, antimicrobial susceptibility testing (AST) results, antimicrobial resistance (AMR) gene calls, virulence gene hits, average nucleotide identity (ANI) results, predicted human pathogenic capacity, mobile genetic element outputs, predicted genomic islands, and functional annotation tables (eggNOG/COG categories).
Important note about formats: All tabular data are provided as CSV (comma-separated values) to maximize accessibility and reusability.
Files and variables
Files included
- Supplementary_File_1__*.csv — isolate metadata, AST results, AMR gene lists, sequencing metrics.
- Supplementary_File_2__*.csv — ANI (FastANI), comparative public genome metadata, PathogenFinder outputs.
- Supplementary_File_3__*.csv — virulence gene hits and biofilm assay results.
- Supplementary_File_4__*.csv — integrons, insertion sequences, genomic islands, composite transposon/IS hits, MobileElementFinder summary.
- Supplementary_File_5__<COG_letter>.csv — eggNOG-mapper functional annotation tables, grouped by COG functional category.
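Because every table is a CSV inside a single archive, the files can be read without unpacking anything. The sketch below uses only the Python standard library; the helper names `list_tables` and `read_table` are illustrative and not part of the dataset, and UTF-8 encoding of the CSVs is an assumption.

```python
import csv
import io
import zipfile


def list_tables(archive):
    """Return the sorted names of all CSV members in the archive."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(n for n in zf.namelist() if n.lower().endswith(".csv"))


def read_table(archive, member):
    """Read one CSV member into a list of dicts keyed by its header row."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as fh:
            # "utf-8-sig" tolerates an optional byte-order mark (assumed encoding).
            text = io.TextIOWrapper(fh, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))
```

For example, `list_tables("Morganella_Supplementary_Files.zip")` would return the CSV names, any of which can then be passed to `read_table`.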
Common identifiers used across files
- Isolate / sample identifiers (e.g., NGCRVN-02, NGEE-862A) are used consistently across files.
- For gene-level outputs, contig/replicon identifiers may appear (e.g., NGCRVN-02_contig_1).
Missing data codes and conventions
- Empty cells or NA indicate not available or not detected (depending on context).
- Gene lists in some tables are stored as semicolon-separated or comma-separated text within a single cell.
- Coordinates are reported as output by the respective tools (typically 1-based and inclusive).
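The conventions above suggest normalizing multi-gene cells before analysis. The following is a minimal sketch, assuming that gene symbols themselves never contain a comma or semicolon and that only empty cells and `NA` mark missing values; `split_gene_cell` is a hypothetical helper name.

```python
import re

# Values treated as "not available / not detected" (per the conventions above).
MISSING = {"", "NA"}


def split_gene_cell(cell):
    """Split a semicolon- or comma-separated cell into a clean list of genes."""
    if cell is None:
        return []
    text = cell.strip()
    if text in MISSING:
        return []
    # Split on either delimiter and drop empty fragments and stray whitespace.
    return [part.strip() for part in re.split(r"[;,]", text) if part.strip()]
```

Whether an empty result means "not tested" or "not detected" still depends on the table, as noted above.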
Abbreviations
- AMR: antimicrobial resistance
- AST: antimicrobial susceptibility testing
- ANI: average nucleotide identity
- VFDB: Virulence Factor Database
- IS: insertion sequence
- GI: genomic island
- COG: Clusters of Orthologous Groups
File-by-file descriptions and data dictionaries
Supplementary File 1 tables (phenotypes, metadata, basic metrics)
Supplementary_File_1__S1_Morganella_AST.csv
Description: Disc diffusion AST results (presence/absence) for each isolate.
Supplementary_File_1__S2_Morganella_AMR_Genes.csv
Description: Gene-level AMR calls per isolate, including β-lactamases, aminoglycoside-modifying enzymes, fluoroquinolone determinants, sulfonamide, and tetracycline resistance markers.
Supplementary_File_1__S3_Source_Date.csv
Description: Isolate-level metadata (source and collection date).
Supplementary_File_1__S4_Sequencing_Metrics.csv
Description: Per-sample sequencing / coverage summary metrics.
Supplementary File 2 tables (ANI + comparative context)
Supplementary_File_2__S1_FASTANI.csv
Description: Pairwise ANI results for the ten study isolates against the reference genome GCF_002968775; columns include Genome, FastANI Score (%), and Reference Genome.
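As a rough illustration of how this table might be used, the sketch below flags each genome against an ANI cutoff. The 95% value is the commonly used species-level threshold, not one stated in this README, and `flag_same_species` is a hypothetical helper; the column names are those listed above.

```python
# Commonly used species-level ANI threshold (an assumption, not from the dataset).
SPECIES_ANI_CUTOFF = 95.0


def flag_same_species(rows, cutoff=SPECIES_ANI_CUTOFF):
    """Map each genome to True if its FastANI score meets the cutoff.

    `rows` is an iterable of dicts, e.g. from csv.DictReader, with the
    columns "Genome" and "FastANI Score (%)".
    """
    flags = {}
    for row in rows:
        score = float(row["FastANI Score (%)"])
        flags[row["Genome"]] = score >= cutoff
    return flags
```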
Supplementary_File_2__S2_Publicly_Available_M_morgan.csv
Description: Metadata table for the 90 comparative public M. morganii genomes from 18 countries; fields include Assembly_Accession, Strain, Isolation_Source, Country, Host, and Host_Disease (where available).
Supplementary_File_2__S3_PathogenFinder.csv
Description: Predicted human pathogenic capacity for each genome based on PathogenFinder.
Supplementary File 3 tables (virulence + biofilm)
Supplementary_File_3__S1_Virulence_Genes.csv
Description: Presence/absence matrix of virulence determinants per strain, including locus tags, gene symbols, and functional categories.
Supplementary_File_3__S2_Biofilm_Results.csv
Description: Summary of microtiter plate biofilm assay results.
Supplementary File 4 tables (mobilome, integrons, islands, transposon/IS)
Supplementary_File_4__S1_Integron_Finder.csv
Description: IntegronFinder output summarizing integron and attC elements.
Supplementary_File_4__S2_ISE_Scan.csv
Description: ISEScan output for insertion sequence (IS) predictions.
Supplementary_File_4__S3_Genomic_Island.csv
Description: Predicted genomic island content (IslandViewer4-style output). Each row corresponds to a gene within a predicted genomic island; island boundaries repeat for all genes within the same island.
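Since island boundaries repeat on every gene row, a per-island view requires collapsing the table. The sketch below shows one way to do that; the column names (`Isolate`, `Island_start`, `Island_end`, `Gene`) are placeholders chosen for illustration and should be replaced with the actual headers in the file. The length calculation assumes 1-based inclusive coordinates.

```python
def collapse_islands(rows, isolate_col="Isolate", start_col="Island_start",
                     end_col="Island_end", gene_col="Gene"):
    """Group gene-level rows into one record per predicted genomic island.

    All column names are hypothetical defaults; pass the real headers.
    """
    islands = {}
    for row in rows:
        key = (row[isolate_col], int(row[start_col]), int(row[end_col]))
        islands.setdefault(key, []).append(row[gene_col])
    return [
        {
            "isolate": isolate,
            "start": start,
            "end": end,
            # 1-based inclusive coordinates, so length is end - start + 1.
            "length_bp": end - start + 1,
            "genes": genes,
        }
        for (isolate, start, end), genes in islands.items()
    ]
```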
Supplementary_File_4__S4_Composite_Transposon_Tn_F.csv
Description: Composite transposon / insertion-sequence alignment summary (derived from tool output). Column names were added for clarity.
Supplementary_File_4__S5_Mobile_Element_Finder.csv
Description: MobileElementFinder summary per isolate.
Supplementary File 5 tables (eggNOG functional annotations by COG category)
Files:
Supplementary_File_5__D.csv, Supplementary_File_5__F.csv, Supplementary_File_5__H.csv, Supplementary_File_5__I.csv, Supplementary_File_5__J.csv, Supplementary_File_5__O.csv, Supplementary_File_5__Q.csv, Supplementary_File_5__T.csv, Supplementary_File_5__V.csv
Description: eggNOG-mapper v2 annotation outputs for selected genes, grouped by COG functional category letter. These files summarize the comparative pan-genome and COG enrichment analyses of M. morganii isolates, highlighting accessory genes unique to diarrheal strains. They include functional annotations based on eggNOG-mapper v2 and statistical enrichment results (GeneRatio, p-values, FDR, and log₂ odds ratios), revealing key defense- and adaptation-related functional groups enriched in the diarrheal accessory genome.
COG category letters used here
- D: Cell cycle control, cell division, chromosome partitioning
- F: Nucleotide transport and metabolism
- H: Coenzyme transport and metabolism
- I: Lipid transport and metabolism
- J: Translation, ribosomal structure and biogenesis
- O: Posttranslational modification, protein turnover, chaperones
- Q: Secondary metabolites biosynthesis, transport and catabolism
- T: Signal transduction mechanisms
- V: Defense mechanisms
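The letter-to-category mapping above can be kept alongside the file naming pattern so that each Supplementary File 5 table is labeled when loaded. This is a small sketch; `cog_table_name` is an illustrative helper, not something shipped with the dataset.

```python
# COG category letters and names exactly as listed in this README.
COG_CATEGORIES = {
    "D": "Cell cycle control, cell division, chromosome partitioning",
    "F": "Nucleotide transport and metabolism",
    "H": "Coenzyme transport and metabolism",
    "I": "Lipid transport and metabolism",
    "J": "Translation, ribosomal structure and biogenesis",
    "O": "Posttranslational modification, protein turnover, chaperones",
    "Q": "Secondary metabolites biosynthesis, transport and catabolism",
    "T": "Signal transduction mechanisms",
    "V": "Defense mechanisms",
}


def cog_table_name(letter):
    """Return the Supplementary File 5 file name for a COG category letter."""
    letter = letter.upper()
    if letter not in COG_CATEGORIES:
        raise KeyError(f"No Supplementary File 5 table for COG category {letter!r}")
    return f"Supplementary_File_5__{letter}.csv"
```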
Code/software
The datasets were generated using open-source bioinformatics tools:
Read QC, assembly & coverage.
- fastp v0.23.4; FastQC v0.12.0
- SPAdes v3.15.0 (de novo assembly)
- minimap2 v2.28, samtools v1.19.2 (read mapping & BAM handling)
- mosdepth v0.3.6 (depth/coverage)
Annotation & species confirmation.
- Prokka v1.14.6 (genome annotation)
- KmerFinder v3.2; FastANI (species/ANI confirmation)
AMR, virulence & mobilome profiling.
- ResFinder v4.7.2; CARD (database 2024‑Dec‑15); ARG‑ANNOT (database 2024‑Dec‑15)
- VFDB (core dataset update 2024)
- MobileElementFinder v1.0.3; PlasmidFinder v2.1.6
- ISEScan v1.7.2.3 (IS elements); IslandViewer 4 (genomic islands)
- IntegronFinder v2.0.6; TnComp_finder (transposons)
- CRISPRCasFinder (CRISPR–Cas loci)
Phylogenomics & SNP analysis.
- Snippy v4.6.0 (core‑SNP calling)
- IQ‑TREE v3.0.1 (maximum‑likelihood phylogeny)
Pan‑genome & functional enrichment.
- Roary v3.13.0 (pan‑genome)
- eggNOG‑mapper v2 (functional annotation)
- clusterProfiler v4.12.6 (COG/KEGG enrichment)
Host–pathogen protein–protein interactions (HP‑PPIs).
Interolog‑based predictions mapped M. morganii proteins to known human interactors curated in PHISTO. Protein sets were redundancy‑reduced with MMseqs2 v13.45111 (easy‑cluster, min ID 0.9, coverage mode 1, cluster coverage 0.8). Cluster representatives were aligned to PHISTO bacterial proteins with BLAST+ v2.9.0‑2 using thresholds of query coverage ≥ 70 %, identity ≥ 30 %, e‑value ≤ 1e‑4, and bit score > 50 (max 1 target per query). Human partner proteins were retrieved from UniProt. Network construction and topology (e.g., degree centrality) were computed in NetworkX v2.8.8, and KEGG pathway enrichment of human targets was performed with clusterProfiler v4.12.6.
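The filtering and network steps described above can be approximated in a few lines. The sketch below is an illustration under stated assumptions, not the authors' pipeline: it assumes BLAST hits have already been parsed into dicts using the standard tabular field names `qseqid`, `sseqid`, `pident`, `qcovs`, `evalue`, and `bitscore`; it approximates "max 1 target per query" by keeping the highest bit score; `phisto_partners` is a hypothetical mapping from PHISTO bacterial protein to human interactors; and degree centrality is computed by hand as degree / (n - 1) rather than with NetworkX.

```python
def passes_thresholds(hit):
    """Apply the thresholds stated above to one parsed BLAST hit."""
    return (
        hit["qcovs"] >= 70.0        # query coverage >= 70 %
        and hit["pident"] >= 30.0   # identity >= 30 %
        and hit["evalue"] <= 1e-4   # e-value <= 1e-4
        and hit["bitscore"] > 50.0  # bit score > 50 (strict)
    )


def best_hit_per_query(hits):
    """Keep one passing hit per query: the one with the highest bit score."""
    best = {}
    for hit in hits:
        if not passes_thresholds(hit):
            continue
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


def transfer_interactions(best, phisto_partners):
    """Interolog step: inherit the human partners of the matched protein."""
    edges = set()
    for query, hit in best.items():
        for human in phisto_partners.get(hit["sseqid"], ()):
            edges.add((query, human))
    return edges


def degree_centrality(edges):
    """Degree of each node divided by (number of nodes - 1)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n <= 1:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}
```

Human proteins with high degree centrality in the resulting bipartite network are those predicted to be targeted by many bacterial proteins.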
Unless otherwise noted, tools were run with default parameters; complete version pins and parameter choices are documented in the Methods.
Ethical Approval
All experimental protocols and sample collection were approved by the North South University (NSU) Institutional Review Board (IRB)/Ethical Review Committee (ERC) under protocol number CTRG: NSU-RP-21-042.
