Comparative genomic analysis identifies potential adaptive variation in Mycoplasma ovipneumoniae

Cite this dataset

Andrews, Kimberly et al. (2024). Comparative genomic analysis identifies potential adaptive variation in Mycoplasma ovipneumoniae [Dataset]. Dryad. https://doi.org/10.5061/dryad.ffbg79d2h

Abstract

Mycoplasma ovipneumoniae is associated with respiratory disease in wild and domestic Caprinae globally, with wide variation in disease outcomes within and between host species. To gain insight into phylogenetic structure and mechanisms of pathogenicity for this bacterial species, we compared M. ovipneumoniae genomes from six countries (Australia, Bosnia and Herzegovina, Brazil, China, France, USA) and four host species (domestic sheep, domestic goats, bighorn sheep, caribou).

README: Comparative genomic analysis identifies potential adaptive variation in Mycoplasma ovipneumoniae

https://doi.org/10.5061/dryad.ffbg79d2h

Description of the data and file structure

This dataset includes all Mycoplasma ovipneumoniae genome assemblies used in this study: 74 assemblies generated for the study, plus 25 assemblies from prior studies that were publicly available on NCBI at the time of the study. Metadata for each assembly are provided in Table S1. Assemblies generated for this study are marked with "Y" in the "New_assemblies" column of Table S1. Assemblies are provided in FASTA format, with filenames corresponding to the "Sample" column in Table S1. The NCBI accessions that correspond to each sample are also listed in Table S1.
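The link between Table S1 and the assembly files can be sketched in Python as follows. This is a minimal sketch, not part of the dataset: it assumes Table S1 has been exported to CSV with the column names listed in this README, and that each assembly file is named after its "Sample" value with a ".fasta" extension. The export format, the extension, and the function names are all assumptions, so check them against the downloaded files.

```python
import csv
from pathlib import Path

# Cells treated as missing: "NA" and empty cells, as described in this README.
NA_VALUES = {"", "NA"}


def load_table_s1(handle):
    """Read Table S1 rows as dicts, mapping "NA" and empty cells to None.

    `handle` is any open text file object containing a CSV export
    of Table S1 (the CSV export is an assumption).
    """
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for key, value in raw.items():
            text = (value or "").strip()
            row[key] = None if text in NA_VALUES else text
        rows.append(row)
    return rows


def is_new_assembly(row):
    """True for assemblies generated for this study ("Y" in New_assemblies)."""
    return row.get("New_assemblies") == "Y"


def fasta_path(row, assembly_dir, extension=".fasta"):
    """Build the expected assembly path from the Sample column.

    The ".fasta" extension is an assumption; adjust it to match the files.
    """
    return Path(assembly_dir) / f"{row['Sample']}{extension}"
```

With a CSV export opened as `handle`, `[fasta_path(r, "assemblies") for r in load_table_s1(handle) if is_new_assembly(r)]` would list the expected paths for the assemblies generated for this study. The directory name "assemblies" is a placeholder.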

The content for each column in Table S1 is described below:

Sample: Sample name used in this study.

LabID: Identifier used during laboratory work for each sample in this study. "NA" indicates samples that did not have a LabID.

New_assemblies: "Y" indicates a new assembly generated for this study. Empty cells indicate assemblies that were not generated for this study and were publicly available on NCBI at the time of the study.

Accession_BioSample: NCBI BioSample accession ID.

Accession_GenBank: NCBI GenBank accession ID for the assembly. "NA" indicates an assembly that was not generated for this study.

Accession_NCBI_Assembly: NCBI Assembly accession ID. "NA" indicates an accession ID is not available.

Accession_SRA_Illumina: NCBI SRA accession ID for Illumina sequence data. "NA" indicates samples that do not have Illumina sequence data.

Accession_SRA_ONT: NCBI SRA accession ID for Oxford Nanopore sequence data. "NA" indicates samples that do not have Oxford Nanopore sequence data.

Accession_SRA_PacBio: NCBI SRA accession ID for Pacific Biosciences sequence data. "NA" indicates samples that do not have Pacific Biosciences sequence data.

Clade: Major phylogenetic grouping for the assembly.

HostSpecies: Host species from which the sample was collected.

Population: Population from which the sample was collected. Captive = captive bighorn sheep that was infected at South Dakota State University (SDSU) with a Nevada-origin bighorn sheep strain. "NA" indicates unavailable information.

Division: State or province where the sample was collected. "NA" indicates unavailable information.

Country: Country where the sample was collected.

YearCollected: Year the sample was collected. "NA" indicates unavailable information.

DateCollected: Date the sample was collected. "NA" indicates unavailable information.

AssemblyType: Type of sequencing technology used to generate the assembly. "ONT" = Oxford Nanopore Technologies.

AssemblyLength (bp): Total length of the assembly in base pairs.

ONTDepth: Assembly depth of coverage for Oxford Nanopore Technologies (ONT) sequence reads. "NA" = ONT sequencing was not conducted for this assembly.

ONTStDevDepth: Standard deviation of assembly depth of coverage for Oxford Nanopore Technologies (ONT) sequence reads. "NA" = ONT sequencing was not conducted for this assembly.

IlluminaDepth: Assembly depth of coverage for Illumina sequence reads. "NA" = Illumina sequencing was not conducted for this assembly.

IlluminaStDevDepth: Standard deviation of assembly depth of coverage for Illumina sequence reads. "NA" = Illumina sequencing was not conducted for this assembly.

PacBioDepth: Assembly depth of coverage for Pacific Biosciences sequence reads. "NA" = PacBio sequencing was not conducted for this assembly.

Roche454Depth: Assembly depth of coverage for Roche 454 sequence reads. "NA" = Roche 454 sequencing was not conducted for this assembly.

n50 (bp): N50 contig length for the assembly in base pairs.

ncontigs: Number of contigs in the assembly.

Completeness (%): Completeness of the assembly based on CheckM analysis.

Contamination (%): Contamination level for the assembly based on CheckM analysis.
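The three size-related columns above (AssemblyLength (bp), n50 (bp), ncontigs) can be recomputed from any of the FASTA files as a sanity check. The sketch below uses the standard N50 definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. This README does not state which tool produced the values in Table S1, so the function names and parsing approach here are illustrative assumptions rather than the authors' method.

```python
def read_contig_lengths(fasta_text):
    """Return the length of each sequence record in FASTA-formatted text."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header starts a new record; store the previous one first.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Standard N50: walk contigs from longest to shortest until the
    running total reaches at least half of the assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def assembly_stats(fasta_text):
    """Summarise an assembly using the Table S1 column names as keys."""
    lengths = read_contig_lengths(fasta_text)
    return {
        "AssemblyLength (bp)": sum(lengths),
        "n50 (bp)": n50(lengths),
        "ncontigs": len(lengths),
    }
```

To check one assembly, pass the file contents as text, for example `assembly_stats(open(path).read())`, and compare the result with the matching row of Table S1.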

Sharing/Access information

Sequence data generated for this study have been deposited with the National Center for Biotechnology Information (NCBI) under BioProject number PRJNA1070810. Analyses for this study also used 25 genome assemblies that were generated in prior studies and are publicly available on NCBI. The NCBI accessions that correspond to each assembly can be found in Table S1.

Code/Software

Analysis code is available at https://github.com/kimandrews/Movi and an interactive phylogeny is available at https://nextstrain.org/community/narratives/kimandrews/Movi

Methods

Deep nasal swab samples were collected from domestic sheep, domestic goats, bighorn sheep, and caribou. Swab samples were selectively enriched for Mycoplasma ovipneumoniae using a two-step procedure involving incubation in mycoplasma broth, followed by incubation in SP4 broth with glucose. Genomic DNA was extracted using the QIAamp DNA Mini Kit (Qiagen). Whole-genome shotgun libraries were prepared using Illumina Nextera DNA kits and sequenced on an Illumina MiSeq using the v3 600-cycle sequencing kit. Oxford Nanopore Technologies (ONT) shotgun sequencing was also performed for samples with sufficient genomic DNA remaining after Illumina sequencing. Barcoded ONT libraries were created using the SQK-LSK109 library prep kit, and sequencing was performed for 48 hours on FLO-MIN106D flow cells.

Raw Illumina sequence reads were cleaned using HTStream to filter PhiX reads, PCR duplicates, adapter sequences, and low-quality reads. Raw ONT sequence reads were demultiplexed by barcode and base-called using Guppy v4.2.2. For samples with ONT sequence data, genome assembly was performed using Trycycler v0.4.1, and assemblies were polished using cleaned Illumina sequence reads with Pilon v1.23. For samples with only Illumina sequence data and no ONT sequence data, cleaned Illumina reads were assembled de novo using SPAdes v3.13.1.
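The choice between the two assembly workflows described above can be expressed as a small lookup against Table S1. The sketch below is illustrative only: it assumes each row is a dict keyed by the Table S1 column names, and it treats the presence of an SRA accession as a proxy for that data type having been sequenced, which may not hold for every sample. It does not run any of the assembly tools.

```python
def assembly_route(row):
    """Return a label for the assembly workflow described in the Methods,
    or None when the Methods do not cover the row.

    `row` is a dict keyed by Table S1 column names (an assumption about
    how the table has been loaded).
    """
    def present(column):
        # "NA", empty, and missing values all count as absent.
        return row.get(column) not in (None, "", "NA")

    # The Methods describe only the assemblies generated for this study.
    if row.get("New_assemblies") != "Y":
        return None

    if present("Accession_SRA_ONT"):
        # ONT data available: long-read assembly, then Illumina polishing.
        return "Trycycler v0.4.1 + Pilon v1.23"
    if present("Accession_SRA_Illumina"):
        # Illumina data only: de novo short-read assembly.
        return "SPAdes v3.13.1"
    return None
```

Returning None for rows without "Y" in New_assemblies keeps the prior-study assemblies, which were produced with other methods, out of the two workflows described here.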

Comparative genomic analyses were performed using the assemblies generated in this study, along with additional Mycoplasma ovipneumoniae genome assemblies from prior studies that were publicly available on NCBI at the time of the study.

Funding

Idaho Department of Fish and Game

University of Idaho, Institute for Interdisciplinary Data Sciences