Dryad

HG002 DNA (PacBio and ONT) and UHRR RNA (ONT) base modification data for minimod

Data files

Jul 14, 2025 version files 770.64 GB

Abstract

Recent advances in third-generation sequencing technologies have enabled the detection of various DNA and RNA base modifications in addition to standard nucleotide sequences. Both major vendors in this space—Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio)—now include base modification information in their sequencing outputs using MM/ML tags embedded in unaligned BAM files. Each vendor also provides dedicated tools for extracting and analysing these tags, such as ONT’s modkit and PacBio’s pb-cpg-tools.
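As a rough illustration of what the MM/ML tags encode, the following Python sketch decodes one read's tags into per-base modification calls. It is not taken from Minimod, modkit, or pb-cpg-tools; the function name `decode_mm_ml` and the example read are hypothetical. It assumes the layout described in the SAM optional-fields specification: each MM entry is a canonical base, a strand, a modification code, and a comma-separated list of skip counts, and ML holds one 0-255 value per call. For brevity it handles only `+` strand entries with a single modification code on a read stored in its original orientation.

```python
def decode_mm_ml(seq, mm, ml):
    """Return a list of (read_position, mod_code, probability) tuples.

    seq: read sequence as stored in the BAM record
    mm:  MM tag value, e.g. "C+m,1,0;"
    ml:  ML tag values as a list of ints in 0..255
    Assumption: '+' strand entries with one modification code each.
    """
    calls = []
    ml_index = 0  # ML values are consumed in the same order as MM calls
    for entry in filter(None, mm.split(";")):
        fields = entry.split(",")
        header = fields[0]
        deltas = [int(x) for x in fields[1:]]
        base, strand = header[0], header[1]
        code = header[2:].rstrip(".?")  # drop the optional '.'/'?' flag
        if strand != "+":
            raise ValueError("this sketch handles '+' strand entries only")
        # Indices in the read where the canonical base occurs
        base_positions = [i for i, b in enumerate(seq) if b == base]
        cursor = 0
        for delta in deltas:
            cursor += delta  # skip this many occurrences of the base
            pos = base_positions[cursor]
            # ML value N covers the probability range N/256 to (N+1)/256;
            # report the midpoint of that range
            prob = (ml[ml_index] + 0.5) / 256
            calls.append((pos, code, prob))
            ml_index += 1
            cursor += 1  # move past the base that was just called
    return calls


if __name__ == "__main__":
    # Hypothetical read with two 5mC calls
    print(decode_mm_ml("ACGTCGCC", "C+m,1,0;", [230, 26]))
```

For the hypothetical read above, the skip counts `1,0` select the second and third cytosines (read positions 4 and 6), with probabilities of roughly 0.90 and 0.10. Reverse-strand alignments, multi-code entries such as `C+mh`, and the `N` wildcard base need extra handling that this sketch leaves out.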

This work presents Minimod, a new vendor-agnostic tool designed to extract and analyse any type of base modification from sequencing data generated by any platform that supports MM/ML tags. Minimod is a free, open-source application written in C and is available on GitHub and Zenodo (see related works). This dataset provides the supporting data used to evaluate Minimod’s performance.