Dryad

Data for: Nanopore R10.4.1 LSK114 HG002: subset of 20000 reads in BLOW5 format

Cite this dataset

Gamaarachchi, Hasindu (2023). Data for: Nanopore R10.4.1 LSK114 HG002: subset of 20000 reads in BLOW5 format [Dataset]. Dryad. https://doi.org/10.5061/dryad.905qfttq9

Abstract

HG002 (NA24385) is a reference human genome sample used for benchmarking and comparing bioinformatics applications. This dataset contains a subset of 20,000 reads from HG002, sequenced on an Oxford Nanopore Technologies (ONT) PromethION sequencer. Sheared DNA libraries (~17 kb) were prepared using the ONT LSK114 ligation library prep, and an R10.4.1 flow cell was used to generate ~30X genome coverage. The original FAST5 data was converted to BLOW5 format using slow5tools v0.8.0. The 20,000 reads provided here, in BLOW5 format, are a downsampled subset of that run.

Methods

Sheared DNA libraries (~17 kb) were prepared using the ONT LSK114 ligation library prep, and an R10.4.1 flow cell was used to generate ~30X genome coverage. This dataset is a downsampled subset containing 20,000 reads in BLOW5 format. FAST5 data was converted to BLOW5 using slow5tools v0.8.0. The uploaded dataset contains a tarball of a merged BLOW5 file and a BLOW5 file index. More information on the BLOW5 format can be found at https://doi.org/10.1038/s41587-021-01147-4
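
As a minimal sketch of checking the tarball contents before extraction, the Python snippet below lists the archive members and separates the BLOW5 file from its index. The function name find_blow5_members is hypothetical, and the snippet assumes the usual .blow5 and .blow5.idx file extensions; the actual file names inside the tarball are not stated in this record.

```python
import tarfile


def find_blow5_members(tar_path):
    """Return (blow5_files, index_files) found inside the tarball.

    Assumes the conventional extensions: .blow5 for the data file and
    .blow5.idx for its index. The names in the archive may differ.
    """
    # Default mode "r" lets tarfile auto-detect any compression.
    with tarfile.open(tar_path) as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
    blow5_files = [n for n in names if n.endswith(".blow5")]
    index_files = [n for n in names if n.endswith(".blow5.idx")]
    return blow5_files, index_files
```

Calling find_blow5_members with the path of the downloaded tarball should return one data file and one index file if the archive matches the description above.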

Usage notes

Use slow5tools (available at https://github.com/hasindu2008/slow5tools) to view, query, and manipulate the dataset. Usage examples for slow5tools are provided at https://hasindu2008.github.io/slow5tools/
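
For illustration, the following Python sketch wraps two common slow5tools subcommands, stats and view. It assumes slow5tools is installed and on the PATH. The file name subset.blow5 and the helper functions are hypothetical placeholders, not names taken from this dataset; substitute the BLOW5 file extracted from the tarball.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical file name: replace with the BLOW5 file from the tarball.
BLOW5_PATH = Path("subset.blow5")


def slow5tools_cmd(subcommand, *args):
    """Build a slow5tools command line as a list of string arguments."""
    return ["slow5tools", subcommand, *map(str, args)]


def run_if_available(cmd):
    """Run cmd and return its stdout.

    Returns None when slow5tools is not on the PATH or the input file
    does not exist, so the sketch is safe to run anywhere.
    """
    if shutil.which(cmd[0]) is None or not BLOW5_PATH.exists():
        return None
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Print summary information about the BLOW5 file.
stats_cmd = slow5tools_cmd("stats", BLOW5_PATH)
# Convert binary BLOW5 to human-readable SLOW5 ASCII.
view_cmd = slow5tools_cmd("view", BLOW5_PATH, "-o", "subset.slow5")

if __name__ == "__main__":
    output = run_if_available(stats_cmd)
    print(output if output is not None else "skipped: " + " ".join(stats_cmd))
```

The same commands can be typed directly in a shell; the wrapper only makes it easier to script them and to skip cleanly when the tool or file is missing.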