De novo genome assembly of human cell line CHM13 nanopore ultra-long reads using Shasta
Data files
May 26, 2022 version, files total 5.91 GB
Assembly.gfa (2.96 GB)
AssemblySummary.html (9.76 KB)
README.txt (2.25 KB)
shasta_0.9.0_chm13_assembly.fasta (2.96 GB)
Abstract
Advances in Oxford Nanopore Technologies (ONT) sequencing and basecalling, together with updates to Shasta, are outpacing the publishing cycle. We aim to keep users informed of the state of the art by assembling the most recent ONT data with Shasta. This release comprises our latest assembly of CHM13, the accompanying AssemblySummary.html and Assembly.gfa files, and our evaluation, presented in tables and figures. We assembled ultra-long nanopore reads of CHM13 using Shasta 0.9.0 in iterative assembly mode to produce a haploid de novo genome assembly.
We downloaded publicly available reads generated by the "Telomere-to-Telomere" (T2T) Consortium to assemble CHM13. For a description of the T2T Consortium's sequencing methods, please see https://github.com/marbl/CHM13#oxford-nanopore-data. Release 8 (rel8) of the data was re-basecalled with Guppy v5.0.7 in super accuracy mode.
T2T CHM13 rel8 reads
We assembled the reads using Shasta 0.9.0 (Shafin et al., 2020) in iterative assembly mode, using the Nanopore-Sep2020 configuration plus the additional command line options listed below. We performed the assembly on McCloud, a service that runs Shasta in the cloud.
Shasta 0.9.0 command line options
--Reads.minReadLength 50000 --Kmers.k 10 --MinHash.minHashIterationCount 100 --Align.minAlignedFraction 0.35 --Align.minAlignedMarkerCount 600 --Align.maxSkip 50 --Align.maxDrift 30 --Align.maxTrim 30 --ReadGraph.creationMethod 0 --ReadGraph.maxAlignmentCount 12 --ReadGraph.crossStrandMaxDistance 0 --MarkerGraph.refineThreshold 0 --MarkerGraph.minCoveragePerStrand 3 --MarkerGraph.simplifyMaxLength 10,100,1000,10000 --Assembly.iterative --Assembly.pruneLength 10000 --Assembly.consensusCaller Bayesian:guppy-5.0.7-a
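The options above can be collected into a full invocation. The following is a minimal sketch, not the exact command run on McCloud, which is not given here. The option names and values are copied from the list above. The use of --input, --assemblyDirectory, and --config to supply the reads, output directory, and Nanopore-Sep2020 configuration is an assumption about how the run was launched, and the file names reads.fastq and ShastaRun are hypothetical placeholders.

```python
import shlex

# Options copied from the "Shasta 0.9.0 command line options" list.
# A value of None marks a switch that takes no argument.
SHASTA_OPTIONS = [
    ("--Reads.minReadLength", "50000"),
    ("--Kmers.k", "10"),
    ("--MinHash.minHashIterationCount", "100"),
    ("--Align.minAlignedFraction", "0.35"),
    ("--Align.minAlignedMarkerCount", "600"),
    ("--Align.maxSkip", "50"),
    ("--Align.maxDrift", "30"),
    ("--Align.maxTrim", "30"),
    ("--ReadGraph.creationMethod", "0"),
    ("--ReadGraph.maxAlignmentCount", "12"),
    ("--ReadGraph.crossStrandMaxDistance", "0"),
    ("--MarkerGraph.refineThreshold", "0"),
    ("--MarkerGraph.minCoveragePerStrand", "3"),
    ("--MarkerGraph.simplifyMaxLength", "10,100,1000,10000"),
    ("--Assembly.iterative", None),
    ("--Assembly.pruneLength", "10000"),
    ("--Assembly.consensusCaller", "Bayesian:guppy-5.0.7-a"),
]


def build_shasta_command(reads, out_dir, executable="shasta"):
    """Return the argument list for a Shasta run (assumed invocation shape)."""
    cmd = [
        executable,
        "--input", reads,                 # hypothetical reads file
        "--assemblyDirectory", out_dir,   # hypothetical output directory
        "--config", "Nanopore-Sep2020",   # configuration named in the text
    ]
    for flag, value in SHASTA_OPTIONS:
        cmd.append(flag)
        if value is not None:
            cmd.append(value)
    return cmd


if __name__ == "__main__":
    # Print a shell-quoted command line for inspection; nothing is executed.
    print(shlex.join(build_shasta_command("reads.fastq", "ShastaRun")))
```

Keeping the options in a list makes it easy to compare them against a later Shasta release before rerunning the assembly.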
References
Files
Genome assembly file
Assembly of CHM13 in FASTA format (one strand only).
shasta_0.9.0_chm13_assembly.fasta
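For a quick sanity check of the FASTA file after download, contig lengths and N50 can be computed with a few lines of Python. This is a minimal sketch using only the standard library; it is not the method used for the evaluation in this release, and the function names are illustrative.

```python
def read_fasta_lengths(path):
    """Return a dict mapping each sequence name to its length in bases."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the name.
                fields = line[1:].split()
                name = fields[0] if fields else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def n50(lengths):
    """Return the N50 of an iterable of lengths (0 if it is empty)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


if __name__ == "__main__":
    contigs = read_fasta_lengths("shasta_0.9.0_chm13_assembly.fasta")
    print(len(contigs), "sequences, N50 =", n50(contigs.values()))
```

The file is read line by line, so memory use stays small even for a multi-gigabyte assembly.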
Genome assembly summary file
Assembly summary information in HTML format.
AssemblySummary.html
Graphical fragment assembly file
Assembly in GFA format (one strand only).
Assembly.gfa
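The GFA file can be inspected without a graph viewer. The sketch below counts segment (S) and link (L) records and sums segment lengths, assuming the file follows the GFA 1 tab-separated layout; it reads the sequence field directly and falls back to an LN:i: tag when the sequence is given as "*". The function name is illustrative.

```python
def summarize_gfa(path):
    """Count segments and links in a GFA 1 file and total the segment bases."""
    segments = 0
    links = 0
    total_bases = 0
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if fields[0] == "S" and len(fields) >= 3:
                segments += 1
                sequence = fields[2]
                if sequence != "*":
                    total_bases += len(sequence)
                else:
                    # Sequence omitted: use the optional length tag if present.
                    for tag in fields[3:]:
                        if tag.startswith("LN:i:"):
                            total_bases += int(tag[5:])
                            break
            elif fields[0] == "L":
                links += 1
    return {"segments": segments, "links": links, "total_bases": total_bases}


if __name__ == "__main__":
    print(summarize_gfa("Assembly.gfa"))
```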
Assembly evaluation
Results of our evaluation of the assembly using QUAST and asmgene, presented in tables and figures.
shasta_0.9.0_chm13_evaluation.pdf