
PacBio HiFi based haplotype-aware assemblies of tomato hybrid varieties Funtelle and Maxeza

Cite this dataset

Fuentes, Roven Rommel; van Rengs, Willem M. J.; Wang, Yazhong; Underwood, Charles (2023). PacBio HiFi based haplotype-aware assemblies of tomato hybrid varieties Funtelle and Maxeza [Dataset]. Dryad. https://doi.org/10.5061/dryad.931zcrjs4

Abstract

Modern commercial varieties of tomato (Solanum lycopersicum) are typically F1 hybrids and are therefore genetically heterozygous. Here we generated haplotype-aware assemblies of two commercial tomato hybrids (Funtelle and Maxeza) using PacBio HiFi reads. The HiFi reads were assembled with the hifiasm assembler, which produces contigs that distinguish the two parental haplotypes (a haplotype-aware assembly). Reference-based scaffolding was then used to generate the chromosome-scale assemblies available here. Note that although the raw assembly fully distinguishes the haplotypes, we did not test whether the working reference sequences made available here are fully phased at the chromosome level.

README: PacBio HiFi based haplotype-aware assemblies of tomato hybrid varieties Funtelle and Maxeza

https://doi.org/10.5061/dryad.931zcrjs4

Organism name: Solanum lycopersicum
Cultivar name: Funtelle
Assembly level: Reference-anchored contigs
Assembler: hifiasm + RagTag
Sequencing technology: PacBio HiFi
European Nucleotide Archive project: PRJEB62442

Organism name: Solanum lycopersicum
Cultivar name: Maxeza
Assembly level: Reference-anchored contigs
Assembler: hifiasm + RagTag
Sequencing technology: PacBio HiFi
European Nucleotide Archive project: PRJEB62443

MD5sum:
Funtelle-1.fasta 87a62db087f03992c278fdd56dcd10d7
Funtelle-2.fasta 6c465e49d5072f4003477d0852ad7829
Maxeza-1.fasta 078a41440661f66be779ff6d2b721ff1
Maxeza-2.fasta 6e3100ec131cb8ccec932eea5e6f07ef
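
The checksums above can be used to confirm that the FASTA files were downloaded intact. The following is a minimal sketch, not part of the original dataset: it assumes the four files sit in the current working directory under the names listed above, and the function names (`md5_of_file`, `verify_downloads`) are illustrative.

```python
import hashlib
from pathlib import Path

# Checksums copied from the MD5sum list in this README.
EXPECTED_MD5 = {
    "Funtelle-1.fasta": "87a62db087f03992c278fdd56dcd10d7",
    "Funtelle-2.fasta": "6c465e49d5072f4003477d0852ad7829",
    "Maxeza-1.fasta": "078a41440661f66be779ff6d2b721ff1",
    "Maxeza-2.fasta": "6e3100ec131cb8ccec932eea5e6f07ef",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Compare each expected file against its checksum.

    Returns a dict mapping file name to "ok", "mismatch", or "missing".
    The directory layout is an assumption; adjust it to your download location.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.exists():
            results[name] = "missing"
        elif md5_of_file(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    for name, status in verify_downloads().items():
        print(f"{name}\t{status}")
```

Any file reported as "mismatch" should be downloaded again before use.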

Assembly Statistics:

Statistic           Funtelle1   Funtelle2   Maxeza1   Maxeza2
Contigs             1,155       651         1,488     689
Total length (Mb)   875.2       839.5       869.4     846.2
Max length (Mb)     76.1        69.8        73.2      74.8
N50 length (Mb)     56.5        41.3        55.7      39.4
N90 length (Mb)     1.3         1.8         2.1       4.6
L50                 7           8           7         8
L90                 29          34          25        27
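
For readers who want to recompute statistics like those above from the FASTA files, here is a minimal sketch using the standard definitions: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least 50% of the total length, and L50 is the number of sequences in that set (likewise for N90 and L90 at 90%). The tool the authors used to produce the reported values is not stated in this README, so small differences are possible; the function names below are illustrative.

```python
def read_fasta_lengths(path):
    """Return the length of every sequence in a FASTA file, in file order."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line.strip())
    if current is not None:
        lengths.append(current)
    return lengths


def nx_lx(lengths, fraction):
    """Return (Nx, Lx) for the given fraction, e.g. 0.5 for N50 and L50."""
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("no sequences provided")


def assembly_stats(lengths):
    """Summarise sequence lengths; base-pair values are converted to Mb (1e6 bp)."""
    n50, l50 = nx_lx(lengths, 0.5)
    n90, l90 = nx_lx(lengths, 0.9)
    return {
        "sequences": len(lengths),
        "total_mb": sum(lengths) / 1e6,
        "max_mb": max(lengths) / 1e6,
        "n50_mb": n50 / 1e6,
        "n90_mb": n90 / 1e6,
        "l50": l50,
        "l90": l90,
    }
```

For example, `assembly_stats(read_fasta_lengths("Funtelle-1.fasta"))` would summarise the first Funtelle haplotype, assuming the file is in the working directory.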

Methods

The raw PacBio HiFi sequencing reads of both varieties studied here are available via the European Nucleotide Archive under project numbers PRJEB62442 (Funtelle) and PRJEB62443 (Maxeza).

For scientific correspondence please contact Charles Underwood (cunderwood[@]mpipz.mpg.de)