Gene annotations and optical maps of the Aegilops tauschii acc. AL8/78 assembly Aet v6.0
Data files
Nov 11, 2024 version files (60.79 MB)
- al878_dls_45M.cmap (53.84 MB)
- annotations.zip (6.95 MB)
- README.md (1.02 KB)
Dec 13, 2024 version files (91.01 MB)
- Aetv6_annnotation.zip (37.17 MB)
- al878_dls_45M.cmap (53.84 MB)
- README.md (1.12 KB)
Abstract
Aegilops tauschii is the donor of the D subgenome of hexaploid wheat and a valuable genetic resource for bread wheat improvement. Several reference-quality genome sequences have been developed for Ae. tauschii accession AL8/78. Here, we report a new assembly for this accession (Aet v6.0), built with long-read Pacific Biosciences HiFi sequencing technology and a new optical map. The N50 contig length of 31.81 Mb achieved with the HiFi technology greatly exceeded that of the previous assembly, Aet v5.0. Sequence scaffolds and super-scaffolds were assembled using a hybrid approach employing the new optical map for accession AL8/78. Of the 1,254 super-scaffolds assembled, 92 super-scaffolds comprising 97.99% of the total super-scaffold length were anchored on a high-resolution genetic map, thereby producing seven pseudomolecules. The number of gaps in the pseudomolecules was reduced from 52,910 in Aet v5.0 to 351 in Aet v6.0, with a concomitant increase in the effective length of the Aet v6.0 pseudomolecules. The more contiguous pseudomolecules facilitated the correction of assembly errors present in Aet v5.0. Gene models were transferred from Aet v5.0 onto the Aet v6.0 assembly.
https://doi.org/10.5061/dryad.bvq83bkjg
Description of the data and file structure
Files and variables
File: Aetv6_annnotation.zip
Description: Aetv6_annnotation.zip contains two GFF files, one for high-confidence genes (Aetv6_HC.gff) and one for low-confidence genes (Aetv6_LC.gff); two CDS files (Aetv6_HC_cds.fa and Aetv6_LC_cds.fa); two protein sequence files (Aetv6_HC_prot.fa and Aetv6_LC_prot.fa); and two gene ID mapping files (Aetv6_HC_id_mapping.txt and Aetv6_LC_id_mapping.txt).
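As a quick sanity check after unzipping the archive, the annotation files can be summarized with a few lines of Python. The sketch below is illustrative only: it assumes the .gff files follow the usual nine-column, tab-separated GFF layout with the feature type in column 3, and that the .fa files are ordinary FASTA. The helper names `count_gff_features` and `fasta_lengths` are hypothetical and not part of the dataset. The layout of the ID mapping files is not described here, so they are not parsed.

```python
import os
from collections import Counter
from typing import Dict, Iterable


def count_gff_features(lines: Iterable[str]) -> Counter:
    """Count records per feature type (column 3), assuming 9-column GFF."""
    counts: Counter = Counter()
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete feature record
        counts[fields[2]] += 1
    return counts


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each FASTA record ID (first word of the header) to its length."""
    lengths: Dict[str, int] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            current = header[0] if header else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


if __name__ == "__main__":
    # File names are taken from the description above; the working
    # directory (the unzipped archive) is an assumption.
    if os.path.exists("Aetv6_HC.gff"):
        with open("Aetv6_HC.gff") as handle:
            print(count_gff_features(handle).most_common())
    if os.path.exists("Aetv6_HC_cds.fa"):
        with open("Aetv6_HC_cds.fa") as handle:
            print(len(fasta_lengths(handle)), "CDS records")
```

Comparing the number of records in the CDS and protein files for the same confidence class is one simple way to confirm that the download is complete.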
File: al878_dls_45M.cmap
Description: The cmap file is a Bionano genome map (optical map) constructed using Bionano’s Direct Label and Stain (DLS) chemistry with the DLE-1 labeling enzyme. The N50 of the optical map contigs is 45 Mb.
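The N50 reported above can be recomputed from the file itself. The following is a minimal sketch, assuming the standard Bionano CMAP layout: header lines start with `#`, data lines are whitespace-separated, the first column is the map ID (CMapId) and the second is the map length in base pairs (ContigLength), repeated on every label row of that map. The function names `read_cmap_lengths` and `n50` are hypothetical helpers, not part of any Bionano tool.

```python
import os
from typing import Dict, Iterable


def read_cmap_lengths(lines: Iterable[str]) -> Dict[str, float]:
    """Collect one length per map ID from CMAP-formatted lines."""
    lengths: Dict[str, float] = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Assumed columns: fields[0] = CMapId, fields[1] = ContigLength.
        # The length repeats on every row of a map, so keep the first one.
        lengths.setdefault(fields[0], float(fields[1]))
    return lengths


def n50(lengths: Iterable[float]) -> float:
    """Return the length L such that maps of length >= L cover at least
    half of the total length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2.0
    running = 0.0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0.0  # empty input


if __name__ == "__main__":
    path = "al878_dls_45M.cmap"  # file name from the description above
    if os.path.exists(path):
        with open(path) as handle:
            map_lengths = read_cmap_lengths(handle)
        print(len(map_lengths), "maps; N50 =",
              round(n50(map_lengths.values()) / 1e6, 2), "Mb")
```

Because the length is read once per map ID rather than once per label row, maps with many labels are not over-weighted in the total.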
Code/software
All data files are in plain-text formats and can be viewed in a text editor; the annotation files must first be extracted from the zip archive.
Changes made since the previously published version: Gene IDs were updated, errors in the original annotation files were corrected, and CDS and protein sequences were added.
Annotation of the PacBio HiFi-based genome assembly and whole-genome DLE-1 optical maps