HAUpekinT2T: A complete, telomere-to-telomere reference genome and annotation for the Pekin duck (Anas platyrhynchos domesticus)
Data files
Apr 02, 2026 version files (388.93 MB total)
- HAUpekinT2T.fasta.gz (370.20 MB)
- HAUpekinT2T.genomic.gff.gz (18.73 MB)
- README.md (704 B)
Abstract
Here we present HAUpekinT2T, a complete, gapless, telomere-to-telomere (T2T) reference genome assembly for the Pekin duck (Anas platyrhynchos). To overcome the challenges posed by highly repetitive regions and complex genomic structures, the genome was assembled using a combination of long-read sequencing (PacBio HiFi and Nanopore) and Hi-C chromatin conformation capture. The final assembly spans approximately 1.26 Gb with a contig N50 of 79.69 Mb. It achieves gapless assemblies for 41 chromosomes, including complete resolution of complex regions such as centromeres, telomeres, and the sex chromosomes (Z and W).
This dataset includes the final FASTA and GFF files derived from the duck pan-genome. Please contact us if you have any questions.
Description of the data and file structure
File: HAUpekinT2T.fasta.gz
- Format: FASTA
- Content: Contains 41 completely assembled chromosomes. Sequence IDs represent chromosome numbers/names (e.g., chr1, chr2, chrZ, chrW).
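As a quick sanity check on the FASTA file, the sequence IDs and lengths can be tallied directly from the compressed file. The following is a minimal sketch using only the Python standard library; the function names `fasta_lengths` and `n50` are illustrative and not part of the dataset. It assumes the file is a standard gzip-compressed FASTA in which the sequence ID is the first whitespace-delimited token of each header line.

```python
import gzip
from typing import Dict, Iterable


def fasta_lengths(path: str) -> Dict[str, int]:
    """Return {sequence_id: length} for a gzip-compressed FASTA file."""
    lengths: Dict[str, int] = {}
    current = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Assumption: the ID is the first token after '>'.
                parts = line[1:].split()
                current = parts[0] if parts else ""
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L cover half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


if __name__ == "__main__":
    sizes = fasta_lengths("HAUpekinT2T.fasta.gz")
    print(f"{len(sizes)} sequences, {sum(sizes.values())} bp in total")
    print(f"sequence-level N50: {n50(sizes.values())} bp")
```

Note that `n50` here is computed over whole sequences in the file, which is not necessarily the same quantity as the contig N50 reported in the abstract.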
File: HAUpekinT2T.genomic.gff.gz
- Format: GFF
- Content: Contains structural annotations of protein-coding genes. Standard feature types (gene, mRNA, exon, CDS) are included.
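To confirm which feature types the annotation contains, the third column of the GFF file can be tallied. This is a minimal sketch using only the Python standard library; the function name `count_feature_types` is illustrative. It assumes the usual nine-column, tab-delimited GFF layout with `#` comment lines, and it stops at a `##FASTA` directive if one is present.

```python
import gzip
from collections import Counter


def count_feature_types(path: str) -> Counter:
    """Count occurrences of each feature type (column 3) in a gzipped GFF."""
    counts: Counter = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                # Any embedded sequence section follows; no more features.
                break
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9:
                # Skip lines that do not have the nine expected columns.
                continue
            counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    for feature, n in count_feature_types("HAUpekinT2T.genomic.gff.gz").most_common():
        print(f"{feature}\t{n}")
```

If the file matches the description above, the output should list gene, mRNA, exon, and CDS among the feature types.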
