Dryad

Data from: Refinement of the Antarctic fur seal (Arctocephalus gazella) reference genome increases continuity and completeness

Abstract

The Antarctic fur seal (Arctocephalus gazella) is an important top predator and indicator of the health of the Southern Ocean ecosystem. Although abundant, this species narrowly escaped extinction due to historical sealing and is currently declining as a consequence of climate change. Genomic tools are essential for understanding these anthropogenic impacts and for predicting long-term viability. However, the current reference genome (“arcGaz3”) shows considerable room for improvement in terms of both completeness and contiguity. We therefore combined PacBio sequencing, haplotype-aware HiRise assembly and scaffolding based on Hi-C information to generate a refined assembly of the Antarctic fur seal reference genome (“arcGaz4_h1”). The new assembly is 2.53 Gb long, has a scaffold N50 of 55.6 Mb and includes 18 chromosome-sized scaffolds, which correspond to the 18 chromosomes expected in otariids. Genome completeness is greatly improved, with 23,408 annotated genes and a Benchmarking Universal Single-Copy Orthologs (BUSCO) score raised from 84.7% to 95.2%. We furthermore included the new genome in a reference-free alignment of the genomes of eleven pinniped species to characterize evolutionary conservation across the Pinnipedia using genome-wide genomic evolutionary rate profiling (GERP). We then implemented gene ontology (GO) enrichment analyses to identify biological processes associated with those genes showing the highest levels of either conservation or differentiation between the two major pinniped families, Otariidae and Phocidae. We show that processes linked to neuronal development, the circulatory system and osmoregulation are overrepresented in both conserved and differentiated regions of the genome.
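
For readers unfamiliar with the contiguity metric quoted above, the scaffold N50 is the length of the scaffold at which the cumulative length, counted from the longest scaffold downwards, first reaches half of the total assembly length. The following is a minimal sketch of that calculation, not the pipeline used for this dataset; the function name `scaffold_n50` and the toy lengths are hypothetical and are not taken from arcGaz4_h1.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    # Walk from the longest scaffold to the shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that brings the running sum to at least
        # half of the total assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths, for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(scaffold_n50(toy_lengths))  # prints 70
```

In practice the lengths would be read from the assembly FASTA or its index rather than typed in by hand.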