Dryad

Great hammerhead and shortfin mako gene annotation files

Data files

Dec 07, 2022 version files 217.71 MB

Abstract

We present preliminary annotation GFF files for the chromosome-level genomes of the great hammerhead and shortfin mako, in the hope that they will be an asset to the shark genomics community. These annotations were not used for analysis in our associated paper, which focused on conservation genomic analyses of heterozygosity levels and demographic history in these two species. The annotations should be regarded as preliminary because few quality tissue samples from these endangered species were available to us, which prevented us from acquiring multiple sets of transcriptomic data. As a result, the hammerhead annotation identifies genes using ab initio predictions from the Maker2 pipeline (employing Augustus and EvidenceModeler), with a single-tissue transcriptome used to train the Augustus gene model (see Stanhope et al. for further details). The mako annotation uses evidence-based predictions from Stringtie2, employing the limited RNA-seq data available. The majority of the genes reported here could nonetheless prove useful for a variety of comparative analyses; however, the genomes should not be considered thoroughly annotated, and full gene structures should be validated before use in targeted analyses. The entire mako annotation is presented (27,804 protein-coding genes). For the hammerhead, we present a subset of the ab initio gene predictions, totaling 26,110 protein-coding genes, that were supported by BLAST evidence in the hammerhead heart transcriptome sequences. Additionally, the GFF file contains repeats that were soft-masked using RepeatMasker prior to annotation, per the methods in Stanhope et al.
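Because the files mix gene models with repeat features, a quick tally of feature types can help confirm what a given file contains before it is used downstream. The following is a minimal Python sketch, not part of the original deposit. It assumes the files follow the standard nine-column, tab-delimited GFF layout; the source and type labels in the example records (`maker`, `repeatmasker`, `gene`, `mRNA`, `match`) are illustrative assumptions, not values confirmed from the deposited files.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Tally column 3 (feature type) across GFF records."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Some GFF3 files embed sequences after a ##FASTA directive; stop there.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, comments, and directives.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A well-formed GFF record has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1
    return counts


def count_feature_types_in_file(path):
    """Open a plain or gzip-compressed GFF file and tally its feature types."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_feature_types(handle)


# Hypothetical records for illustration only (not taken from the dataset).
example = [
    "##gff-version 3",
    "\t".join(["chr1", "maker", "gene", "100", "900", ".", "+", ".", "ID=gene1"]),
    "\t".join(["chr1", "maker", "mRNA", "100", "900", ".", "+", ".", "ID=mrna1;Parent=gene1"]),
    "\t".join(["chr1", "repeatmasker", "match", "1200", "1500", ".", "+", ".", "ID=rep1"]),
    "##FASTA",
    ">chr1",
    "ACGT",
]

for feature_type, n in sorted(count_feature_types(example).items()):
    print(feature_type, n)
```

To run this on a downloaded file, pass its local path to `count_feature_types_in_file`. If the files use the conventional `gene` feature type, that count could be compared against the gene totals stated above, though this depends on how the deposited files label their records.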