A chromosomal-scale reference genome of the New World Screwworm, Cochliomyia hominivorax
Data files
Oct 05, 2022 version files: 618.27 MB
- chom_annot_v4.gff, 74.95 MB
- cochliomyia_hominivorax_HiC_v3_masked.fasta, 543.32 MB
- README.txt, 559 B
Abstract
The New World Screwworm, Cochliomyia hominivorax (Calliphoridae), is the most important myiasis-causing species in America. Screwworm myiasis is a zoonosis that can cause severe lesions in livestock, domesticated and wild animals, and occasionally in people. Beyond the sanitary problems associated with this species, these infestations negatively impact economic sectors, such as the cattle industry.
Here, we present a chromosome-scale assembly of the C. hominivorax genome, organized in 6 chromosome-length scaffolds and 515 unplaced scaffolds spanning 534 Mb. There was a clear correspondence between the Drosophila melanogaster linkage groups A-E and the chromosome-scale scaffolds. Chromosome Quotient (CQ) analysis identified a single scaffold from the X chromosome that contains most of the orthologs of genes located on the D. melanogaster fourth chromosome (linkage group F, or dot chromosome). CQ analysis also identified unplaced scaffolds and genes that are potentially X- or Y-linked. Y-linkage of selected regions was confirmed by PCR with male and female DNA. Some of the long chromosome-scale scaffolds include Y-linked sequences, suggesting that these regions are misassembled. These resources will provide a basis for future studies aimed at understanding the biology and evolution of this devastating obligate parasite.
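The idea behind a Chromosome Quotient can be illustrated with a minimal sketch: for each scaffold, the number of reads aligned from a female library is divided by the number aligned from a male library. In an XY system, autosomal sequence is expected near 1, X-linked sequence near 2, and Y-linked sequence near 0. The function names, the example counts, and the cutoffs (0.3 and 1.6) below are illustrative assumptions, not the values or code used for this dataset, and the counts are assumed to be already normalized to equal sequencing depth.

```python
def chromosome_quotient(female_count, male_count):
    """Return the female/male read-count ratio, or None when no male reads align."""
    if male_count == 0:
        return None  # ratio undefined; cannot be Y-linked evidence either way
    return female_count / male_count


def classify(cq, y_max=0.3, x_min=1.6):
    """Label a scaffold from its CQ. Cutoffs are hypothetical, for illustration only."""
    if cq is None:
        return "undetermined"
    if cq <= y_max:
        return "Y-candidate"
    if cq >= x_min:
        return "X-candidate"
    return "autosomal-or-ambiguous"


# Hypothetical depth-normalized read counts: (female, male) per scaffold.
example_counts = {
    "scaffold_a": (1000, 1020),
    "scaffold_b": (1980, 1000),
    "scaffold_c": (12, 950),
}

for name, (female, male) in example_counts.items():
    cq = chromosome_quotient(female, male)
    print(name, round(cq, 2), classify(cq))
```

Candidates flagged this way would still need independent confirmation, as was done here by PCR with male and female DNA.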
The de novo assembly, Chicago library reads, and Dovetail HiC library reads were used as input for HiRise, a software pipeline designed specifically for scaffolding genome assemblies with proximity ligation data. The analysis was iterative. First, Chicago library sequences were aligned to the draft input assembly using a modified SNAP read mapper (http://snap.cs.berkeley.edu). HiRise then analyzed the separations of Chicago read pairs mapped within draft scaffolds to produce a likelihood model for genomic distance between read pairs. The model was used to identify and break putative misjoins, to score prospective joins, and to make joins above a threshold. After the Chicago data had been aligned and scaffolded, Dovetail HiC library sequences were aligned and scaffolded by the same method.
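The scoring idea described above can be sketched as a toy example. It is not HiRise's actual model: it assumes a simple binned empirical distribution of read-pair separations, and it scores a candidate join as a log-likelihood ratio against a uniform background over the genome. All names, bin edges, and numbers are hypothetical.

```python
import math
from bisect import bisect_right


def fit_separation_model(separations, bin_edges):
    """Estimate a per-base probability density for each separation bin.

    Uses add-one smoothing so that empty bins keep a non-zero density.
    """
    counts = [1] * (len(bin_edges) - 1)
    for s in separations:
        i = bisect_right(bin_edges, s) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    total = sum(counts)
    return [
        c / total / (bin_edges[i + 1] - bin_edges[i])
        for i, c in enumerate(counts)
    ]


def log_density(model, bin_edges, separation):
    """Log density of one separation; out-of-range values use the nearest bin."""
    i = bisect_right(bin_edges, separation) - 1
    i = min(max(i, 0), len(model) - 1)
    return math.log(model[i])


def join_score(model, bin_edges, implied_separations, genome_size):
    """Log-likelihood ratio of 'scaffolds are adjacent' vs. uniform background."""
    background = math.log(1.0 / genome_size)
    return sum(
        log_density(model, bin_edges, s) - background
        for s in implied_separations
    )


# Hypothetical within-scaffold separations (bp) used to fit the model.
edges = [0, 1_000, 10_000, 100_000]
model = fit_separation_model([500] * 8 + [5_000] * 2, edges)

# Separations that read pairs spanning two scaffolds would have if joined.
score = join_score(model, edges, [400, 900, 3_000], genome_size=534_000_000)
accept = score > 10.0  # hypothetical threshold
print(round(score, 2), accept)
```

In this toy version, a join is accepted when the spanning read pairs are much better explained by the fitted separation distribution than by chance placement across the genome.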
The genes predicted in our previous assembly using BRAKER (https://www.nature.com/articles/s42003-020-01152-4) were lifted over to the current assembly using the ‘flo’ pipeline.
All files are plain text and can be opened with a text editor.
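Because the FASTA and GFF files are large, reading them line by line in a script may be more practical than loading them into an editor. The sketch below assumes standard FASTA headers (lines starting with ">") and tab-separated GFF records with nine columns; the function names are illustrative and the demo data are made up.

```python
from collections import Counter


def fasta_lengths(lines):
    """Return {sequence_name: length} from an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""  # name is the first header token
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)  # counts masked bases as well
    return lengths


def gff_feature_counts(lines):
    """Count GFF records by feature type (third column)."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 9:
            counts[fields[2]] += 1
    return counts


# Small in-memory demo with made-up records.
demo_fasta = [">scaf1 example", "ACGTNNNN", "acgt", ">scaf2", "GG"]
demo_gff = [
    "##gff-version 3",
    "scaf1\tsrc\tgene\t1\t12\t.\t+\t.\tID=g1",
    "scaf1\tsrc\tmRNA\t1\t12\t.\t+\t.\tID=t1;Parent=g1",
]
print(fasta_lengths(demo_fasta))
print(dict(gff_feature_counts(demo_gff)))
```

To apply this to the dataset, pass an open file handle instead of a list, for example `with open("cochliomyia_hominivorax_HiC_v3_masked.fasta") as fh: lengths = fasta_lengths(fh)`, and likewise for `chom_annot_v4.gff`.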