Chromosome‐level genome assembly of Lethenteron reissneri provides insights into lamprey evolution

Cite this dataset

Ren, Yandong et al. (2020). Chromosome‐level genome assembly of Lethenteron reissneri provides insights into lamprey evolution [Dataset]. Dryad. https://doi.org/10.5061/dryad.r7sqv9s9d

Abstract

The reissner lamprey (Lethenteron reissneri), a member of Cyclostomata, serves as a bridge between invertebrates and jawed vertebrates and is considered the most direct ancestor of vertebrates. However, the genetic mechanisms underlying the adaptive evolution of lampreys remain unclear. Here, we provide the genome assembly and annotation data for Lethenteron reissneri. A total of five files were uploaded: the assembled genome of Lethenteron reissneri, the gene annotation file in GFF format, the gene function annotation file, the LIP gene sequences of the 50 species used in the article, and the readme file. This study not only provides the first chromosome-level reference genome in Cyclostomata but also offers insights into the unique biology and adaptive evolution of lampreys.

Methods

A male reissner lamprey (L. reissneri) was collected from the Taizi River, Fushun City, Liaoning Province, China, for genome and transcriptome sequencing. To obtain sufficient high-quality DNA and avoid programmed genome rearrangement (PGR) in somatic cells, fresh testis tissue of the reissner lamprey was ground into powder in liquid nitrogen, and DNA was extracted using a Blood & Cell Culture DNA Mini Kit (Qiagen, Hilden, Germany). The DNA was then prepared for PacBio (Menlo Park, CA, USA), Illumina HiSeq X Ten (San Diego, CA, USA; short-insert libraries of 350 bp for 2 × 150 bp paired-end sequencing), and Hi-C sequencing.

To characterize the genome size and other features of the reissner lamprey genome, we performed a genome survey analysis using the ~113.60 Gb of short-insert clean reads produced by the Illumina platform. The results indicated that the reissner lamprey genome was ~1.15 Gb and had high levels of heterozygosity and repetitive sequences, suggesting that the genome is complex. To acquire a high-quality genome assembly, ~102.19 Gb (~96.13-fold coverage of the estimated genome size) of PacBio long reads (number of reads: 9,738,768; N50: ~17 kb) were produced and assembled using FALCON software (v0.7). To further improve the quality and accuracy of the assembly, we corrected it using QUIVER with default parameters. We obtained a ~1.06 Gb genome assembly with a contig N50 of ~2.07 Mb. The assembled genome size was close to the k-mer-estimated genome size (~1.15 Gb), covering ~92.17% of the estimated genome. To obtain the chromosome-level assembly, 176.14 Gb of Hi-C reads were produced for chromosome construction using 3D de novo assembly software. A total of 72 chromosomes were anchored, with a mounting rate of up to 90.29% and a scaffold N50 of 13.23 Mb, indicating successful acquisition of a chromosome-level reference genome for the reissner lamprey. Annotation and other analyses were performed using the methods described in the associated article. The raw genome sequencing data and RNA-seq data were deposited in the National Center for Biotechnology Information database under accession number PRJNA558325.
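For readers unfamiliar with the assembly statistics quoted above, the following is a minimal Python sketch of how an N50 value and the assembled-to-estimated size ratio are conventionally computed. It is not the authors' pipeline: the function names `n50` and `percent_of_estimate` and the toy contig lengths are hypothetical, chosen for illustration only. The only figures taken from this record are the ~1.06 Gb assembly size and the ~1.15 Gb k-mer estimate.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L of the sequence at which, going from longest to
    shortest, the running total first reaches half of the summed length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


def percent_of_estimate(assembly_size, estimated_size):
    """Return the assembly size as a percentage of the estimated genome size."""
    return 100.0 * assembly_size / estimated_size


# Toy contig lengths (hypothetical, not from the dataset).
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))

# Sizes in Gb as reported in the Methods text.
print(round(percent_of_estimate(1.06, 1.15), 2))
```

Applied to the real assembly, `lengths` would be the contig or scaffold lengths read from the genome FASTA file in this dataset; the ratio of the rounded sizes reproduces the ~92.17% figure given in the text.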

Funding

National Natural Science Foundation of China, Award: 31772884

National Natural Science Foundation of China, Award: 31601044

Chinese Academy of Sciences, Award: XDB13000000