Dryad

Population analysis of retrotransposons in giraffe genomes supports RTE decline and widespread LINE1 activity in Giraffidae

Cite this dataset

Petersen, Malte et al. (2022). Population analysis of retrotransposons in giraffe genomes supports RTE decline and widespread LINE1 activity in Giraffidae [Dataset]. Dryad. https://doi.org/10.5061/dryad.ksn02v74f

Abstract

The majority of structural variation in genomes is caused by insertions of transposable elements (TEs). In mammalian genomes, the main TE fraction is made up of autonomous and non-autonomous non-LTR retrotransposons, commonly known as LINEs and SINEs (Long and Short Interspersed Nuclear Elements). Here we present one of the first population-level analyses of TE insertions in a non-model organism, the giraffe. Giraffes are ruminant artiodactyls, one of the few mammalian groups whose genomes are colonized by putatively active LINEs from two different clades of non-LTR retrotransposons, namely LINE1 and RTE/BovB, as well as their associated SINEs. We analyzed insertions of both LINE types and their associated SINEs in three giraffe genome assemblies, as well as across a population-level sampling of 48 individuals covering all extant giraffe species.

Results: The comparative genome screen identified 139,525 recent LINE1 and RTE insertions in the sampled giraffe population. The analysis revealed drastically reduced RTE activity in giraffes, whereas LINE1 is still actively propagating in the genomes of extant (sub)species. In concert with the extremely low activity of the giraffe RTE, we also found that the RTE-dependent SINEs, namely Bov-tA and Bov-A2, have been virtually immobile in the last 2 million years. Despite the high current activity of the giraffe LINE1, we did not find evidence for the presence of currently active LINE1-dependent SINEs. TE insertion heterozygosity rates differ among the (sub)species, likely due to divergent population histories.

Conclusions: The horizontally transferred RTE/BovB and its derived SINEs appear to be close to inactivation and subsequent extinction in the genomes of extant giraffe species. This is the first time that the decline of a TE family has been meticulously analyzed from a population genetics perspective. Our study shows how detailed information about past and present TE activity can be obtained by analyzing large-scale population-level genomic data sets.

Methods

Sampling and sequencing

Whole-genome shotgun short-read sequencing data from Coimbra et al. (2021), comprising 48 individuals that cover all four giraffe species and seven subspecies, were used for TE analysis with MELT (Gardner et al. 2017). The northern giraffe (G. camelopardalis) is represented by 15 individuals, including its three subspecies: the Nubian (G. c. camelopardalis), the Kordofan (G. c. antiquorum), and the West African giraffe (G. c. peralta). The reticulated giraffe (G. reticulata) is represented by ten individuals. The Masai giraffe sensu lato (G. tippelskirchi) is represented by 12 individuals, including its two subspecies: the Luangwa (G. t. thornicrofti) and the Masai giraffe sensu stricto (G. t. tippelskirchi). Finally, the southern giraffe (G. giraffa) is represented by 11 individuals from its two subspecies: the Angolan (G. g. angolensis) and the South African giraffe (G. g. giraffa). The FASTQ read files from the Coimbra et al. (2021) publication are available on SRA under BioProject PRJNA635165.

Quality control of short reads:

Mapping:

  • Reference genome: Kordofan giraffe (accession: ASM1828223v1)
  • BWA-MEM version 0.7.17-r1188
  • BAM sorting with Samtools version 1.9
  • Marked duplicates with MarkDuplicates tool from Picard version 2.18.21

The mapped BAM files from the 48 giraffe individuals have a mean coverage of 19.5X (7-31X) and a mean insert size of 310 bp (247-515 bp).
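The mapping steps listed above could be chained roughly as in the following Python sketch. It only illustrates the order of the tools; the authors' actual command lines are not given here. The function names, file names, thread count, and the `picard` wrapper executable (as opposed to `java -jar picard.jar`) are assumptions, and read-group tagging is left out for brevity.

```python
import subprocess
from pathlib import Path


def mapping_commands(sample, ref, fq1, fq2, threads=4, outdir="."):
    """Build (but do not run) the commands for one sample.

    All file names here are hypothetical placeholders.
    """
    out = Path(outdir)
    sorted_bam = str(out / f"{sample}.sorted.bam")
    dedup_bam = str(out / f"{sample}.markdup.bam")
    metrics = str(out / f"{sample}.markdup.metrics.txt")

    # bwa mem writes SAM to stdout; samtools sort reads it from stdin ("-")
    bwa = ["bwa", "mem", "-t", str(threads), ref, fq1, fq2]
    sort = ["samtools", "sort", "-@", str(threads), "-o", sorted_bam, "-"]
    # Picard 2.x uses KEY=value arguments (I = input, O = output, M = metrics)
    markdup = ["picard", "MarkDuplicates",
               f"I={sorted_bam}", f"O={dedup_bam}", f"M={metrics}"]
    return {"pipe": (bwa, sort), "markdup": markdup}


def run(cmds):
    """Run mapping piped into sorting, then mark duplicates."""
    bwa, sort = cmds["pipe"]
    p1 = subprocess.Popen(bwa, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(sort, stdin=p1.stdout)
    p1.stdout.close()  # let bwa receive SIGPIPE if samtools exits early
    p2.wait()
    p1.wait()
    if p1.returncode != 0 or p2.returncode != 0:
        raise RuntimeError("mapping or sorting failed")
    subprocess.run(cmds["markdup"], check=True)
```

Keeping command construction separate from execution makes it easy to inspect or log the exact commands for each of the 48 samples before anything is run.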

TE annotation in genome assemblies of Kordofan giraffe, okapi and cattle:

  • RepeatMasker with curated giraffe-specific repeat library
  • Annotation tracks and consensus sequences for LINE1v3 and RTEv3 used to create MEI files for MELT (see below)
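As an illustration of how an annotation track for one TE family might be derived from a RepeatMasker run, the sketch below converts hits from the standard RepeatMasker `.out` table into BED-style records. It is not the authors' script. It assumes that the repeat names in the output match the consensus names mentioned above (for example `RTEv3`) and that a BED track is the desired format, which is not stated here; the function name and the sample line are hypothetical.

```python
def rmsk_out_to_bed(lines, repeat_name):
    """Yield BED6-style tuples for hits of one repeat from RepeatMasker .out lines."""
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header lines (their first field is not a score)
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        if fields[9] != repeat_name:
            continue
        chrom = fields[4]
        begin, end = int(fields[5]), int(fields[6])
        # RepeatMasker marks complement-strand hits with "C"
        strand = "-" if fields[8] == "C" else "+"
        # BED scores are limited to 0-1000, so cap the Smith-Waterman score
        score = min(int(fields[0]), 1000)
        # .out coordinates are 1-based inclusive; BED is 0-based, half-open
        yield (chrom, begin - 1, end, fields[9], score, strand)


# Hypothetical example line in RepeatMasker .out column order
example = [
    "   SW   perc perc perc  query  position in query  matching  repeat",
    " 1504  10.2  0.5  0.3  chr1  1001  1500  (99000)  C  RTEv3  LINE/RTE-BovB  (10)  3200  2700  7",
]
for record in rmsk_out_to_bed(example, "RTEv3"):
    print("\t".join(str(x) for x in record))
```

The coordinate shift is the step most easily gotten wrong: only the start position is decremented, because BED end coordinates are exclusive.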

TE insertion calling and analysis:

Usage notes

Refer to the included README file for usage instructions.

Funding

Hessian Ministry for Science and the Arts, Award: LOEWE-TBG

Landes-Offensive zur Entwicklung Wissenschaftlich-ökonomischer Exzellenz

Senckenberg Gesellschaft für Naturforschung