Analysis of structural and somatic variation in Mexican lime
Data files
Oct 21, 2025 version; files total 9.91 MB
- Mxlime_HiFi_sniffles.vcf (254.65 KB)
- Mxlime_HiFi_snps.vcf (9.65 MB)
- README.md (962 B)
Abstract
Clonally propagated crops have high levels of heterozygosity both between subgenomes within a somatic cell and between cells within an individual clone. Recent developments in second- and third-generation sequencing technologies have enabled the identification of this diversity, making it increasingly clear that these sources of diversity are abundant in clonal varieties and can contribute to variation in traits of interest to breeders. Compared with citrus cultivars such as sweet orange, there are relatively few described accessions of Mexican lime (Citrus × aurantifolia). Given that many of the described varieties of sweet orange were derived from somatic variants, we examined this variation in Mexican lime to assess its potential for further development of the cultivar. Using a recently published diploid assembly of Mexican lime and high-coverage PacBio HiFi libraries from leaf tissue of four individuals, we identified multiple large structural variants that differ between thorned and thornless lineages, as well as evidence for mosaicism at hundreds of loci. Many of these variants lie in the promoters and bodies of genes and may act as standing variation for continued improvement of this cultivar.
Dataset DOI: 10.5061/dryad.0gb5mkmf4
Description of the data and file structure
HiFi libraries were aligned to the diploid assembly of Mexican lime (Massaro et al. 2025) using NGMLR v0.2.7 (Sedlazeck et al. 2018). Structural variants were called using Sniffles2 v2.2 (Smolka et al. 2024), and SNVs were called using BCFtools v1.21 (Danecek et al. 2021).
Files and variables
File: Mxlime_HiFi_sniffles.vcf
Description: Structural variant calls from combined mapping of four libraries to Mxlime_USDA_v1.fa.
File: Mxlime_HiFi_snps.vcf
Description: Short nucleotide variant (SNV) calls from combined mapping of four libraries to Mxlime_USDA_v1.fa.
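Both files are plain-text VCF, so a first look at their contents needs only a few lines of Python. The sketch below is one possible way to tally records. It is not part of the authors' pipeline. It assumes the structural variant records carry an `SVTYPE` key in the INFO column, as Sniffles2 output normally does, and it classifies the remaining records by comparing REF and ALT allele lengths. The function names and the example records (chromosome names, positions, IDs) are invented for illustration and do not come from the dataset.

```python
from collections import Counter


def parse_info(info_field):
    """Turn a VCF INFO string such as 'SVTYPE=DEL;SVLEN=-120;PRECISE' into a dict."""
    info = {}
    if info_field == ".":
        return info
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        # Flags (keys without '=') are stored as True.
        info[key] = value if value else True
    return info


def summarize_vcf(lines):
    """Tally records by SVTYPE when present, otherwise as SNV / indel_or_other."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        ref, alt, info = fields[3], fields[4], parse_info(fields[7])
        if "SVTYPE" in info:
            counts[info["SVTYPE"]] += 1
        elif alt == ".":
            counts["no_alt"] += 1  # reference-only record
        elif len(ref) == 1 and all(len(a) == 1 for a in alt.split(",")):
            counts["SNV"] += 1
        else:
            counts["indel_or_other"] += 1
    return counts


# Made-up records for illustration only; they are not taken from the dataset.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=30\n",
    "chr1\t200\t.\tAT\tA\t40\tPASS\tINDEL;DP=25\n",
    "chr2\t300\tsv1\tN\t<DEL>\t60\tPASS\tPRECISE;SVTYPE=DEL;SVLEN=-500\n",
]

print(summarize_vcf(example))
```

To apply the same function to the deposited files, pass an open file handle instead of the example list, for instance `with open("Mxlime_HiFi_sniffles.vcf") as fh: summarize_vcf(fh)`.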
Access information
Data was derived from the following sources:
- PacBio HiFi libraries are available from the NCBI SRA database under BioProject PRJNA1137419.
