
Cis-regulatory variation in the shavenbaby gene underlies intraspecific phenotypic variation, mirroring interspecific divergence in the same trait

Cite this dataset

Hasson, Esteban et al. (2020). Cis-regulatory variation in the shavenbaby gene underlies intraspecific phenotypic variation, mirroring interspecific divergence in the same trait [Dataset]. Dryad.


Despite considerable progress in recent decades in dissecting the genetic causes of natural morphological variation, there is limited understanding of how variation within species ultimately contributes to species differences. We have studied patterning of the non-sensory hairs, commonly known as “trichomes,” on the dorsal cuticle of first-instar larvae of Drosophila. Most Drosophila species produce a dense lawn of trichomes, but a subset of dorsal trichomes were lost in D. sechellia and D. ezoana due entirely to regulatory evolution of the shavenbaby (svb) gene. Here we describe intraspecific variation in dorsal trichome patterns of first-instar larvae of D. virilis that is similar to the trichome pattern variation identified previously between species. We found that 67% of this difference is explained by a QTL that contains svb and that svb expression correlates with trichome variation within D. virilis. Despite using an experimental design with reasonable power to detect a second locus accounting for the remainder of the variance in trichome number, only a single QTL could be mapped, suggesting that multiple other loci each make a small contribution to trichome patterning. Thus, the genetic architectures of intraspecific variation and interspecific differences exhibit similarities and differences that may reflect differences between short-term and long-term evolutionary processes.


The order of the existing D. virilis genome (r1.2) scaffolds on Muller arms was proposed by Schaeffer et al. (2008). Therefore, to simplify presentation of QTL results, we concatenated scaffolds following the assignments provided in Supplementary Table 24 of Schaeffer et al. (2008) to generate a Muller-element pseudo-assembly. We then used recombinant inbred lines to estimate ancestry using MSG (Andolfatto et al. 2011) and calculated linkage scores (r²) between neighboring markers in this pseudo-assembly. We identified several locations where neighboring chromosomal regions did not appear to be co-inherited, most likely resulting from incorrect genome assembly. We estimated the locations of these assembly errors and re-joined fragments at the ends showing higher linkage to generate the following genome:
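The linkage check between neighboring markers can be illustrated with a minimal Python sketch. It is not the code used to build the pseudo-assembly. It assumes ancestry at each marker has been reduced to a 0/1 call per recombinant inbred line, and the function names, the toy data, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length ancestry vectors."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # monomorphic marker: r^2 is undefined
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def flag_breaks(markers, threshold=0.5):
    """Return indices i where markers i and i+1 have r^2 below the threshold.

    The threshold is an assumed value, not one taken from the dataset.
    """
    breaks = []
    for i in range(len(markers) - 1):
        r2 = r_squared(markers[i], markers[i + 1])
        if r2 == r2 and r2 < threshold:  # r2 == r2 is False for NaN
            breaks.append(i)
    return breaks


# Toy data: rows are markers in pseudo-assembly order, columns are lines.
markers = [
    [0, 0, 1, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 1, 1, 1],  # one recombinant line relative to the previous marker
    [1, 0, 0, 1, 1, 0, 0, 1],  # poorly linked to the previous marker
    [1, 0, 0, 1, 1, 0, 0, 1],
]
print(flag_breaks(markers))
```

In this toy example only the junction between the third and fourth markers is flagged, which is the kind of location where a scaffold join would be re-examined.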


The reference pseudo-assembly was updated with sequencing reads (SAMN16729993) from the Mexico D. virilis line (15010–1051.48) using methods described previously (Andolfatto et al. 2011) to generate the following genome:


We used these two genomes as parental genomes to estimate ancestry for 96 recombinant inbred lines (SAMN16729994-SAMN16730089). The full ancestry files for all 96 lines are provided in two files: 



These ancestry probability files were thinned using ( with the configuration file pt.cfg to generate the following thinned ancestry files that can be directly imported with virpheno.csv into Rqtl using read.cross.msg (
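As a rough illustration of what thinning ancestry probabilities can involve, the Python sketch below keeps a marker only when it differs noticeably from the last marker kept. This is an assumption about the general idea, not a description of the thinning tool or of the settings in pt.cfg. The function name, the tolerance value, and the toy data are hypothetical.

```python
def thin_markers(probs, tol=0.05):
    """Return indices of markers to keep.

    probs: list of markers, each a list of ancestry probabilities (one per line).
    A marker is kept if any line differs from the last kept marker by more
    than tol. The tolerance is an assumed value for illustration only.
    """
    if not probs:
        return []
    kept = [0]  # always keep the first marker
    for i in range(1, len(probs)):
        last = probs[kept[-1]]
        if any(abs(a - b) > tol for a, b in zip(probs[i], last)):
            kept.append(i)
    return kept


# Toy data: six markers scored in two lines.
probs = [
    [0.00, 1.00],
    [0.01, 1.00],
    [0.02, 0.99],
    [0.90, 1.00],  # ancestry switches in the first line
    [0.91, 1.00],
    [1.00, 0.10],  # ancestry switches in the second line
]
print(thin_markers(probs))
```

Markers that carry nearly the same ancestry information as their kept neighbor are dropped, which reduces file size while retaining the positions where ancestry changes.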