Diploid and haploid assemblies and annotation of monoecious shrub willow (Salix purpurea), clone ID 94003


Hyden, Brennan; Smart, Lawrence (2023), Diploid and haploid assemblies and annotation of monoecious shrub willow (Salix purpurea), clone ID 94003, Dryad, Dataset,


The Salicaceae are dioecious perennials that utilize different sex determination systems. There is substantial interest in understanding the impacts of hybridization, speciation, and polyploidy on sex chromosome evolution. Here, the genome of a rare monoecious S. purpurea genotype, 94003, was assembled. Based on sequence alignments to dioecious and monoecious genomes, a 1.15 Mb sex-linked region on Chr15W was identified as absent in monoecious plants. Inheritance of this structural variation is responsible for the loss of a male-suppressing function in what would otherwise be genetic females, resulting in monoecy or, if homozygous, lethality.


High molecular weight DNA was extracted from 94003 using the protocol described by Mayjonade et al. (2016), followed by DNA cleanup in 500 μL of 24:1 chloroform:isoamyl alcohol, centrifugation at 8,000 × g, and ethanol precipitation. The resulting DNA had a fragment-size peak at 15,509 bp, as determined on a Femto Pulse system (Agilent). PacBio sequencing was performed at the Mount Sinai Research Institute (New York, NY) and generated approximately 2.5 million HiFi reads, corresponding to 140x HiFi read coverage. HiFi reads were assembled using hifiasm (Cheng et al., 2021), and the resulting contigs were scaffolded by aligning them with minimap2 (Li, 2018) to the S. purpurea 94006 v.5.1 reference genome (Goodstein et al., 2011; Zhou et al., 2020). Annotation was performed using MAKER (Cantarel et al., 2008) with SNAP ab initio gene prediction. Additional high-quality DNA for Illumina sequencing was extracted using a Qiagen (Germantown, MD) DNeasy kit, and paired-end (2 x 150 bp) Illumina sequencing was performed at West Virginia University (Morgantown, WV).
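Read coverage of the kind quoted above is conventionally estimated as total sequenced bases divided by genome size. The sketch below shows only that arithmetic. The function name and the example numbers are illustrative assumptions, not values from this dataset; the mean read length and genome size are not stated here.

```python
def estimate_coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: float) -> float:
    """Return approximate fold coverage: (reads * mean read length) / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return n_reads * mean_read_len_bp / genome_size_bp


# Placeholder inputs chosen for easy arithmetic (hypothetical, not from the dataset):
# 1,000,000 reads of 10,000 bp over a 100 Mb genome.
example_coverage = estimate_coverage(1_000_000, 10_000, 100_000_000)
print(f"{example_coverage:.0f}x")
```

When the individual read lengths are available, summing them gives a more accurate estimate than multiplying by the mean.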

Usage notes

All files can be viewed in any standard text editor (e.g. Atom, Notepad, TextEdit, Sublime Text). IGV can be used to view the genome and annotation in browser form by loading the assembly and gff3 files.
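Because the annotation is plain tab-delimited GFF3 text, it can also be summarized with a short script. The following is a minimal sketch, assuming the standard nine-column GFF3 layout. The function names and the toy records are hypothetical and are not taken from the dataset.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def iter_gff3_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield the nine tab-separated columns of each GFF3 feature line."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section, if present, ends the feature table
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        if len(cols) == 9:
            yield cols


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count features by type (column 3), e.g. gene, mRNA, exon."""
    return Counter(cols[2] for cols in iter_gff3_records(lines))


def features_overlapping(
    lines: Iterable[str], seqid: str, start: int, end: int, feature_type: str = "gene"
) -> List[Tuple[int, int, str]]:
    """Return (start, end, attributes) for features overlapping a 1-based inclusive interval."""
    hits = []
    for cols in iter_gff3_records(lines):
        f_start, f_end = int(cols[3]), int(cols[4])
        if cols[0] == seqid and cols[2] == feature_type and f_start <= end and f_end >= start:
            hits.append((f_start, f_end, cols[8]))
    return hits


# Toy records for illustration only; not taken from the dataset.
EXAMPLE_GFF3 = [
    "##gff-version 3",
    "Chr01\tmaker\tgene\t100\t900\t.\t+\t.\tID=geneA",
    "Chr01\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=geneA-mRNA-1;Parent=geneA",
    "Chr01\tmaker\tgene\t5000\t6200\t.\t-\t.\tID=geneB",
]

print(count_feature_types(EXAMPLE_GFF3))
print(features_overlapping(EXAMPLE_GFF3, "Chr01", 800, 5000))
```

To run this on a real annotation, pass an open file handle in place of the list, for example `with open(path) as fh: count_feature_types(fh)`, where `path` is the local location of the downloaded gff3 file.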


National Science Foundation, Award: 542486

National Science Foundation, Award: 1542599

National Science Foundation, Award: 1542509

National Institute of Food and Agriculture, Award: 2021-67034-35116