Data from: Genome assembly and annotation of Arabidopsis halleri, a model for heavy metal hyperaccumulation and evolutionary ecology
Briskine, Roman V. et al. (2016), Data from: Genome assembly and annotation of Arabidopsis halleri, a model for heavy metal hyperaccumulation and evolutionary ecology, Dryad, Dataset, https://doi.org/10.5061/dryad.gn4hh
The self-incompatible species Arabidopsis halleri is a close relative of the self-compatible model plant Arabidopsis thaliana. Its broad European and Asian distribution and its ability to hyperaccumulate heavy metals make A. halleri a useful model for ecological genomics studies. We used long-insert mate-pair libraries to improve the genome assembly of the A. halleri ssp. gemmifera Tada mine genotype (W302), collected in Japan from a site highly contaminated with heavy metals. After five rounds of forced selfing, heterozygosity was reduced to 0.04%, which facilitated subsequent genome assembly. Our assembly now covers 196 Mb, or 78% of the estimated genome size, and achieves a scaffold N50 length of 712 kb. To validate the assembly and annotation, we used synteny of A. halleri Tada mine with a previously published high-quality reference assembly of a closely related species, Arabidopsis lyrata. Further validation of the assembly quality comes from synteny and phylogenetic analysis of the HEAVY METAL ATPASE4 (HMA4) and METAL TOLERANCE PROTEIN1 (MTP1) regions, using published sequences from European A. halleri for comparison. Three tandemly duplicated copies of HMA4, a key gene involved in cadmium and zinc hyperaccumulation, were assembled on a single scaffold. The assembly will enhance genome-wide studies of A. halleri as well as of the allopolyploid Arabidopsis kamchatica, which is derived from A. lyrata and A. halleri.
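For readers unfamiliar with the assembly statistics quoted above, the following is a minimal sketch of how a scaffold N50 is conventionally computed, and of the genome size implied by the reported coverage. It is not the authors' pipeline: the function names and the toy scaffold lengths are hypothetical and chosen only for illustration. The only figures taken from the abstract are 196 Mb and 78%.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def implied_genome_size_mb(assembled_mb, fraction_covered):
    """Estimated genome size implied by an assembly size and its coverage fraction."""
    return assembled_mb / fraction_covered


# Toy scaffold lengths (made up, NOT from the dataset).
toy_lengths = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_lengths)

# Figures from the abstract: 196 Mb assembled, covering 78% of the estimate.
estimated_size_mb = implied_genome_size_mb(196, 0.78)
```

With the abstract's numbers, the implied estimated genome size is roughly 251 Mb. A scaffold N50 of 712 kb therefore means that half of the 196 Mb assembly lies in scaffolds of at least 712 kb.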