
Genetic structure and evolution of diploid Cochlearia in Iceland

Cite this dataset

Brysting, Anne Krag et al. (2022). Genetic structure and evolution of diploid Cochlearia in Iceland [Dataset]. Dryad.


Within the northern European Cochlearia (Brassicaceae), considerable chromosome variation has taken place without corresponding morphological differentiation, which has resulted in an intricate species complex including two base chromosome numbers and several ploidy levels. Here, we examine the situation in Iceland. The distribution, genetic structure, taxonomy and origin of the two Cochlearia cytotypes (2n = 12 and 2n = 14) present in Iceland are discussed. Chromosome counts indicate that 2n = 12 populations dominate along the coast, whereas only 2n = 14 has been reported for inland alpine populations. RADseq data support geographically structured genetic variation along the Icelandic coast, as well as environmentally structured genetic differentiation between coastal and alpine populations. The alpine populations show genetic and morphological affiliation with C. groenlandica (2n = 14), which is widely distributed in the Arctic, but more comprehensive sampling is needed to draw conclusions about the taxonomic status of the Icelandic coastal plants. To uncover the origin of the two chromosome variants and the phylogenomic relationship between them, comparative whole-genome sequencing should be performed.


Morphology: We measured leaf traits on pressed material from both field-collected plants and plants cultivated in controlled conditions. When possible, we measured five leaves from five samples per population. The following leaf traits, previously recognized by Nordal & Laane (1990) as informative, were measured/calculated: length (L), width (W), leaf ratio (W/L), and leaf base angle (Supporting Information, Fig. S1). For five populations, we measured flower traits on field-collected and cultivated plants. When possible, we measured three flowers from four samples per population. The following flower traits were measured/calculated: petal length (PL), petal width (PW), sepal length (SL), petal ratio (PW/PL) and sepal-petal ratio (SL/PL).
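The derived traits listed above are simple quotients of the measured ones. A minimal sketch of how they could be computed is shown here; the function names and the example measurements are hypothetical and are not taken from the data files.

```python
def leaf_ratio(length, width):
    """Leaf ratio, W/L."""
    return width / length


def flower_ratios(petal_length, petal_width, sepal_length):
    """Petal ratio (PW/PL) and sepal-petal ratio (SL/PL)."""
    return {
        "PW/PL": petal_width / petal_length,
        "SL/PL": sepal_length / petal_length,
    }


# Hypothetical (L, W) pairs for two leaves, in arbitrary but consistent units.
leaves = [(20.0, 15.0), (18.0, 9.0)]
ratios = [leaf_ratio(length, width) for length, width in leaves]

# Hypothetical PL, PW and SL for one flower.
flower = flower_ratios(petal_length=5.0, petal_width=2.5, sepal_length=2.0)
```

Because each ratio is dimensionless, the unit of measurement does not matter as long as both values in a quotient share it.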

RADseq: We prepared a RADseq library using single digest, double barcoding and size selection with magnetic beads according to a protocol adapted from Baird et al. (2008) and Paun et al. (2016), with modifications as in Brandrud et al. (2017). The library was sequenced using paired-end sequencing (125 bp) in one Illumina HiSeq2500 lane at the Norwegian Sequencing Centre, Oslo, Norway. Raw Illumina reads were processed with STACKS v. 1.29 (Catchen et al., 2011; 2013). To demultiplex samples and remove low quality data, process_radtags was run with the following settings: PstI as restriction enzyme, removal of any read with an uncalled base, discarding reads with low quality scores, and rescuing barcodes and RADtags. The retained read numbers per sample after demultiplexing ranged from 1,186,173 to 4,753,514. The programs ustacks, cstacks and sstacks were then executed with the forward reads. Different values for m (minimum number of identical raw reads required to create a stack), M (number of mismatches allowed between loci when processing a single individual) and n (number of mismatches allowed between loci when building the catalogue) were tested to find the combination that maximized the number of reliable loci. The final settings were m = 2, M = 2 and n = 1. From c. 349 million paired-end reads obtained from RADseq, c. 196 million forward reads were retained after demultiplexing and cleaning. After de novo catalogue building and SNP calling, we retained c. 12,000 RAD loci present in at least 80% of the 83 individuals. The final structure and phylip files used for downstream data analyses contained 1,500 SNPs present in at least 80% of the individuals in a population, and in at least 70% of the populations.
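The final presence criterion (a SNP must be called in at least 80% of the individuals in a population, and that must hold in at least 70% of the populations) can be illustrated with a short sketch. This is not the STACKS implementation. The function name, the data layout and the example genotypes are assumptions made for illustration only.

```python
def locus_passes(calls_by_pop, min_ind_frac=0.8, min_pop_frac=0.7):
    """Return True if a locus meets both presence thresholds.

    calls_by_pop maps a population label to the list of genotype calls
    for this locus, one entry per individual, with None for missing data
    (an assumed representation).
    """
    if not calls_by_pop:
        return False
    pops_ok = 0
    for calls in calls_by_pop.values():
        present = sum(call is not None for call in calls)
        # A population counts only if enough of its individuals have a call.
        if calls and present / len(calls) >= min_ind_frac:
            pops_ok += 1
    return pops_ok / len(calls_by_pop) >= min_pop_frac


# Hypothetical locus: P3 has calls for only 3 of 5 individuals,
# so 2 of 3 populations qualify, which is below the 70% threshold.
locus_a = {
    "P1": ["A/A", "A/G", "A/A", "A/A", None],
    "P2": ["A/A", "A/A", "G/G", "A/G", "A/A"],
    "P3": ["A/A", None, None, "A/G", "A/A"],
}
# Same locus with one more call in P3: all three populations qualify.
locus_b = dict(locus_a, P3=["A/A", "A/A", None, "A/G", "A/A"])

kept = [name for name, locus in [("a", locus_a), ("b", locus_b)] if locus_passes(locus)]
```

The two thresholds are applied in sequence: the individual-level fraction decides whether a population counts, and the population-level fraction decides whether the locus is kept.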

Usage notes

The file with SNPs derived from RADseq reads is in STRUCTURE format, with two rows per individual. 
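A file with two rows per individual can be read by pairing consecutive rows. The sketch below makes several assumptions that should be checked against the actual file and the ReadMe: fields are whitespace-separated, each row starts with an individual label followed by one population column (`n_meta=2`), lines beginning with `#` are comments, and an optional first row may hold locus names. The sample labels and allele codes in the example are invented.

```python
def parse_structure(lines, n_meta=2, has_locus_header=False):
    """Pair the two rows of each individual into per-locus allele tuples.

    n_meta is the assumed number of leading non-genotype columns
    (individual label plus population); adjust it to match the file.
    """
    rows = [ln.split() for ln in lines if ln.strip() and not ln.startswith("#")]
    if has_locus_header:
        rows = rows[1:]  # drop the row of locus names
    if len(rows) % 2:
        raise ValueError("expected two rows per individual")

    genotypes = {}
    for first, second in zip(rows[0::2], rows[1::2]):
        if first[0] != second[0]:
            raise ValueError(f"row labels differ: {first[0]} vs {second[0]}")
        # One (allele_1, allele_2) tuple per locus.
        genotypes[first[0]] = list(zip(first[n_meta:], second[n_meta:]))
    return genotypes


# Hypothetical input: two individuals, three loci, "-9" used here as a
# missing-data code (an assumption, not confirmed for this dataset).
example = [
    "ind1 1 1 3 -9",
    "ind1 1 1 4 -9",
    "ind2 2 2 3 1",
    "ind2 2 2 3 3",
]
parsed = parse_structure(example)
```

To read the real file, pass an open file handle as `lines` and set `has_locus_header` and `n_meta` according to its actual layout.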

Within the Excel file, the morphological raw data are divided into four data sheets: Flower cultivated, Flower field, Leaf cultivated, Leaf field.

Explanation of collection numbers used in the data files can be found in Table 1 and Table S3 in the article.

For further explanation of the datasets, see the uploaded ReadMe file.


EEA collaborative grant, Award: EHP-CZ07-MOP-1-1052014

The Nansen Foundation, The Norwegian Academy of Science and Letters