Soil composition, phenotypic and genetic data to: Adaptive differentiation on serpentine soil in diploid versus autotetraploid populations of Biscutella laevigata (Brassicaceae)
Data files
Sep 27, 2023 version files (61.23 KB)
- Bürki_et_al_2023_primary_data.xlsx (59.70 KB)
- README.md (1.52 KB)
Oct 12, 2023 version files (1.08 MB)
- Bürki_et_al_2023_genetic_data.zip (1.01 MB)
- Bürki_et_al_2023_primary_data.xlsx (59.70 KB)
- README.md (2.56 KB)
Abstract
Serpentine soils exhibit extreme properties (e.g. high magnesium content) influencing plant growth and survival and have been repeatedly documented to promote adaptive edaphic differentiation in plants. Individuals from four pairs of nearby diploid and autotetraploid populations of Biscutella laevigata sampled on serpentine vs non-serpentine soils in a factorial design are used to assess the genetic and phenotypic changes associated with edaphic origin and ploidy level. Individual samples from natural populations were subjected to soil elemental analysis and genotyping using restriction site-associated DNA sequences (RAD-seq) to link genetic variation with contrasting soils and ploidy levels. In diploids, genetic variation was consistent with demographic contraction and a pattern of isolation by environment with respect to the ratio of calcium / magnesium concentrations, whereas tetraploids presented evidence of expansion with limited edaphic differentiation. The genetic basis of tolerance and adaptation to serpentine was further assessed experimentally on seed-grown individuals from all populations subjected to high (serpentine-like) vs low (control) concentrations of magnesium in hydroponic culture. Fitness-related phenotypic traits under experimental cultivation were consistent with adaptive differentiation among diploid ecotypes but not among the tetraploids, which grow similarly in both habitats and consistently present higher investment in roots. Further work comparing experimentally resynthesized polyploids to natural diploids and polyploids is needed to tease the role of whole genome duplication apart from the impact of post-polyploidy evolution.
README: Soil composition, phenotypic and genetic data to: Adaptive differentiation on serpentine soil in diploid versus autotetraploid populations of Biscutella laevigata (Brassicaceae)
https://doi.org/10.5061/dryad.hx3ffbgkj
The dataset contains the elemental composition of soil immediately surrounding Biscutella laevigata individuals growing on eight serpentine/non-serpentine sites in Austria and Switzerland, and phenotypic traits measured on B. laevigata individuals in hydroponic cultivation mimicking the specific chemistry of serpentine vs. non-serpentine soils. We also provide the input data used in population genetic analyses and an overview of the raw sequencing reads deposited in the European Nucleotide Archive (ENA) for 63 B. laevigata individuals sampled on the eight sites and genotyped using ddRAD-seq.
Description of the data and file structure
The attached spreadsheet "Bürki_et_al_2023_primary_data.xlsx" consists of the following:
Sheet 1: Elemental composition of soil surrounding diploid (2x) or tetraploid (4x) individuals of Biscutella laevigata sampled in field populations occurring on serpentine (S) or nonserpentine (NS) sites. Concentrations of particular elements are given in parts per million (ppm). At each locality, eight individual samples were included for genetic analyses and elemental concentrations.
Sheet 2: Plant traits and fitness proxies scored on diploid (2x) and tetraploid (4x) individuals of Biscutella laevigata originating either from serpentine (S) or nonserpentine (NS) sites after eight weeks of hydroponic cultivation from seeds in magnesium enriched (Mg+) or control (Mg-) solutions, the former mimicking a specific chemistry of serpentine soils.
The attached archive "Bürki_et_al_2023_genetic_data.zip" consists of the following:
ENA_samples_list.txt: provides ENA project ID, run ID (i.e. raw fastq files), sample ID, and alias for each sample included in the study.
blaevserp_filtered_MD01_pruned02_MAC3MAF05rm.vcf.gz: final filtered VCF file including 1146 biallelic SNPs for 63 samples.
1snp_percontig.structure: input dataset for STRUCTURE analysis, obtained by randomly selecting 1 SNP per RAD tag resulting in 944 SNPs
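As a quick sanity check after unpacking the archive, the counts stated above can be compared against the VCF itself. The following is a minimal sketch using only the Python standard library. The function names `summarize_vcf` and `summarize_vcf_file` are illustrative and not part of the dataset, and the example records are made up.

```python
import gzip
import io


def summarize_vcf(handle):
    """Count samples, variant records and distinct contigs in a VCF text stream."""
    n_samples = 0
    n_sites = 0
    contigs = set()
    for line in handle:
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            # The first nine columns are fixed (CHROM..FORMAT); the rest are samples.
            n_samples = len(fields) - 9
            continue
        n_sites += 1
        contigs.add(fields[0])
    return {"samples": n_samples, "sites": n_sites, "contigs": len(contigs)}


def summarize_vcf_file(path):
    """Open a plain or gzip-compressed VCF and summarize it."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return summarize_vcf(handle)


# Tiny in-memory example (hypothetical records, not taken from the dataset).
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2\n"
    "contig_1\t10\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t0/0/0/1\n"
    "contig_1\t42\t.\tC\tT\t50\tPASS\t.\tGT\t0/0\t0/0/1/1\n"
    "contig_2\t7\t.\tG\tA\t50\tPASS\t.\tGT\t1/1\t0/1/1/1\n"
)
summary = summarize_vcf(example)
print(summary)
```

Calling `summarize_vcf_file("blaevserp_filtered_MD01_pruned02_MAC3MAF05rm.vcf.gz")` should, according to the description above, report 63 samples and 1146 sites. Whether the number of distinct contigs matches the 944 SNPs of the STRUCTURE input is an assumption worth checking rather than something stated here.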
Sharing/Access information
Raw sequencing reads have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under the accession number PRJEB48869: https://www.ebi.ac.uk/ena/browser/view/PRJEB48869
Code/Software
Methods
Field sampling of material was conducted in proximate pairs of Biscutella laevigata populations growing either on serpentine or non-serpentine soils. These covered four diploid populations (B. laevigata subsp. kerneri Mach.-Laur.) in the Wachau region of Austria and four tetraploid populations (B. laevigata subsp. laevigata L.) in the Swiss Alps around Davos and Arolla. At each site, eight randomly selected individuals growing at a minimum distance of 3 m apart were georeferenced using a GPS receiver and sampled for silica gel-dried leaves, open-pollinated seeds and > 10 g of soil next to the main root.
Soil samples were dried at 65°C for 24 h and sieved through a 1 mm mesh, and 10 g of each sample was mixed with 30 mL of 1 M ammonium acetate at pH 7. After 2 h of shaking, the suspension was filtered and diluted (between 1:100 and 1:1000) with distilled water prior to analyses. Cation exchange capacity was assessed quantitatively using inductively coupled plasma–optical emission spectrometry (ICP-OES) at the Platform of Analytical Chemistry of the University of Neuchâtel, Switzerland. Elemental concentrations of Ca2+, Mg2+, K+, Na+, Ni2+, Cu2+, Zn2+, Cr2+ and Al3+ were measured for each of the 64 samples.
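The calcium / magnesium ratio mentioned in the abstract can be derived from the per-sample concentrations in Sheet 1. The sketch below is illustrative only: the dictionary keys (`site`, `Ca`, `Mg`) and the values are hypothetical and do not reproduce the spreadsheet's column names or data. It computes a simple ratio of concentrations expressed in the same unit (ppm); a molar ratio would additionally require dividing each concentration by the element's atomic mass.

```python
from statistics import mean


def ca_mg_ratio(ca_ppm, mg_ppm):
    """Ca/Mg ratio from two concentrations given in the same unit (here ppm)."""
    if mg_ppm <= 0:
        raise ValueError("Mg concentration must be positive")
    return ca_ppm / mg_ppm


def mean_ratio_by_site(records):
    """Average the per-sample Ca/Mg ratio within each site."""
    by_site = {}
    for rec in records:
        ratio = ca_mg_ratio(rec["Ca"], rec["Mg"])
        by_site.setdefault(rec["site"], []).append(ratio)
    return {site: mean(values) for site, values in by_site.items()}


# Made-up values for illustration only; not taken from the dataset.
samples = [
    {"site": "S_site", "Ca": 400.0, "Mg": 1600.0},
    {"site": "S_site", "Ca": 300.0, "Mg": 1200.0},
    {"site": "NS_site", "Ca": 2400.0, "Mg": 300.0},
    {"site": "NS_site", "Ca": 1800.0, "Mg": 300.0},
]
site_means = mean_ratio_by_site(samples)
print(site_means)
```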
We used a hydroponic experiment to mimic the specific chemistry of serpentine vs. non-serpentine soils and compare the corresponding phenotypic responses of plants originating from serpentine vs. non-serpentine sites across the two ploidy levels. Each of the eight plants genotyped per population was represented in the hydroponic experiment by six randomly selected seeds. Half of the plants were exposed to low magnesium concentrations (Mg-, 0.5 mM, control setting) whereas the other half were exposed to high magnesium concentrations (Mg+, 5 mM, serpentine-mimicking setting). For details on the hydroponic system and the nutrient solutions used, see the original article. After eight weeks of hydroponic cultivation, plants were harvested and the following traits were measured: overall number of leaves, length and width of the largest leaf, chlorophyll content and specific leaf area (SLA). The harvested biomass (root systems and leaf rosettes separately) was dried at 80°C for 4 days and weighed. Relative chlorophyll content was obtained with a SPAD-502 meter (Konica Minolta Sensing Europe B.V.) by averaging three measurements on the fresh largest leaf. A 0.64 cm2 square punched out of the largest leaf was dried at 80°C and used to estimate the SLA. Additionally, two derived phenotypic characters were estimated: an approximate largest leaf size (i.e. length × width of the largest leaf) and root / shoot ratio (i.e. belowground biomass / aboveground biomass).
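The derived traits described above reduce to simple arithmetic, sketched here for a single plant. The function names, argument names and example measurements are hypothetical and are not taken from Sheet 2. SLA is computed under the usual definition of leaf area per unit dry mass, with the 0.64 cm2 punch area from the Methods; the units of the original measurements are assumed to be cm and g.

```python
PUNCH_AREA_CM2 = 0.64  # area of the square punched out of the largest leaf


def leaf_size(length_cm, width_cm):
    """Approximate largest leaf size as length x width."""
    return length_cm * width_cm


def root_shoot_ratio(root_dry_mass_g, shoot_dry_mass_g):
    """Belowground dry biomass divided by aboveground dry biomass."""
    if shoot_dry_mass_g <= 0:
        raise ValueError("shoot dry mass must be positive")
    return root_dry_mass_g / shoot_dry_mass_g


def specific_leaf_area(punch_dry_mass_g, area_cm2=PUNCH_AREA_CM2):
    """SLA as leaf area per unit dry mass (cm2 per g)."""
    if punch_dry_mass_g <= 0:
        raise ValueError("punch dry mass must be positive")
    return area_cm2 / punch_dry_mass_g


# Made-up measurements for one plant (illustration only).
size = leaf_size(8.0, 2.5)             # cm x cm
ratio = root_shoot_ratio(0.12, 0.48)   # g / g
sla = specific_leaf_area(0.004)        # cm2 per g
print(size, ratio, sla)
```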
ddRAD-seq libraries including the eight individuals sampled per population were prepared following a protocol adapted from Peterson et al. (2012) using restriction enzymes EcoRI and MseI. Subsequent ligation of uniquely indexed EcoRI adapters and library-specific MseI adapters, including four degenerate bases to identify PCR duplicates, enabled each sample to be distinguished. Library pools were size-selected, targeting a mean length of 550 bp with AMPure XP beads. DNA fragments containing both EcoRI and MseI adapters were amplified with 12 PCR cycles and the final libraries were obtained after cleaning of the PCR products. Libraries were sequenced as paired-end 2 × 250 bp reads on two Illumina NovaSeq 6000 lanes. Raw reads have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under accession number PRJEB48869 (https://www.ebi.ac.uk/ena/browser/view/PRJEB48869). The correspondence between samples and runs included in this study is provided in ‘ENA_samples_list.txt’. All tetraploid samples were included twice, as independent samples from the start of the procedure, to account for the greater depth required to accurately genotype tetraploids. The process_radtags program implemented in Stacks (Catchen et al. 2013) was used to demultiplex raw fastq reads and remove low-quality bases. PCR duplicates were filtered out with clone_filter. Trimmomatic (Bolger et al. 2014) was used to trim the first four low-quality bases and keep only reads with a minimum length of 100 bp.
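The final trimming step (remove the first four bases, then keep only reads of at least 100 bp) corresponds in Trimmomatic to its HEADCROP and MINLEN steps. The plain-Python sketch below only illustrates that logic on a single read; it is not the authors' command line, and the function name and the order of the two operations are assumptions.

```python
def headcrop_and_minlen(seq, qual, crop=4, min_len=100):
    """Drop the first `crop` bases, then discard reads shorter than `min_len`.

    Returns the trimmed (sequence, quality) pair, or None if the read is dropped.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings must have equal length")
    seq, qual = seq[crop:], qual[crop:]
    if len(seq) < min_len:
        return None
    return seq, qual


# Hypothetical reads: a full-length 250 bp read and a short 103 bp read.
long_read = headcrop_and_minlen("A" * 250, "I" * 250)
short_read = headcrop_and_minlen("A" * 103, "I" * 103)
print(len(long_read[0]), short_read)
```

With these defaults a 250 bp read is kept at 246 bp, whereas a 103 bp read falls to 99 bp after cropping and is discarded.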
The de novo reference catalog of RAD tags was generated based on reads from the 32 diploid samples using the dDocent pipeline (parameters: dDocent Cutoff1 = 5, dDocent Cutoff2 = 5, first clustering rate 80%, second clustering rate 80%; Puritz et al. 2014). Reads from all 64 samples were then mapped to the de novo reference catalog using BWA mem (Li 2013). SNPs were called with the Genome Analysis Toolkit ver. 4.1.0.0 (McKenna et al. 2010), without base recalibration. In a first step, samples were individually genotyped according to their ploidy by local re-assembly of haplotypes with the HaplotypeCaller tool. Single-sample GVCFs were then merged and jointly genotyped using the GenomicsDBImport and GenotypeGVCFs tools. The resulting VCF file was first filtered to keep only biallelic SNPs covered in at least 50% of samples with sufficient quality according to GATK best practice (Supporting information). To avoid filters relying on expectations of the Hardy–Weinberg equilibrium that differ among ploidy levels, positions with an overall depth higher than two standard deviations above the mean were further removed as putative paralogous loci. Individual genotypes were filtered for a minimal depth of three and genotype quality of 20. Filtered genotypes were set to no-call and only SNPs with a maximum of 10% missing data were kept. The dataset was further pruned using PLINK2 (Purcell et al. 2007) based on linkage disequilibrium (--indep-pairwise 1000 100 0.2) and only positions with a minor allele count (MAC) above eight and a minor allele frequency (MAF) above 0.05 were finally kept. The resulting filtered VCF including 1146 biallelic SNPs and the derived input for STRUCTURE, which was obtained by randomly selecting one SNP per RAD tag, are provided here.
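Two of the filtering steps above lend themselves to a short illustration: removing positions whose overall depth exceeds the mean by more than two standard deviations, and randomly keeping one SNP per RAD tag for STRUCTURE. The sketch below is not the authors' code. Sites are represented as hypothetical `(contig, position, depth)` tuples, the population standard deviation is used (the Methods do not say which estimator was applied), and the `seed` argument is added only to make the random draw repeatable.

```python
import random
from statistics import mean, pstdev


def depth_cutoff(depths, n_sd=2.0):
    """Upper depth limit: mean plus `n_sd` (population) standard deviations."""
    return mean(depths) + n_sd * pstdev(depths)


def drop_high_depth(sites, n_sd=2.0):
    """Remove sites whose overall depth is above the cutoff."""
    cutoff = depth_cutoff([depth for _, _, depth in sites], n_sd)
    return [site for site in sites if site[2] <= cutoff]


def one_snp_per_contig(sites, seed=None):
    """Randomly keep a single site for each contig (RAD tag)."""
    rng = random.Random(seed)
    by_contig = {}
    for site in sites:
        by_contig.setdefault(site[0], []).append(site)
    return [rng.choice(group) for group in by_contig.values()]


# Made-up sites: (contig, position, overall depth); one obvious depth outlier.
sites = [
    ("c1", 12, 100), ("c1", 40, 110), ("c1", 88, 90),
    ("c2", 5, 105), ("c2", 61, 95),
    ("c3", 9, 100), ("c3", 30, 100), ("c3", 77, 100), ("c3", 120, 1000),
    ("c4", 18, 100),
]
kept = drop_high_depth(sites)
thinned = one_snp_per_contig(kept, seed=1)
print(len(kept), len(thinned))
```

In this toy input the site with depth 1000 exceeds the cutoff and is removed, leaving nine sites on four contigs, so the thinned set contains four SNPs.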