
Contact zone of two different chloroplast lineages and genetic guidelines for seed transfer in Quercus serrata and Q. crispula

Cite this dataset

Onosato, Kataru et al. (2020). Contact zone of two different chloroplast lineages and genetic guidelines for seed transfer in Quercus serrata and Q. crispula [Dataset]. Dryad.


Within their natural distribution ranges, plant species exhibit genetic structure that has been created by global climate change and natural selection over long periods. To conserve local forests with different genetic structures, genetic guidelines for seed and seedling transfer in individual species are therefore necessary. Genetic guidelines have been published for 43 Japanese tree species using population genetic data; however, for practical use, more detailed genetic borders between important genetic lineages should be clarified to inform seed collection and planting. Thus, we investigated in detail the genetic borders within two important Japanese oak species, Quercus serrata and Q. crispula, in the Chubu region of Japan using chloroplast and nuclear DNA markers, and we discuss the factors that influenced border creation using the results of species distribution modelling (SDM). Two distinct cpDNA haplotypes (a northern and a southern haplotype) were found for each species within the Chubu region of Japan, but the difference in nuclear DNA between northern and southern haplotype populations was very small in both Q. serrata and Q. crispula. The results of SDM showed that during the Last Glacial Maximum (LGM), Q. serrata was distributed mostly along the coastline, whereas Q. crispula was distributed not only along the coast but also in mountainous areas further inland. The cpDNA genetic borders of these two oak species are complex and seem to have been influenced by topography and by the species' distributions during the LGM. We propose and discuss genetic guidelines for these two oak species based on the results of this study.


Chloroplast DNA analysis
DNA was extracted using a DNeasy Plant Mini Kit (Qiagen, Germany). To amplify the 3’to_rps2 fragment, which distinguishes between the northern and southern types of cpDNA, the following forward and reverse primers were used: 5’-GTCATATATTTGATCCCGCC-3’ and 5’-AACCGGAACTAGTCGGATG-3’, respectively (San Jose-Maldia et al. 2017). The PCR conditions and sequencing procedure followed San Jose-Maldia et al. (2017). The northern and southern haplotypes for the sequenced fragment are shared between both species. The PCR products were purified using ExoSAP-IT™ (GE Healthcare Limited) and sequenced in both directions using a BigDye Terminator Sequencing Kit (PE Biosystems). The sequencing products were then purified by ethanol precipitation and analyzed on a 3100 Genetic Analyzer (Applied Biosystems). Sequence data were assembled and manually edited using Sequencher 10.4.1 (Gene Codes Corporation).

MIG-seq analysis
Multiplexed inter-simple sequence repeat (ISSR) genotyping by sequencing (MIG-seq; Suyama & Matsuki 2015) was conducted to genotype selected samples. For Q. serrata, we used 36 samples with the northern haplotype and 36 with the southern haplotype; for Q. crispula, we used 59 samples with the northern haplotype and 37 with the southern haplotype. These samples were selected from meshes in which the two cpDNA haplotypes did not coexist in each species. The technical implementation of the MIG-seq analysis followed the protocol of Suyama & Matsuki (2015). Following this protocol, we constructed a MIG-seq library with two polymerase chain reactions (PCR) and sequenced it. The first PCR amplified ISSR regions from genomic DNA with a MIG-seq primer set. The first PCR product was used as the template for the second PCR, which added to the primary PCR products the complementary sequences that serve as an index for subsequent analyses. Fragments in the size range 300-800 bp were isolated using the magnetic bead method and quantified using real-time PCR. The DNA library was sequenced on an Illumina MiSeq platform.
The obtained sequence data were filtered using FASTX-Toolkit to remove low-quality and adapter sequences. The quality-filtered reads were used for SNP detection with the “ustacks”, “cstacks” and “sstacks” modules in Stacks v. 2.2, software that identifies loci (Catchen et al. 2011). SNPs were called using the “populations” module. The minimum number of populations in which a locus must be present (p) and the minimum proportion of individuals in a population required to have a locus (r) were set to 1 and 0.8, respectively. Thus, we retained only those loci that were present in ≥ 80% of the individuals of at least one population.
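The effect of the p and r thresholds can be illustrated with a minimal Python sketch. It is not the Stacks implementation: the function name `filter_loci`, the input layout and the toy sample names are assumptions made for illustration only, and the sketch decides only whether a locus is retained.

```python
from collections import defaultdict

def filter_loci(genotyped, pop_of, min_pops=1, min_frac=0.8):
    """Keep loci genotyped in >= min_frac of the individuals
    of at least min_pops populations (analogous to r and p).

    genotyped: dict mapping locus ID -> set of sample IDs with a call
    pop_of:    dict mapping sample ID -> population label
    """
    # Number of individuals per population.
    pop_sizes = defaultdict(int)
    for pop in pop_of.values():
        pop_sizes[pop] += 1

    kept = []
    for locus, samples in genotyped.items():
        # Count genotyped individuals per population for this locus.
        counts = defaultdict(int)
        for sample in samples:
            counts[pop_of[sample]] += 1
        # Populations in which the locus reaches the r threshold.
        passing = sum(
            1 for pop, size in pop_sizes.items()
            if counts[pop] / size >= min_frac
        )
        if passing >= min_pops:
            kept.append(locus)
    return kept

# Hypothetical toy data: two populations of five individuals each.
pop_of = {f"n{i}": "north" for i in range(1, 6)}
pop_of.update({f"s{i}": "south" for i in range(1, 6)})
genotyped = {
    "locus_a": {"n1", "n2", "n3", "n4", "s1"},              # 80% of north only
    "locus_b": {"n1", "n2", "n3", "s1", "s2", "s3"},        # 60% in each
    "locus_c": set(pop_of),                                 # all individuals
}

kept_p1 = filter_loci(genotyped, pop_of, min_pops=1, min_frac=0.8)
kept_p2 = filter_loci(genotyped, pop_of, min_pops=2, min_frac=0.8)
print(kept_p1, kept_p2)
```

With p = 1, a locus that is well covered in only one population (here `locus_a`) is still retained; raising p to 2 would require the 80% threshold to be met in both populations.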

Data analysis of MIG-seq output
The observed heterozygosity (Ho), expected heterozygosity (He), nucleotide diversity (π), inbreeding coefficient (FIS) and genetic differentiation (FST) were estimated using the populations module in Stacks (Catchen et al. 2013). Allelic richness (AR; El Mousadik & Petit 1996) was estimated using FSTAT 2.9.4 (Goudet 2003). We also carried out a Bayesian cluster analysis using STRUCTURE (version 2.3; Pritchard et al. 2000) to clarify the genetic structure of the two cpDNA groups in the Chubu region. We ran ten replicates for each K from 1 to 10, with a burn-in of 100,000 iterations followed by 100,000 Markov chain Monte Carlo (MCMC) iterations. To determine the most appropriate value of K, we plotted the log probability of the data, L(K), and applied the ΔK method of Evanno et al. (2005) using STRUCTURE HARVESTER (Earl & vonHoldt 2012). The output from STRUCTURE and STRUCTURE HARVESTER was then aligned using CLUMPP (Jakobsson & Rosenberg 2007) and displayed using DISTRUCT (Rosenberg 2004).
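As an illustration of the ΔK criterion, the following minimal Python sketch computes ΔK from replicate log-probability values. It assumes ΔK(K) = |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)), with L(K) taken as the mean over replicates; the function name `evanno_delta_k` and the toy values are hypothetical and are not results from this dataset.

```python
from statistics import mean, stdev

def evanno_delta_k(lnp):
    """Compute Delta K for each interior K.

    lnp: dict mapping K -> list of replicate log-probability values L(K)
    Returns a dict mapping K -> Delta K (undefined for the smallest and
    largest K, and wherever the replicate SD is zero).
    """
    ks = sorted(lnp)
    m = {k: mean(lnp[k]) for k in ks}    # mean L(K) over replicates
    sd = {k: stdev(lnp[k]) for k in ks}  # SD of L(K) over replicates
    delta = {}
    for k in ks:
        if (k - 1) in m and (k + 1) in m and sd[k] > 0:
            # Absolute second-order rate of change of mean L(K).
            second_diff = abs(m[k + 1] - 2 * m[k] + m[k - 1])
            delta[k] = second_diff / sd[k]
    return delta

# Hypothetical toy values (three replicates per K), for illustration only.
toy_lnp = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-890, -895, -885],
    4: [-885, -880, -890],
}
delta_k = evanno_delta_k(toy_lnp)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

Because ΔK depends on both neighbouring values of K, it cannot be evaluated at K = 1, which is one reason to inspect the L(K) plot alongside it.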