
Phylogeographic and demographic modelling analyses of the multiple origins of the rheophytic goldenrod Solidago yokusaiana

Cite this dataset

Kyan, Ryuuta et al. (2021). Phylogeographic and demographic modelling analyses of the multiple origins of the rheophytic goldenrod Solidago yokusaiana [Dataset]. Dryad.


Understanding adaptation mechanisms is important in evolutionary biology. Parallel adaptation provides good opportunities to investigate adaptive evolution. To confirm parallel adaptation, it is effective to examine whether the phenotypic similarity has one or multiple origins and to use demographic modelling to consider the gene flow between ecotypes. Solidago yokusaiana is a rheophyte endemic to the Japanese Archipelago that diverged from Solidago virgaurea. This study examined the parallel origins of S. yokusaiana by distinguishing between multiple and single origins and subsequent gene flow. The haplotypes of non-coding chloroplast DNA and genotypes at 14 nuclear simple sequence repeat (nSSR) loci and single nucleotide polymorphisms (SNPs) revealed by double-digest restriction-associated DNA sequencing (ddRADseq) were used for phylogeographic analysis; the SNPs were also used to model population demographics. Some chloroplast haplotypes were common to S. yokusaiana and its ancestor S. virgaurea. Also, the population genetic structures revealed by nSSR and SNPs did not correspond to the taxonomic species. The demographic modelling supported the multiple origins of S. yokusaiana in at least four districts and rejected a single origin with ongoing gene flow between the two species, implying that S. yokusaiana independently and repeatedly adapted to frequently flooded riversides.


A ddRADseq library (Peterson et al., 2012) was prepared for S. yokusaiana (87 individuals, 12 populations) and S. virgaurea (63 individuals, 14 populations) (Table S1) using a modification of Peterson’s protocol (Sakaguchi, Kimura, et al., 2018). The library was sequenced with 51-bp single-end reads in one lane of an Illumina HiSeq2000 (Illumina, San Diego, CA). The primer sequences used in this study are described in Sakaguchi et al. (2015).

Adapters, other Illumina-specific sequences, and low-quality regions (Phred quality scores <20) were trimmed using Trimmomatic (Bolger et al., 2014). De novo assembly was performed using Stacks (Catchen et al., 2011; Catchen et al., 2013) with a minimum rate of individuals (r) of 0.3 and a minimum stack depth (m) of 8. Using plink 1.90b6.21 (Purcell et al., 2007) with all other options left at their defaults, we filtered out SNPs with a minor allele count <2 (which excludes singletons), SNPs with a missing genotype rate (geno) >0.3, individuals with a missing genotype rate (mind) >0.4, and SNPs with significant deviation from Hardy–Weinberg equilibrium (P <0.00001); we then selected one SNP per read. To construct the observed minor allele site frequency spectrum (SFS), separately from the population genetic analyses, we removed individuals with a relatively high missing genotype rate and filtered out SNPs with a minor allele count <1, SNPs with a missing genotype rate (geno) >0.0, individuals with a missing genotype rate (mind) >1.0, and SNPs with significant deviation from Hardy–Weinberg equilibrium (P <0.01). These filters were applied to the raw data processed by Stacks, again using plink 1.90b6.21 with all other options left at their defaults, and one SNP per read was selected.
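The missingness and minor allele count thresholds of the first filtering pass can be illustrated with a small Python sketch. This is not the authors' pipeline (they used plink), and it omits the Hardy–Weinberg test and the one-SNP-per-read selection. The function names, the genotype matrix layout, and the order of the steps (individuals first, then SNPs) are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# Alternate-allele count per diploid genotype: 0, 1 or 2; None marks a missing call.
Genotype = Optional[int]


def missing_rate(calls: List[Genotype]) -> float:
    """Fraction of missing calls in a list of genotypes."""
    return sum(g is None for g in calls) / len(calls)


def minor_allele_count(calls: List[Genotype]) -> int:
    """Count of the rarer allele among the non-missing genotypes."""
    called = [g for g in calls if g is not None]
    alt = sum(called)
    ref = 2 * len(called) - alt
    return min(alt, ref)


def filter_genotypes(
    matrix: List[List[Genotype]],
    min_mac: int = 2,       # SNPs with minor allele count < 2 are removed
    max_geno: float = 0.3,  # SNPs with missing rate > 0.3 are removed
    max_mind: float = 0.4,  # individuals with missing rate > 0.4 are removed
) -> Tuple[List[int], List[int]]:
    """Return (kept SNP indices, kept individual indices).

    matrix[i][j] is the genotype of individual j at SNP i.
    """
    n_snps = len(matrix)
    n_ind = len(matrix[0]) if matrix else 0

    # Step 1 (assumed order): drop individuals with too many missing genotypes.
    keep_ind = [
        j for j in range(n_ind)
        if missing_rate([matrix[i][j] for i in range(n_snps)]) <= max_mind
    ]

    # Step 2: among the remaining individuals, drop SNPs that are too
    # incomplete or whose minor allele is too rare.
    keep_snp = []
    for i in range(n_snps):
        calls = [matrix[i][j] for j in keep_ind]
        if not calls or missing_rate(calls) > max_geno:
            continue
        if minor_allele_count(calls) < min_mac:
            continue
        keep_snp.append(i)
    return keep_snp, keep_ind
```

Passing `min_mac=1, max_geno=0.0, max_mind=1.0` would mimic the thresholds quoted for the SFS data set, under the same assumptions.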

Usage notes

2321SNP_26pop_98ind.vcf is the data set used for the phylogenetic analyses and contains 2321 SNPs from 98 samples in 26 populations (genotyping rate 68%).
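As a rough way to inspect the VCF file, the following Python sketch computes the overall genotyping rate, that is, the fraction of non-missing genotype calls. It assumes a tab-delimited VCF in which GT is the first FORMAT subfield and missing alleles are written as `.`. The function name `genotyping_rate` is hypothetical, and the result may differ in detail from the figure reported by plink.

```python
from typing import Iterable


def genotyping_rate(vcf_lines: Iterable[str]) -> float:
    """Fraction of sample genotypes that are fully called."""
    called = 0
    total = 0
    for line in vcf_lines:
        # Skip meta-information lines, the header line and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Columns 10 onwards hold the per-sample data.
        for sample in fields[9:]:
            gt = sample.split(":")[0]  # assumes GT is the first subfield
            total += 1
            if "." not in gt:
                called += 1
    return called / total if total else 0.0
```

A file can be passed directly as an iterable of lines, for example `with open("2321SNP_26pop_98ind.vcf") as fh: rate = genotyping_rate(fh)`.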

The six MSFS.obs files contain the minor allele site frequency spectra used for four-population demographic modelling with fastsimcoal2 (Excoffier et al., 2013) (Tohoku-Chubu: 425 SNPs, Tohoku-Chugoku: 577 SNPs, Tohoku-Okinawa: 476 SNPs, Chubu-Chugoku: 387 SNPs, Chubu-Okinawa: 347 SNPs, Chugoku-Okinawa: 511 SNPs). The file name prefixes are the combinations of districts in which the populations are located.
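To make the idea of a multi-population minor allele SFS concrete, the sketch below tallies, for each SNP, how many copies of the minor allele each population carries. It assumes complete diploid genotypes with no missing calls and does not reproduce the fastsimcoal2 file format or the authors' script. The function name, the input layout, and the tie rule (the alternate allele is treated as minor when both alleles are equally frequent) are simplifying assumptions.

```python
from collections import Counter
from typing import List


def minor_allele_sfs(genotypes: List[List[int]], pop_of: List[int]) -> Counter:
    """Tally SNPs by their per-population minor allele counts.

    genotypes[i][j] is the alternate-allele count (0, 1 or 2) of
    individual j at SNP i; pop_of[j] is that individual's population index.
    """
    n_pops = max(pop_of) + 1
    sfs: Counter = Counter()
    for calls in genotypes:
        alt_total = sum(calls)
        # There are 2 * len(calls) chromosomes, so the alternate allele is
        # minor when it covers at most half of them (ties go to alternate).
        alt_is_minor = alt_total <= len(calls)
        counts = [0] * n_pops
        for g, p in zip(calls, pop_of):
            counts[p] += g if alt_is_minor else 2 - g
        sfs[tuple(counts)] += 1
    return sfs
```

Each key of the returned counter is one cell of the joint spectrum, and its value is the number of SNPs falling in that cell.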

Sampling locations for all populations are shown in sampling_locations.kml.


Japan Society for the Promotion of Science, Award: 15K07186, 16H04827, 26271054, and 24657059