Local climate adaptation and gene flow in the native range of two co-occurring fruit moths with contrasting invasiveness

Cite this dataset

Wei, Shu-Jun; Cao, Li-Jun (2021). Local climate adaptation and gene flow in the native range of two co-occurring fruit moths with contrasting invasiveness [Dataset]. Dryad.


Invasive species pose increasing threats to global biodiversity and ecosystems. While previous studies have characterized successful invaders by their ecological traits, characteristics related to evolutionary processes have rarely been investigated. Here we compared gene flow and local adaptation using demographic analyses and outlier tests in two co-occurring moth pests across their common native range in China, one of which (the peach fruit moth, Carposina sasakii) has maintained its native distribution, while the other (the oriental fruit moth, Grapholita molesta) has expanded its range globally during the past century. We found that both species showed a pattern of genetic differentiation and an evolutionary history consistent with a common southwestern origin and northward expansion in their native range. However, for the noninvasive species, genetic differentiation was closely aligned with the environment, and there was a relatively low level of gene flow, whereas in the invasive species, genetic differentiation was associated with geography. Genome scans indicated stronger patterns of climate-associated loci in the noninvasive species. While strong local adaptation and reduced gene flow across its native range may have decreased the invasiveness of C. sasakii, this requires further validation with additional comparisons of invasive and noninvasive species across their native ranges.


Sample collection and library construction

Larvae of the oriental fruit moth (OFM) were sampled from peach shoots at 19 geographical locations, while larvae of the peach fruit moth (PFM) were sampled from multiple hosts at 12 locations, both across their native range in China (Table S1). In total, 358 OFM and 220 PFM individuals were genotyped. Genotyped individuals from the same orchard were collected from different trees at least five meters apart.

Genomic DNA was extracted from a segment of each individual larva using the DNeasy Blood and Tissue Kit (Qiagen, Germany). ddRAD libraries were prepared following the protocol of Peterson et al. (2012). Briefly, 120 ng of extracted genomic DNA from each sample was digested with the restriction enzymes NlaIII and AciI (New England Biolabs, USA) and then ligated to adapters barcoded with a combinatorial index. Each population was identified by a 6-bp index and each sample by a unique 5-bp barcode. Uniquely barcoded samples were pooled into multiplexed libraries. Fragments of 350-450 bp were size-selected using a BluePippin with a 2% gel cassette (Sage Science, USA). The pooled libraries were enriched with 12 PCR amplification cycles and sequenced on an Illumina HiSeq 4000 platform to obtain 150-bp paired-end reads at BerryGenomics Company (Beijing, China).

SNP calling and data filtering

The raw sequencing reads were demultiplexed and trimmed using process_radtags within Stacks v2.3 (Catchen et al. 2013; Catchen et al. 2011). Low-quality reads (Phred score below 20) and reads with uncalled bases were removed. The remaining paired-end reads were aligned to the reference genome of each species (GenBank accessions: CP053148-CP053179 for PFM, CP053120-CP053147 for OFM) using Bowtie v2.3.5 (Langmead & Salzberg 2012). SNPs were called using the maximum likelihood statistical model implemented in the Stacks pipeline. Only loci present in all populations and in at least 75% of individuals per population were exported. SNPs were further filtered using the R package vcfR (Knaus & Grünwald 2017) and VCFtools v0.1.16 (Danecek et al. 2011) with the following criteria: SNPs with a sequencing depth lower than 4 or higher than 500 were removed; individuals and SNPs with a missing rate higher than 20% were removed; and only SNPs with a minor allele count of at least 2 were retained.
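The filtering criteria above can be illustrated with a minimal Python sketch. It is not the vcfR/VCFtools workflow used for this dataset. The function name `filter_snps`, the in-memory genotype and depth matrices, the genotype-level application of the depth limits, and the order of the steps are all assumptions made for illustration.

```python
MISSING = None  # marker for an absent or masked genotype call (assumption)

def filter_snps(genotypes, depths, min_dp=4, max_dp=500,
                max_missing=0.20, min_mac=2):
    """Hypothetical helper illustrating the stated thresholds.

    genotypes[i][j]: alternate-allele count (0, 1, 2) or None
                     for individual i at SNP j.
    depths[i][j]:    read depth of the same call.
    Returns (kept_individual_indices, kept_snp_indices).
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0]) if n_ind else 0

    # 1. Mask calls whose depth is below 4 or above 500
    #    (assumed here to apply per genotype call).
    masked = [[g if (g is not None and min_dp <= d <= max_dp) else MISSING
               for g, d in zip(g_row, d_row)]
              for g_row, d_row in zip(genotypes, depths)]

    # 2. Remove SNPs with a missing rate above 20%.
    inds = list(range(n_ind))
    snps = [j for j in range(n_snp)
            if sum(masked[i][j] is None for i in inds) / len(inds) <= max_missing]

    # 3. Remove individuals with a missing rate above 20% over the kept SNPs.
    inds = [i for i in inds
            if snps and
            sum(masked[i][j] is None for j in snps) / len(snps) <= max_missing]

    # 4. Retain SNPs with a minor allele count of at least 2.
    def minor_allele_count(j):
        calls = [masked[i][j] for i in inds if masked[i][j] is not None]
        alt = sum(calls)
        return min(alt, 2 * len(calls) - alt)

    snps = [j for j in snps if minor_allele_count(j) >= min_mac]
    return inds, snps
```

Because SNP-level and individual-level missingness depend on each other, applying the steps in a different order can give a slightly different result; the order shown is only one reasonable choice.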

Siblings within populations were identified by calculating Loiselle’s K using SPAGeDi (Goldberg & Waits 2010; Hardy & Vekemans 2002). For each putative full-sibling group (K > 0.1875), the individual with the lowest percentage of missing SNPs was kept, resulting in the raw dataset for subsequent analyses.
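The sibling-removal rule can be sketched as follows. This is an illustration only: the function name `drop_siblings`, the input dictionaries, and the way groups are formed (any chain of pairs with K above the threshold is treated as one group) are assumptions, since the description does not state how groups were delimited.

```python
def drop_siblings(kinship, missing_rate, threshold=0.1875):
    """Hypothetical helper: keep one individual per putative full-sib group.

    kinship:      {(ind_a, ind_b): K} for pairs within one population.
    missing_rate: {ind: fraction of missing SNPs}.
    Returns a sorted list of retained individuals.
    """
    # Union-find structure: each individual starts in its own group.
    parent = {ind: ind for ind in missing_rate}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Link every pair whose kinship coefficient exceeds the threshold.
    for (a, b), k in kinship.items():
        if k > threshold:
            parent[find(a)] = find(b)

    # Collect the members of each group.
    groups = {}
    for ind in missing_rate:
        groups.setdefault(find(ind), []).append(ind)

    # Keep the member with the least missing data (ties broken by name).
    return sorted(min(members, key=lambda m: (missing_rate[m], m))
                  for members in groups.values())
```

Individuals that are not linked to anyone form groups of size one and are always retained.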

Bottlenecks and selective sweeps can cause different linkage disequilibrium (LD) profiles in different populations. LD decay in each population was calculated using PopLDdecay with default parameters (Zhang et al. 2019). To reduce the effects of LD, the raw dataset (with minor allele frequency > 0.05) was thinned so that retained SNPs were at least 20,000 bp apart using VCFtools v0.1.16 (Danecek et al. 2011). SNPs potentially under selection (see the following sections) were then excluded to generate a neutral dataset for the analysis of population processes.
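A minimal Python sketch of distance-based thinning is given below. It is a greedy illustration of the idea, not the VCFtools implementation. The function name `thin_snps`, the tuple layout, and the treatment of a spacing of exactly 20,000 bp as acceptable are assumptions.

```python
def thin_snps(snps, min_maf=0.05, min_dist=20000):
    """Hypothetical helper: MAF filter followed by greedy thinning.

    snps: iterable of (chrom, pos, maf) tuples.
    Returns the retained tuples, sorted by chromosome and position.
    """
    kept = []
    last_pos = {}  # position of the last retained SNP per chromosome
    for chrom, pos, maf in sorted(snps):
        # Keep only SNPs with minor allele frequency > 0.05.
        if maf <= min_maf:
            continue
        # Skip SNPs closer than min_dist to the previous retained SNP.
        if chrom in last_pos and pos - last_pos[chrom] < min_dist:
            continue
        kept.append((chrom, pos, maf))
        last_pos[chrom] = pos
    return kept
```

The greedy pass always retains the first eligible SNP on each chromosome, so the result depends on where each chromosome's SNPs begin.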

Usage notes

File name                                   Content
Admixture.txt                               Commands for Admixture analysis
baypass.workflow.R                          R script for BayPass analysis
                                            VCF file of PFM
gm.ind306.4-200x.mac2.177951snp.vcf.zip     VCF file of OFM
gradient_forest_Scripts.R                   R script for Gradient Forest analysis
lfmm.R                                      R script for LFMM analysis
PCA_principle_component_analysis.R          Script for PCA analysis
plot_LDdecay_from_VCF.txt                   Script for LD decay analysis
RAD_quality_control.update-20200427.R       R script for quality control of SNPs
redundancy_analyses.R                       R script for redundancy analysis (RDA)


National Natural Science Foundation of China, Award: 32070464

Joint Laboratory of Pest Control Research Between China and Australia, Award: Z201100008320013

National Natural Science Foundation of China, Award: 31901884
