Skip to main content

Genetic differentiation and demographic trajectory of the insular Formosan and Orii’s flying foxes

Cite this dataset

Lin, Kung-Ping et al. (2021). Genetic differentiation and demographic trajectory of the insular Formosan and Orii’s flying foxes [Dataset]. Dryad.


Insular flying foxes are keystone species in island ecosystems due to their critical roles in plant pollination and seed dispersal. These species are vulnerable to population decline because of their small populations and low reproductive rates. The Formosan flying fox (Pteropus dasymallus formosus) is one of the five subspecies of the Ryukyu flying fox. P. d. formosus has suffered from a severe decline and is currently recognized as a critically endangered population in Taiwan. On the contrary, the Orii’s flying fox (P. d. inopinatus) is a relatively stable population inhabiting Okinawa Island. Here, we applied a genomic approach called double digest restriction-site associated DNA sequencing to study these two subspecies for a total of seven individuals. We detected significant genetic structure between the two populations. Despite their contrasting contemporary population sizes, both populations harbor very low degrees of genetic diversity. We further inferred their demographic history based on the joint folded site frequency spectrum and revealed that both P. d. formosus and P. d. inopinatus had maintained small population sizes for a long period of time after their divergence. Recently, these populations experienced distinct trajectories of demographic changes. While P. d. formosus suffered from a drastic ~10-fold population decline not long ago, P. d. inopinatus underwent a ~4.5-fold population expansion. Our results suggest separate conservation management for the two populations—population recovery is urgently needed for P. d. formosus while long-term monitoring for adverse genetic effects should be considered for P. d. inopinatus.


Seven individuals of Pteropus dasymallus bats were collected. Among them, three are P. d. formosus from Taiwan (designated as 178, 617, and 690), while four are P. d. inopinatus from Okinawa Island (designated as R10, R36, R4, and R5). For the three P. d. formosus individuals, frozen muscle tissues were obtained from the Wildlife Cryobank of Taipei Zoo. For each of the four P. d. inopinatus individuals, skin samples were excised from the wing membranes using a biopsy punch (3 mm in diameter); eight punches were collected per individual. All biopsies were preserved in Allprotect Tissue Reagent (Qiagen) and stored at -20˚C. Genomic DNA was extracted from samples using the DNeasy Blood and Tissue kit (Qiagen) following the manufacturer’s instructions.

Double digest restriction-site associated DNA sequencing (ddRADSeq) method was applied to these samples. For a given individual genome, two restriction enzymes, EcoRI and MseI, were used to cleave the DNA. The cleaved DNA fragments were ligated with adaptors that contain barcodes and sequences complementary to the PCR primers. After these ligated fragments were amplified by PCR, fragments with lengths between 250 base pairs (bp) and 450 bp were retained and sent for paired-end sequencing. Sequencing was carried out using the Illumina HiSeqTM 2000 sequencing system in the core facility center of Genomics company in New Taipei City.

The demultiplexed paired-end sequence reads were processed with SOAPnuke2 to discard low-quality reads and unwanted sequences. First, reads with correct barcodes were retained exclusively and their barcodes were trimmed. Reads without correct restriction enzyme cleavage sites were subsequently removed from the dataset. Low-quality reads were also removed if the proportions of base pairs with Phred quality scores lower than 15 were >40% or the proportions of unidentified base pairs were >10%. Next, we removed the restriction enzyme cleavage sites from all remaining reads. The remaining forward reads were trimmed to 87 bp to accommodate the differences in sequence lengths caused by barcode trimmings and restriction enzyme cleavage site removals.

Finally, we assembled the clean reads using Stacks 2.2 with the following assembly parameters: m = 3, M = 1, n = 1. Secondary read incorporations were disabled and the deleveraging algorithm was enabled. Only sites that were sequenced in all individuals were assembled (r100) to prevent systematic biases on population genetics statistics including estimates of nucleotide diversity and Tajima's D. Since SNPs with excessive heterozygosity across all individuals can result from misaligning reads of paralogous or repetitive loci, we applied the Hardy-Weinberg equilibrium (HWE) test to exclude the assembled reads or SNPs that showed significant deviations from HWE (p < 0.05). Two types of heterozygosity filters were applied to the dataset to generate datasets for population genetics analyses that rely on accurate nucleotide diversity estimates or only genetically unlinked SNPs.

Usage notes

RAD loci from all individuals (Pd-pooled_RADloci.fa)
The assembled sequences of all RAD loci (191,263) in each individual are pooled in this FASTA file. These RAD loci have not been filtered by the HWE test but have passed through other quality control steps. Unique RAD loci are characterized by the serial number in the header line (CLocus_XX). Individual ID can also be found in the brackets at the end of each header. If the RAD locus is identified as homozygous with no variants, there will only be one sequence per individual. Otherwise, two alleles (labeled with Allele_0 and Allele_1 in headers) can be found with the same RAD loci and individual labels. Unrecognized nucleotides (N) are included in these sequences.
SNPs after heterozygosity filters (Pd-SNPs-hf*.map and Pd-SNPs-hf*.ped)
The SNPs used in the population genetics analyses in our study are provided in PLINK .map and .ped files. They were filtered with two types of HWE filters. The file suffix -hf implies that SNPs showing significant deviation from HWE were removed from this dataset, while the suffix -hf_loci implies that whole RAD loci were removed if any SNPs with deviation from HWE were present on them. The former dataset was used in analyses that require only SNPs (e.g. STRUCTURE, PCA), while the latter was used in analyses that need the estimates of nucleotide diversities (e.g. demographic inference). The numbers of SNPs provided in the file are 33,005 (Pd-SNPs-hf) and 32,935 (Pd-SNPs-hf_loci).
The lengths of RAD loci (Pd-loci_lengths.tsv)
This tsv file records the lengths of each unique RAD loci, with or without unidentified nucleotides (N). This file was used, along with the above SNP files, in estimating the nucleotide diversity in P. .d formosus and P. d. inopinatus. All loci have passed through the quality controls including the hf_loci filter. The lengths for a total of 188,588 loci are provided in this file.


Forestry Bureau, Council of Agriculture of Taiwan, Award: 107-9.1-SB-17(1)

Forestry Bureau, Council of Agriculture of Taiwan, Award: 108-9.1-SB-30

National Science and Technology Council, Award: MOST 107-2621-B-305-001

Forestry Bureau, Council of Agriculture of Taiwan, Award: 107-9.1-SB-17(1)