Discordant phylogeographic patterns in ecologically similar sympatric sister species: Revisiting the null hypothesis of comparative phylogeography
Data files
Dec 22, 2025 version files 438.37 MB
- aligned_mitochondrial_sequences.tar.gz (19.86 KB)
- demographic_modeling.tar.gz (53.44 KB)
- genotyping_data_of_ddRADseq.tar.gz (208.31 MB)
- pseudoreference_sequence_of_Cgulosus.tar.gz (229.47 MB)
- raw_phylogenetic_trees.tar.gz (356.39 KB)
- README.md (24.31 KB)
- scripts.tar.gz (137.20 KB)
Abstract
Unravelling the complex factors underlying the current geographic patterns of biodiversity is a fundamental goal of biogeography and phylogeography. Comparative phylogeographic studies of co-distributed species have often addressed this issue by attributing observed differences in population structures to differences in interpretable traits between species. However, this approach implicitly relies on the largely untested assumption that species sharing similar ecological, spatial, and phylogenetic contexts should exhibit similar population structures. Herein, we revisited this null hypothesis of phylogeography by comparing the population structures of two sympatric sister species with high ecological similarity using high-throughput genomic data. With extensive sampling and advanced genomic analyses, we revealed the fine-scale population structure of Chaenogobius gulosus, enabling a direct comparison with the well-documented phylogeography of C. annularis. Our findings indicate that the origin of intraspecific lineages in C. gulosus is estimated to be four to five times younger than in C. annularis, with a distinct number of intraspecific lineages. These findings demonstrate that the phylogeographic origin can differ even between sympatric sister species with high ecological similarity, providing a counterexample to the null phylogeographic hypothesis. Our results highlight the role of subtle ecological differences or stochasticity in shaping geographical patterns of biodiversity.
https://doi.org/10.5061/dryad.t4b8gtjcb
Description of the data and file structure
The resources at this link comprise the scripts, genotyping data, and analysis results used to examine the phylogeographic structure of two closely related goby species, Chaenogobius gulosus and C. annularis. These resources are based on mitochondrial DNA and double digest restriction-site associated DNA sequencing (ddRAD-seq) data for both species, collected from multiple locations in Japan.
Brief description of the data and file structure
scripts.tar.gz
The scripts used in this study (bash, python, perl, R).
These scripts are organized into the following four categories, each with its own directory hierarchy.
01.mtDNA analysis
02.pseudoreference generation
03.ddRADseq genotyping
04.ddRADseq analysis
The detailed hierarchical structure is given below in the section "Detailed description of the file structure".
aligned_mitochondrial_sequences.tar.gz
A compressed directory containing the aligned FASTA files of the mitochondrial sequences used in this study.
pseudoreference_sequence_of_Cgulosus.tar.gz
A compressed directory containing the FASTA file for the pseudoreference sequence of Chaenogobius gulosus generated in this study. This sequence was created by incorporating the single-nucleotide variants detected in one whole genome sequence of C. gulosus into the reference for C. annularis (GCA_015082035.1).
The script used to generate this pseudoreference sequence is stored in "/02pseudoreference_generation/" in the scripts.tar.gz.
genotyping_data_of_ddRADseq.tar.gz
A compressed directory containing the genotyping data (19 VCF files) generated in this study.
The script used to generate these data is stored in "/03ddRADseq_genotyping/" in the scripts.tar.gz.
demographic_modeling.tar.gz
The compressed file of a directory containing the results of the demographic modeling (distribution of AIC for each model, maximum likelihood parameters for the best model, results of bootstrap) and the input site frequency spectrum.
The scripts used to perform the demographic modeling are stored in "/04ddRADseq_analysis/08demographic_modeling_by_fastsimcoal2" in the scripts.tar.gz.
raw_phylogenetic_trees.tar.gz
The compressed file of a directory containing the raw data of all phylogenetic trees generated in this study.
Some of the scripts used to generate these data are stored in "01mtDNA_analysis/02Bayesian_phylogenetic_tree_by_BEAST", and "04ddRADseq_analysis/06RAxML/01rooted_ML_tree_on_dataset6" in the scripts.tar.gz.
Detailed description of the file structure
"scripts.tar.gz"
01mtDNA_analysis/: Scripts used for mitochondrial DNA (mtDNA) analysis
- 01two_cluster_analysis_by_LINTRE/: Two cluster analysis by LINTRE to test the difference in branch length between species in the mitochondrial trees
- 02Bayesian_phylogenetic_tree_by_BEAST/: Bayesian phylogenetic analyses to provide time-calibrated trees
- 01input_preparation: Preparing the Nexus files that were used as inputs for xml generation in BEAUTi
- 02run_example_cytb_ND2: Executing a single run in BEAST (for cyt b & ND2 concatenated sequences). This study actually performed four independent runs with the same commands.
- 03logcombine_treeannotator_cytb_ND2: Summarizing the four independent runs of BEAST and providing one MCC tree (for cytb & ND2 concatenated sequences).
- 04run_example_cytb_only: Executing a single run in BEAST (for cyt b sequences). This study actually performed four independent runs with the same commands.
- 03logcombine_treeannotator_cytb_only: Summarizing the four independent runs of BEAST and providing one MCC tree (for cytb sequences).
02pseudoreference_generation/: Scripts used for generating the pseudoreference sequence for C. gulosus
- 01mapping/: Mapping of whole genome sequence data of C. gulosus with bwa-mem
- 02sambamba/: Filtering of the BAM file by sambamba (retaining only uniquely mapped reads)
- 03MarkDuplicate/: Removing PCR duplicates in the BAM file by MarkDuplicate (GATK4)
- 04SNPcall_by_GATK/: Genotyping with GATK4
- 01HaplotypeCaller/: Script for GATK HaplotypeCaller process
- 02GenomicsDBImport_GenotypeGVCFs/: Scripts for Genomics DB Import and Genotype GVCF processes
- 05filtering/: Filtering of VCF file with vcftools
- 06consensus/: Generating pseudoreference sequence for C. gulosus by incorporating single-nucleotide variants detected in C. gulosus sample to the original C. annularis reference genome (GCA_015082035.1)
03ddRADseq_genotyping/: Scripts used for genotyping double digest restriction-site associated DNA (ddRAD-seq) data
- 01fastp/: Read filtering with fastp
- 02genotyping_based_on_Cannularis_reference/: Genotyping with the original C. annularis reference genome
- 01mapping/: Mapping with bwa-mem
- 02sambamba/: Filtering of BAM files by sambamba (retaining only uniquely mapped reads)
- 03SNPcall_by_mpileup/: Genotyping with bcftools mpileup / bcftools call
- 01pop1_example/: Example script for joint calling of one target population. In this study, the same script was run independently for each population, and the outputs were used in the next /04merge_filtering/ step.
- 04merge_filtering/: Merging and filtering of VCF files to create the final genotype datasets
- 01dataset12/: Merging and filtering scripts for ddRADseq dataset 12
- 01filtering/: Filtering script for ddRADseq dataset 12
- 02dataset13/: Merging and filtering scripts for ddRADseq dataset 13
- 01filtering/: Filtering script for ddRADseq dataset 13
- 03dataset14_15/: Merging and filtering scripts for ddRADseq datasets 14 & 15
- 01filtering_dataset14/: Filtering script for ddRADseq dataset 14
- 02filtering_dataset15/: Filtering script for ddRADseq dataset 15
- 04dataset16_17/: Merging and filtering scripts for ddRADseq datasets 16 & 17
- 01filtering_dataset16/: Filtering script for ddRADseq dataset 16
- 02filtering_dataset17/: Filtering script for ddRADseq dataset 17
- 05dataset18/: Merging and filtering scripts for ddRADseq dataset 18
- 01filtering_dataset18/: Filtering script for ddRADseq dataset 18
- 06dataset19/: Merging and filtering scripts for ddRADseq dataset 19
- 01filtering_dataset19/: Filtering script for ddRADseq dataset 19
- 03genotyping_based_on_pseudoreference/: Genotyping with the pseudoreference genome for C. gulosus
- 01mapping/: Mapping with bwa-mem
- 02sambamba/: Filtering of BAM files by sambamba (retaining only uniquely mapped reads)
- 03SNPcall_by_mpileup/: Genotyping with bcftools mpileup / bcftools call
- 01pop1_example/: Example script for joint calling of one target population. In this study, the same script was run independently for each population, and the outputs were used in the next /04merge_filtering/ step.
- 04merge_filtering/: Merging and filtering of VCF files to create the final genotype datasets
- 01dataset1_2_3/: Merging and filtering scripts for ddRADseq datasets 1, 2, & 3
- 01filtering_dataset1_2/: Filtering script for ddRADseq datasets 1 & 2
- 02filtering_dataset3/: Filtering script for ddRADseq dataset 3
- 02dataset4/: Merging and filtering scripts for ddRADseq dataset 4
- 01filtering/: Filtering script for ddRADseq dataset 4
- 03dataset5/: Merging and filtering scripts for ddRADseq dataset 5
- 01filtering/: Filtering script for ddRADseq dataset 5
- 04dataset6/: Merging and filtering scripts for ddRADseq dataset 6
- 01filtering/: Filtering script for ddRADseq dataset 6
- 05dataset7/: Merging and filtering scripts for ddRADseq dataset 7
- 01filtering/: Filtering script for ddRADseq dataset 7
- 06dataset8_9/: Merging and filtering scripts for ddRADseq datasets 8 & 9
- 01filtering_dataset8/: Filtering script for ddRADseq dataset 8
- 01filtering_dataset9/: Filtering script for ddRADseq dataset 9
- 07dataset10/: Merging and filtering scripts for ddRADseq dataset 10
- 01filtering/: Filtering script for ddRADseq dataset 10
- 09dataset11/: Merging and filtering scripts for ddRADseq dataset 11
- 01filtering/: Filtering script for ddRADseq dataset 11
04ddRADseq_analysis/: Scripts used for ddRAD-seq analysis
- 01ADMIXTURE/: Clustering analysis with ADMIXTURE and plotting
- 02PCA/: Principal component analysis with plink and plotting
- 03diversity_indices/: Calculation of nucleotide diversity (π) and pairwise FST/dXY between populations
- 01dataset2/: Script for ddRAD-seq dataset 2 (π, FST, dXY)
- 02dataset3/: Script for ddRAD-seq dataset 3 (π, FST, dXY)
- 03Cannularis_published_data/: Script for published genotyping data of C. annularis (RAD_dataset2.vcf.gz in https://doi.org/10.5061/dryad.7wm37pw09) (only dXY)
- 04triangle_plot/: Scripts to calculate hybrid index and inter-population heterozygosity for hybrid populations and to produce a triangle plot based on dataset1
- 05hybrid_detection_by_Dsuite/: Scripts for allele sharing pattern analysis to detect hybridization between populations/lineages based on dataset 5
- 06RAxML/: Phylogenetic analysis with RAxML
- 01rooted_ML_on_dataset6/: Scripts for phylogenetic analysis on ddRAD-seq dataset6 (rooted)
- 07BayesAss/: Estimation of contemporary gene flow between populations with BayesAss
- 01Cgulosus_7pop_dataset7/: Script for seven populations of C. gulosus (ddRAD-seq dataset 7) around the Kyushu region
- 02Cannularis_7pop_dataset12/: Script for seven populations of C. annularis (ddRAD-seq dataset 12) around the Kyushu region
- 03Cannularis_7pop_reduced_dataset13/: Script for seven populations of C. annularis (reduced sample size; ddRAD-seq dataset 13) around the Kyushu region
- 04Cgulosus_2pop_dataset8/: Script for two populations of C. gulosus (ddRAD-seq dataset 8) around northern Japan
- 05Cannularis_2pop_dataset14/: Script for two populations of C. annularis (ddRAD-seq dataset 14) around northern Japan
- 06Cannularis_2pop_reduced_dataset16/: Script for two populations of C. annularis (reduced sample size; ddRAD-seq dataset 16) around northern Japan
- 08demographic_modeling_by_fastsimcoal2/: Demographic modeling analyses by fastsimcoal2
- 01block_bootstrap/: Common scripts to perform 100kb block bootstrap for the target VCF file. The output bootstrapped VCFs were used to calculate the confidence interval for each parameter after determining the best model in each modeling.
- 02scripts_to_run_fsc/: Common scripts to run fastsimcoal2
- 03CG_2pop_dataset9/: Modeling to quantify the level of historical gene flow between two populations of C. gulosus (ddRAD-seq dataset 9)
- 01easySFS/: Script to generate site frequency spectrum (SFS) from VCF with easySFS
- 02Examined_models_in_fastsimcoal2/: All .est and .tpl files describing the demography and parameter rules of the 13 examined models. The subdirectories for the individual models are not described further because each contains only the .est and .tpl files of the corresponding model (the same applies below).
- 04CA_2pop_dataset15/: Modeling to quantify the level of historical gene flow between two populations of C. annularis (ddRAD-seq dataset 15)
- 01easySFS/: Script to generate site frequency spectrum (SFS) from VCF with easySFS
- 02Examined_models_in_fastsimcoal2/: All .est and .tpl files describing the demography and parameter rules of the 13 examined models
- 05CA_2pop_reduced_dataset17/: Modeling to quantify the level of historical gene flow between two populations of C. annularis (reduced sample size; ddRAD-seq dataset 17)
- 01easySFS/: Script to generate site frequency spectrum (SFS) from VCF with easySFS
- 02Examined_models_in_fastsimcoal2/: All .est and .tpl files describing the demography and parameter rules of the 13 examined models
- 06CG_4lin_dataset10/: Modeling to estimate the demography of four intraspecific lineages in C. gulosus
- 01easySFS/: Script to generate site frequency spectrum (SFS) from VCF with easySFS
- 02Examined_models_in_fastsimcoal2/: All .est and .tpl files describing the demography and parameter rules of the 18 examined models.
- 09EcoEvolity/: Testing co-divergence among the multiple pairs of intraspecific divergence with EcoEvolity
- Example/: Example scripts to perform EcoEvolity
Hierarchical_structure_of_scripts.txt: Hierarchical structure of the above-mentioned directories and inside scripts
software_version_list.csv: A CSV file that lists the software, modules, and packages used in the scripts above, along with their versions. The first column is the software name, the second column is the version, the third column is the brief description of the purpose of use, and the fourth column is the remark. For items that do not apply, “not applicable” is written.
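The 100 kb block bootstrap mentioned under 08demographic_modeling_by_fastsimcoal2/01block_bootstrap/ can be illustrated with a short Python sketch. This is not the deposited script: the function name `block_bootstrap`, the `(chrom, pos, payload)` record layout, and the choice to draw as many blocks as were observed are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def block_bootstrap(records, block_size=100_000, seed=None):
    """Resample variant records by genomic block, with replacement.

    records: iterable of (chrom, pos, payload) tuples, where pos is a
    1-based position and payload is anything carried along with the site
    (for example the raw VCF line). This layout is an assumption.
    """
    blocks = defaultdict(list)
    for chrom, pos, payload in records:
        # Sites on the same chromosome within the same window share a block.
        blocks[(chrom, (pos - 1) // block_size)].append((chrom, pos, payload))

    keys = sorted(blocks)
    rng = random.Random(seed)
    # Draw as many blocks as were observed, sampling with replacement.
    sampled_keys = [rng.choice(keys) for _ in keys]

    resampled = []
    for key in sampled_keys:
        resampled.extend(blocks[key])
    return resampled
```

Resampling whole blocks rather than single sites keeps linked variants together, which is the usual motivation for a block bootstrap when confidence intervals are derived from site frequency spectra.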
"aligned_mitochondrial_sequences.tar.gz"
- cytb_ND2_outgroup_aligned.fasta: Aligned mitochondrial sequence data consisting of concatenated sequences (the region surrounding NADH dehydrogenase 2 and the partial cytochrome b gene). This file includes 153 samples of C. gulosus, 24 samples of C. annularis, and 2 outgroup species. Alignment was performed with MAFFT version 7.
- cytb_only_outgroup_aligned.fasta: Aligned mitochondrial sequence data for the partial cytochrome b gene. This file includes 154 samples of C. gulosus, 518 samples of C. annularis, and 2 outgroup species. Alignment was performed with MAFFT version 7.
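Because both files are alignments, every record should have the same length once gap characters are counted. A minimal sanity check along those lines is sketched below using only the Python standard library; the function names are illustrative and not part of the deposited scripts.

```python
def read_fasta(path):
    """Parse a FASTA file into a dict mapping header text to sequence.

    Records with identical headers would overwrite each other; this
    sketch assumes headers are unique.
    """
    records = {}
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                current = line[1:].strip()
                records[current] = []
            elif current is not None:
                records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def alignment_summary(records):
    """Return (number of sequences, set of distinct sequence lengths)."""
    lengths = {len(seq) for seq in records.values()}
    return len(records), lengths
```

For a valid alignment, `alignment_summary(read_fasta(path))` returns a set containing a single length, and the sequence count can be compared with the sample numbers listed above.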
"pseudoreference_sequence_of_Cgulosus.tar.gz"
- pseudo_C_gulosus.fa.gz: The compressed FASTA file for the pseudoreference sequence of Chaenogobius gulosus generated in this study. This sequence was generated by incorporating the single-nucleotide variants detected in one whole genome sequence of C. gulosus (accession: DRR489921) into the reference for C. annularis (GCA_015082035.1). Scripts to generate this pseudoreference sequence are provided at "./scripts/02pseudoreference_generation/" in scripts.tar.gz.
"genotyping_data_of_ddRADseq.tar.gz"
This directory contains 19 compressed VCF files for the ddRAD-seq data used in this study. The files dataset1.vcf.gz to dataset11.vcf.gz are based on mapping to the pseudoreference sequence of C. gulosus; dataset12.vcf.gz to dataset19.vcf.gz are based on mapping to the original C. annularis reference sequence. Scripts to generate these VCF files are provided at "./scripts/03ddRADseq_genotyping/" in scripts.tar.gz.
- dataset1.vcf.gz: ddRADseq dataset 1 for population structure analysis (PCA, ADMIXTURE), and neighbor network. Containing 21 populations of C. gulosus.
- dataset2.vcf.gz: ddRADseq dataset 2 for triangle plot. Containing 21 populations of C. gulosus.
- dataset3.vcf.gz: ddRADseq dataset 3 for calculation of diversity indices (π, FST, dXY). Containing 21 populations of C. gulosus.
- dataset4.vcf.gz: ddRADseq dataset 4 for calculation of diversity indices (π, FST, dXY). Containing 21 populations of C. gulosus and three populations of C. annularis.
- dataset5.vcf.gz: ddRADseq dataset 5 for allele sharing analysis to detect hybridization. Containing 21 populations of C. gulosus and one population of C. annularis.
- dataset6.vcf.gz: ddRADseq dataset 6 for rooted RAxML phylogeny. Containing 15 populations of C. gulosus and three populations of C. annularis.
- dataset7.vcf.gz: ddRADseq dataset 7 for BayesAss analysis of C. gulosus. Containing seven populations of C. gulosus around the Kyushu region.
- dataset8.vcf.gz: ddRADseq dataset 8 for BayesAss analysis of C. gulosus. Containing two populations of C. gulosus in northern Japan.
- dataset9.vcf.gz: ddRADseq dataset 9 for demographic modeling analysis to estimate the level of historical gene flow between populations in C. gulosus. Containing two populations of C. gulosus in northern Japan.
- dataset10.vcf.gz: ddRADseq dataset 10 for demographic modeling analysis to estimate the demography of four intraspecific lineages in C. gulosus. Containing two populations of C. gulosus in northern Japan.
- dataset11.vcf.gz: ddRADseq dataset 11 for EcoEvolity analysis to test co-divergence of intraspecific lineages between species. Containing two populations of C. gulosus (The Sea of Japan lineage and the East China Sea lineage). Used with dataset18.vcf.gz & dataset19.vcf.gz.
- dataset12.vcf.gz: ddRADseq dataset 12 for BayesAss analysis of C. annularis. Containing seven populations of C. annularis around the Kyushu region.
- dataset13.vcf.gz: ddRADseq dataset 13 for BayesAss analysis of C. annularis. Containing seven populations of C. annularis around the Kyushu region (reduced sample size).
- dataset14.vcf.gz: ddRADseq dataset 14 for BayesAss analysis of C. annularis. Containing two populations of C. annularis in northern Japan.
- dataset15.vcf.gz: ddRADseq dataset 15 for demographic modeling analysis to estimate the level of historical gene flow between populations in C. annularis. Containing two populations of C. annularis in northern Japan.
- dataset16.vcf.gz: ddRADseq dataset 16 for BayesAss analysis of C. annularis. Containing two populations of C. annularis in northern Japan (reduced sample size).
- dataset17.vcf.gz: ddRADseq dataset 17 for demographic modeling analysis to estimate the level of historical gene flow between populations in C. annularis. Containing two populations of C. annularis in northern Japan (reduced sample size).
- dataset18.vcf.gz: ddRADseq dataset 18 for EcoEvolity analysis to test co-divergence of intraspecific lineages between species. Containing two populations of C. annularis (The Sea of Japan lineage and the East China Sea lineage). Used with dataset11.vcf.gz & dataset19.vcf.gz.
- dataset19.vcf.gz: ddRADseq dataset 19 for EcoEvolity analysis to test co-divergence of intraspecific lineages between species. Containing two populations of C. annularis (The Sea of Japan lineage and the Pacific Ocean lineage). Used with dataset11.vcf.gz & dataset18.vcf.gz.
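To check which samples and how many variant records a given dataset contains, the header and body of a compressed VCF can be scanned directly. The sketch below uses only the Python standard library and relies on the standard VCF layout (meta lines starting with `##`, a `#CHROM` header line, sample names from the tenth column onward). The function name `summarize_vcf` is illustrative, and the sketch assumes the files are gzip- or bgzip-compressed.

```python
import gzip


def summarize_vcf(path):
    """Return (sample names, number of variant records) for a gzipped VCF."""
    samples = []
    n_records = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                continue  # meta-information lines
            if line.startswith("#CHROM"):
                # The first nine columns are fixed VCF fields; samples follow.
                samples = line.rstrip("\n").split("\t")[9:]
                continue
            if line.strip():
                n_records += 1
    return samples, n_records
```

For example, `summarize_vcf("dataset1.vcf.gz")` returns the sample list and site count for dataset 1, assuming the archive has been extracted into the working directory.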
"demographic_modeling.tar.gz"
- 01two_Cgulosus_populations/: Results regarding demographic modeling for two populations of C. gulosus based on ddRAD-seq dataset9. Examined models are provided at "./scripts/04ddRADseq_analysis/08demographic_modeling_by_fastsimcoal2/03CG_2pop_dataset9" in scripts.tar.gz.
- best_parameterset_in_best_model/: Best parameters in best model (model2a). Each parameter is described in "./scripts/04ddRADseq_analysis/08demographic_modeling_by_fastsimcoal2/03CG_2pop_dataset9/02Examined_models_in_fastsimcoal2/model2a/" in scripts.tar.gz.
- bootstrap_results/: Summarized results of 100 bootstrap replicates used to calculate confidence intervals for each parameter
- observed_MSFS/: Multidimensional site frequency spectrum (MSFS) used as an input file for the demographic modeling
- two_CGpop_AIC_distribution.csv: The AIC distributions for each model, with the model name in the first column and the AIC in the second column.
- 02two_Cannularis_populations/: Results regarding demographic modeling for two populations of C. annularis based on ddRAD-seq dataset15. Examined models are provided at "./scripts/04ddRADseq_analysis/08demographic_modeling_by_fastsimcoal2/04CA_2pop_dataset15" in scripts.tar.gz.
- best_parameterset_in_best_model/: Best parameters in best model (model4a). Each parameter is described in "./scripts/04ddRADseq_analysis/08demographic_modeling_by_fastsimcoal2/04CA_2pop_dataset15/02Examined_models_in_fastsimcoal2/model4a/" in scripts.tar.gz.
- bootstrap_results/: Summarized results of 100 bootstrap replicates used to calculate confidence intervals for each parameter
- observed_MSFS/: Multidimensional site frequency spectrum (MSFS) used as an input file for the demographic modeling
- two_CApop_AIC_distribution.csv: The AIC distributions for each model, with the model name in the first column and the AIC in the second column.
- 03two_Cannularis_populations_reduced_samples/: Results regarding demographic modeling for two populations of C. annularis based on ddRAD-seq dataset17. Examined models are provided at "./scripts/04ddRADseq_analysis/08demographic_modeling_by_fastsimcoal2/05CA_2pop_reduced_dataset17" in scripts.tar.gz.
- best_parameterset_in_best_model/: Best parameters in best model (model4c). Each parameter is described in "./scripts/04ddRADseq_analysis/08demographic_modeling_by_fastsimcoal2/05CA_2pop_reduced_dataset17/02Examined_models_in_fastsimcoal2/model4c/" in scripts.tar.gz.
- bootstrap_results/: Summarized results of 100 bootstrap replicates used to calculate confidence intervals for each parameter
- observed_MSFS/: Multidimensional site frequency spectrum (MSFS) used as an input file for the demographic modeling
- two_CApop_reduced_AIC_distribution.csv: The AIC distributions for each model, with the model name in the first column and the AIC in the second column.
- 04four_Cgulosus_lineages/: Results regarding demographic modeling for four lineages of C. gulosus based on ddRAD-seq dataset10. Examined models are provided at "./scripts/04ddRADseq_analysis/08demographic_modeling_by_fastsimcoal2/06CG_4lin_dataset10" in scripts.tar.gz.
- best_parameterset_in_best_model/: Best parameters in best model (model17). Each parameter is described in "./scripts/04ddRADseq_analysis/08demographic_modeling_by_fastsimcoal2/06CG_4lin_dataset10/02Examined_models_in_fastsimcoal2/model17/" in scripts.tar.gz.
- bootstrap_results/: Summarized results of 100 bootstrap replicates used to calculate confidence intervals for each parameter
- observed_MSFS/: Multidimensional site frequency spectrum (MSFS) used as an input file for the demographic modeling
- four_CGlin_AIC_distribution.csv: The AIC distributions for each model, with the model name in the first column and the AIC in the second column.
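The AIC distribution files described above (model name in the first column, AIC in the second) can be summarized with a few lines of Python. The sketch below keeps the lowest AIC per model and reports the difference from the overall best model. Two points are assumptions rather than documented facts: that a header row, if present, has a non-numeric second column, and that taking the minimum AIC per model is an appropriate summary. The function name `rank_models_by_aic` is illustrative.

```python
import csv
from collections import defaultdict


def rank_models_by_aic(csv_path):
    """Return [(model, lowest AIC, delta AIC)] sorted from best to worst."""
    aic_by_model = defaultdict(list)
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < 2:
                continue
            try:
                aic = float(row[1])
            except ValueError:
                continue  # assumed header row or non-numeric entry
            aic_by_model[row[0]].append(aic)

    # Assumption: summarize each model by its lowest AIC.
    best = {model: min(values) for model, values in aic_by_model.items()}
    if not best:
        return []
    overall = min(best.values())
    ranked = sorted(best.items(), key=lambda item: item[1])
    return [(model, aic, aic - overall) for model, aic in ranked]
```

The first entry of the returned list is the model with the lowest AIC, which can be compared with the best model named in the corresponding best_parameterset_in_best_model/ description.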
"raw_phylogenetic_trees.tar.gz"
- ML_tree_for_cytb_ND2_MEGAX.nwk: The maximum likelihood tree for mitochondrial DNA (cytb + ND2) constructed by MEGA X.
- ML_tree_for_cytb_only_MEGAX.nwk: The maximum likelihood tree for mitochondrial DNA (cytb only) constructed by MEGA X.
- Bayesian_MCC_tree_for_cytb_ND2_BEAST.trees: The maximum clade credibility (MCC) tree of the Bayesian phylogenetic analysis for mitochondrial DNA (cytb + ND2) constructed by BEAST.
- Bayesian_MCC_tree_for_cytb_only_BEAST.trees: The maximum clade credibility (MCC) tree of the Bayesian phylogenetic analysis for mitochondrial DNA (cytb only) constructed by BEAST.
- ML_tree_for_ddRADseq_RAxML.nwk: The maximum likelihood tree for nuclear DNA based on the ddRAD-seq dataset6 constructed by RAxML.
Code/software
To view the files provided:
- tar: Used to extract .tar archives
- gunzip: Used to decompress .gz files
- less: Used to view the content of text files
These commands are standard tools available in most Unix/Linux environments.
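For readers who prefer Python to the shell tools listed above, the archives can also be inspected and unpacked with the standard library. The sketch below is one way to do this; the helper names `list_members` and `extract_archive` are illustrative and not part of the deposited scripts.

```python
import tarfile
from pathlib import Path


def list_members(archive_path):
    """Return the member names stored in a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return archive.getnames()


def extract_archive(archive_path, out_dir):
    """Extract a .tar.gz archive into out_dir and return that directory."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        archive.extractall(out_dir)
    return out_dir
```

For example, `list_members("scripts.tar.gz")` shows the directory hierarchy before anything is written to disk, and `extract_archive("scripts.tar.gz", "extracted")` unpacks it into a directory whose name is an arbitrary choice.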
Software list with versions:
Please see "software_version_list.csv" in "scripts.tar.gz"
Access information
Sequence data used in this study are deposited in the DNA Data Bank of Japan (DDBJ) under the following accession numbers.
mitochondrial sequences:
AB684846–AB684973, AB755375–AB755401, AY525784, MH682217, LC535411–LC535795, LC535574–LC535581, LC848718–LC848725, LC849257–LC849490
ddRAD-seq reads:
DRR175830–DRR175845, DRR175892–DRR175923, DRR175940–DRR175955, DRR489922–DRR490004, DRR490027–DRR490032, DRR490073, DRR613613–DRR613699
whole genome sequence reads:
DRR489921
Please see Table S2 & Table S3 in the paper for details regarding these data.
We obtained mitochondrial sequence data (154 samples from 26 locations) and ddRAD-seq data (88 samples from 21 locations) for Chaenogobius gulosus, a coastal goby species inhabiting the Japanese Archipelago. For comparison, we also used the published data for C. annularis as necessary. Our analyses included generation of the pseudoreference sequence for C. gulosus, population genetic analysis, Bayesian demographic inference, and demographic modeling.
