Population structure and demographic analyses of Acanthocybium solandri from the Indo-Pacific and Atlantic oceans

Cite this dataset

Thia, Joshua (2022). Population structure and demographic analyses of Acanthocybium solandri from the Indo-Pacific and Atlantic oceans [Dataset]. Dryad. https://doi.org/10.5061/dryad.dncjsxkz4

Abstract

This repository contains scripts, data and results for a population genomics study of the genetic structure and demography of wahoo, Acanthocybium solandri, published in Journal of Biogeography:

Haro-Bilbao et al. (2021) Global connections with some genomic differentiation occur between Indo-Pacific and Atlantic Ocean wahoo, a large circumtropical pelagic fish.

In this work, we generated population allele frequencies for wahoo sampled at 11 locations around the globe using a pooled ezRAD approach. Using thousands of genome-wide SNPs, we demonstrated a significant (but subtle) genetic divide between wahoo from the Indo-Pacific and those from the Atlantic. This genetic differentiation likely occurs against a background of high gene flow throughout the evolutionary history of wahoo, as we inferred from demographic analysis of select population pairs within and between oceanic regions.

Analyses contained in this repository are for: (1) Filtering pooled ezRAD allele counts (assembled with dDocent and imputed using poolne_estim); (2) Estimation of genetic differentiation among globally sampled wahoo populations; (3) Estimation of site frequency spectra from joint allele frequencies among select population pairs; (4) Inference of demographic parameters (using δaδi); and (5) Generation of demographic simulation summary statistics.

Most of the analyses are performed in R and can be run directly from within the repository directory. These include allele filtering, estimation of genetic differentiation, estimation of site frequency spectra, and generation of demographic summary statistics. Demographic inference using δaδi requires setup of a Unix environment: input data files and execution scripts are provided, but their implementation needs to be customised.

Methods

Allele frequency data were obtained through a pooled ezRAD approach. De novo assembly of RAD contigs and variant calling were performed using the dDocent pipeline. Population allele frequencies were imputed using poolne_estim. Additional quality filtering was performed in R.

Analysis of genetic differentiation was performed in R, including estimates of FST and AMOVA (analysis of molecular variance).
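As an illustration of what an FST estimate from population allele frequencies involves, the following is a minimal Python sketch of the textbook heterozygosity-based definition, FST = (HT - HS) / HT, for a pair of equally weighted populations. It is not the estimator implemented in the repository's R scripts, which may correct for pool size and sampling error; the function names and the toy frequencies are hypothetical.

```python
from typing import Sequence, Tuple


def heterozygosities(p1: float, p2: float) -> Tuple[float, float]:
    """Return (H_T, H_S) for one biallelic SNP in two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity of the combined (total) population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two populations.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return h_t, h_s


def locus_fst(p1: float, p2: float) -> float:
    """Per-locus FST = (H_T - H_S) / H_T; returns 0.0 for a monomorphic locus."""
    h_t, h_s = heterozygosities(p1, p2)
    return 0.0 if h_t == 0.0 else (h_t - h_s) / h_t


def multilocus_fst(freqs1: Sequence[float], freqs2: Sequence[float]) -> float:
    """Combine loci as a ratio of sums rather than a mean of per-locus ratios."""
    if len(freqs1) != len(freqs2):
        raise ValueError("Both populations need a frequency for every locus.")
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        h_t, h_s = heterozygosities(p1, p2)
        numerator += h_t - h_s
        denominator += h_t
    return 0.0 if denominator == 0.0 else numerator / denominator


# Toy frequencies (invented for illustration, not taken from the dataset).
pop_a = [0.10, 0.45, 0.80]
pop_b = [0.12, 0.50, 0.78]
print(multilocus_fst(pop_a, pop_b))
```

Summing numerators and denominators across loci before dividing keeps low-diversity SNPs from dominating the genome-wide value, which matters when differentiation is as subtle as reported here.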

Generation of site frequency spectra and summary of demographic analyses were performed in R.
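To make the idea of a joint site frequency spectrum concrete, here is a rough Python sketch that tallies SNPs into a two-population spectrum from a pair of allele frequency vectors. It assumes each frequency is converted to an allele count by rounding against a hypothetical sample size (n1, n2); the repository's R code may use a different projection, and the function name and inputs below are illustrative only.

```python
import math
from typing import List, Sequence


def joint_sfs(
    freqs1: Sequence[float],
    freqs2: Sequence[float],
    n1: int,
    n2: int,
) -> List[List[int]]:
    """Tally SNPs into an (n1 + 1) x (n2 + 1) joint site frequency spectrum.

    Cell [i][j] counts SNPs whose focal allele is present in i of n1
    chromosomes in population 1 and j of n2 chromosomes in population 2.
    Frequencies are converted to counts by rounding half up (an assumption).
    """
    if len(freqs1) != len(freqs2):
        raise ValueError("Both populations need a frequency for every SNP.")
    sfs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for p1, p2 in zip(freqs1, freqs2):
        if not (0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0):
            raise ValueError("Allele frequencies must lie in [0, 1].")
        i = int(math.floor(p1 * n1 + 0.5))
        j = int(math.floor(p2 * n2 + 0.5))
        sfs[i][j] += 1
    return sfs


# Toy input (invented for illustration): three SNPs, 10 chromosomes per population.
spectrum = joint_sfs([0.1, 0.5, 0.9], [0.2, 0.5, 1.0], n1=10, n2=10)
print(sum(sum(row) for row in spectrum))
```

Rounding expected counts ignores sampling variance, so this sketch only shows the shape of the data structure that demographic inference consumes.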

Demographic inference was performed using δaδi, originally on an HPC. 

Usage notes

All R code can be run from within the repository directory using the R project file, Wahoo_PROJ.Rproj.

Demographic analyses using δaδi must be run in a Unix environment. The scripts Wahoo_DADI_Demog_Models.py and Wahoo_DADI_Generic_Execute.py can be used to set up a pipeline for executing demographic simulations on a local system or on an HPC cluster.