Genomic data resolve long-standing uncertainty by distinguishing white marlin (Kajikia albida) and striped marlin (K. audax) as separate species
Data files
Jun 27, 2023 version; files: 54.97 MB
Abstract
Large pelagic fishes are often broadly and continuously distributed and capable of long-distance movements. These factors can promote gene flow that makes it difficult to disentangle intra- vs. inter-specific levels of genetic differentiation. Here, we assess the relationship of two istiophorid billfishes, white marlin (Kajikia albida) and striped marlin (K. audax), presently considered sister species inhabiting separate ocean basins. Previous studies report levels of genetic differentiation between white marlin and striped marlin that are smaller than those observed among populations of other istiophorid species. To determine whether white marlin and striped marlin comprise separate species or populations of a single globally distributed species, we surveyed 2520 single nucleotide polymorphisms (SNPs) in 62 white marlin and 242 striped marlin sampled across the Atlantic, Pacific, and Indian oceans. Multivariate analyses resolved white marlin and striped marlin as distinct groups, and a species tree composed of separate lineages was strongly supported over a single-lineage tree. Genetic differentiation between white marlin and striped marlin (FST = 0.5384) was also substantially larger than between populations of striped marlin (FST = 0.0192–0.0840), and we identified SNPs that allow unambiguous species identification. Our findings indicate that white marlin and striped marlin comprise separate species, which we estimate diverged approximately 2.38 Mya.
Methods
This dataset comprises raw genotypes for genome-wide SNPs produced by Diversity Arrays Technology (https://www.diversityarrays.com/) using DArTseq methodology (Sansaloni et al. 2011 BMC Proceedings). These data have not undergone any additional quality filtering. Genotypes in this dataset correspond to n = 242 striped marlin (Kajikia audax) sampled across the Pacific and Indian oceans and n = 62 white marlin (K. albida) sampled across the Atlantic Ocean. The data were produced by digestion with the PstI and SphI restriction enzymes, followed by 77-bp single-end high-throughput sequencing on an Illumina HiSeq 2500.
Usage notes
Provided are the following data files:
- Mamoozadeh_etal_raw_DArTseq_genotypes.csv – A CSV-formatted file containing raw DArTseq genotypes for striped marlin and white marlin as provided by Diversity Arrays Technology.
- Mamoozadeh_etal_raw_DArTseq_genotypes_metadata.csv – A CSV-formatted file containing metadata for the striped marlin and white marlin individuals in the DArTseq genotypes file. These metadata include the sample ID shown in the genotypes file, the geographic region where the individual was sampled, and the genetically distinct population to which the individual was assigned by Mamoozadeh et al. 2020 (Evolutionary Applications).
- Mamoozadeh_etal_raw_DArTseq_genotypes_fixed_alleles.csv – A CSV-formatted file containing SNP information for n = 71 loci exhibiting fixed allelic differences between striped marlin and white marlin. This information is derived from the raw data provided by Diversity Arrays Technology.
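
For readers who want to explore what a "fixed allelic difference" means in practice, the following is a minimal Python sketch, not necessarily the procedure used to produce the fixed-alleles file. It assumes genotypes have already been read into a nested dictionary (locus, then sample, then call) and that calls use a one-character coding in which "0" and "1" are the two homozygous states, "2" is heterozygous, and "-" is missing. That coding, the function name, and all sample and locus names are illustrative assumptions; check the header of the genotypes file for the actual layout before adapting this.

```python
from typing import Dict, List

# Assumed genotype coding (not stated in the dataset description):
# two homozygous states, one heterozygous state, one missing-data symbol.
HOM_REF, HOM_ALT, HET, MISSING = "0", "1", "2", "-"


def fixed_difference_loci(
    genotypes: Dict[str, Dict[str, str]],
    species: Dict[str, str],
    group_a: str,
    group_b: str,
) -> List[str]:
    """Return loci where group_a and group_b are fixed for alternative alleles."""
    fixed = []
    for locus, calls in genotypes.items():
        # Collect the distinct non-missing calls observed in each group.
        seen = {group_a: set(), group_b: set()}
        for sample, call in calls.items():
            label = species.get(sample)
            if label in seen and call != MISSING:
                seen[label].add(call)
        a, b = seen[group_a], seen[group_b]
        # Fixed difference: each group shows exactly one homozygous state,
        # and the two groups show opposite states.
        if (a, b) in (({HOM_REF}, {HOM_ALT}), ({HOM_ALT}, {HOM_REF})):
            fixed.append(locus)
    return fixed


# Toy data with hypothetical sample and locus names.
species = {
    "WHM_01": "white", "WHM_02": "white",
    "STM_01": "striped", "STM_02": "striped",
}
genotypes = {
    "locus_1": {"WHM_01": "0", "WHM_02": "0", "STM_01": "1", "STM_02": "1"},  # fixed
    "locus_2": {"WHM_01": "0", "WHM_02": "2", "STM_01": "1", "STM_02": "1"},  # heterozygote in white
    "locus_3": {"WHM_01": "1", "WHM_02": "-", "STM_01": "0", "STM_02": "0"},  # fixed, one call missing
    "locus_4": {"WHM_01": "0", "WHM_02": "0", "STM_01": "0", "STM_02": "0"},  # monomorphic
}
print(fixed_difference_loci(genotypes, species, "white", "striped"))
```

In this sketch, missing calls are ignored rather than treated as evidence against fixation, so a locus with sparse data in one group can still be reported. Because the dataset is unfiltered, applying a call-rate threshold per locus before this step may be worth considering.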