Complex patterns of genetic population structure in the mouthbrooding marine catfish, Bagre marinus, in the Gulf of Mexico and U.S. Atlantic

Portnoy, David1; O'Leary, Shannon2; Fields, Andrew1; Hollenbeck, Christopher1; Grubbs, Dean3; Peterson, Cheston3; Gardiner, Jayne4; Adams, Douglas5; Falterman, Brett6; Drymon, Marcus7; Higgs, Jeremy8; Pulster, Erin9; Wiley, Tonya10; Murawski, Steven11

Published May 05, 2025 on Dryad. https://doi.org/10.5061/dryad.nvx0k6f0n

Data files

May 05, 2025 version files (28.82 MB total)

BMA_by_pop_all-outl_genepop.gen: 351.69 KB
BMA_by_pop_genepop.gen: 14.40 MB
BMA_by_pop_only-neutral_genepop.gen: 14.05 MB
README.md: 2.13 KB
SampleInfo.txt: 17.30 KB
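The three .gen files are named as Genepop files. As a minimal sketch, assuming they follow the standard Genepop layout (a title line, locus names, then "Pop" blocks of "sample , genotypes" lines with equal-width allele codes and 0 for missing), they could be read as shown here. The function name parse_genepop and the toy input are illustrative and do not come from the dataset.

```python
def parse_genepop(text):
    """Parse Genepop-formatted text into (title, loci, populations).

    Each population is a list of (sample_name, calls); each call is a
    tuple of two integer allele codes, or None when either code is 0.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title, body = lines[0], lines[1:]
    loci, pops = [], []
    i = 0
    # Locus names (one per line or comma-separated) run until the first "Pop".
    while i < len(body) and body[i].lower() != "pop":
        loci.extend(name.strip() for name in body[i].split(",") if name.strip())
        i += 1
    for ln in body[i:]:
        if ln.lower() == "pop":
            pops.append([])
            continue
        # The last comma separates the sample name from its genotypes.
        sample, _, genos = ln.rpartition(",")
        calls = []
        for g in genos.split():
            half = len(g) // 2
            a, b = int(g[:half]), int(g[half:])
            calls.append(None if a == 0 or b == 0 else (a, b))
        pops[-1].append((sample.strip(), calls))
    return title, loci, pops


# Toy input, not taken from the real files.
example = """Toy example
locA
locB
Pop
ind1 , 001002 003003
ind2 , 000000 001003
Pop
ind3 , 002002 001001
"""
title, loci, pops = parse_genepop(example)
```

For the actual files, the text passed in would be the full file contents, and the README.md should be checked for the allele coding used.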

Abstract

Patterns of genetic variation reflect interactions among microevolutionary forces that vary in strength with changing demography. For marine species, these patterns are often interpreted under the expectation that larval movement drives connectivity because most marine species exhibit broadcast spawning dispersal strategies. Here, patterns of variation within and among samples of the mouthbrooding gafftopsail catfish (Bagre marinus, Family Ariidae) captured in the U.S. Atlantic and throughout the Gulf of Mexico were analyzed using genomics to generate neutral and non-neutral SNP data sets. Because genomic resources are lacking for ariids, linkage disequilibrium network analysis was used to examine patterns of putatively adaptive variation. Finally, historical demographic parameters were estimated from site frequency spectra. The results show four differentiated groups, corresponding to the (1) U.S. Atlantic, and the (2) northeastern, (3) northwestern, and (4) southern Gulf of Mexico. Patterns of genetic variation for the neutral data resemble those of other fishes that use the same estuarine habitats as nurseries, regardless of the presence or absence of a dispersive larval phase, supporting the idea that adult/juvenile behavior and habitat are important predictors of contemporary patterns of genetic structure. The non-neutral data presented two contrasting signals of structure, one due to increases in diversity moving west to east and north to south, and another due to increased heterozygosity in the Atlantic. Demographic analysis suggested that a recently reduced long-term effective population size in the Atlantic is likely an important driver of patterns of genetic variation, consistent with a known reduction in population size potentially due to an epizootic.

Sampling and library prep

Fin clips were obtained from 382 mixed-age samples of gafftopsail catfish collected from nine geographic sampling locations (hereafter locations; Figure 1) from 2015 to 2018: one in the Atlantic in Indian River Lagoon, Florida and adjacent coastal waters (ATL) and eight in the Gulf. Locations in the Gulf were near Tampa Bay, Florida (FLGS), north of Tampa Bay, Florida (FLGN), near Mobile Bay, Alabama (MB), in Mississippi Sound, Mississippi (MISS), in Chandeleur Sound, Louisiana (CS), off Louisiana west of the Mississippi River (LA), in Corpus Christi Bay, Texas (CC), and in the Bay of Campeche, Mexico (CAMP). All locations were selected because they represent inshore habitats used by mouthbrooding males for parturition and by juveniles as nursery habitat, except CAMP, which was opportunistically sampled farther offshore. Sampling took place as part of surveys routinely conducted by state or academic entities, the latter following approved animal care protocols. All fin clips were preserved in 20% DMSO-0.25M EDTA-saturated NaCl buffer (Seutin et al., 1991) and stored at room temperature until time of extraction.

DNA was extracted using Mag-Bind Tissue DNA kits (Omega Bio-Tek, Norcross, GA), and 500-1000 ng of high-quality genomic DNA was used in a modified version of the ddRAD genomic library preparation method (Peterson et al., 2012). Briefly, genomic DNA was digested with two restriction endonucleases (EcoRI, MspI), and a barcoded adapter was ligated to EcoRI sites while a common adapter was ligated to MspI sites. Following adapter ligation, individuals were pooled by index and size-selected using a Pippin Prep size-selection system (Sage Science, Beverly, MA) to a standard size range (338 – 412 base pairs). Polymerase chain reaction (PCR) amplification of fragments was performed to incorporate the adapters necessary for annealing to an Illumina flow cell and index-specific identifiers. Index pools were then combined into libraries of approximately 150 individuals, each spread across the geographic range of sampling and including duplicate individuals (technical replicates). Three libraries were sequenced (paired-end), each on one lane of an Illumina HiSeq 4000 DNA sequencer at GeneWiz®, New Jersey, USA.

Genotyping

RAD sequences retrieved from each run were demultiplexed using process_radtags (Catchen et al., 2011), and quality trimming, reduced-representation reference assembly, read mapping, and SNP calling were performed using the dDocent pipeline (Puritz et al., 2014). The ten individuals with the highest number of reads were selected from each lane for de novo reduced-representation reference assembly, using the overlapping read (OL) assembly option in dDocent. The similarity threshold for clustering (c = 0.8), minimum within-individual coverage (K1 = 5), and minimum number of individuals a read must occur in to be included (K2 = 2) were chosen after comparing mapping statistics for ten individuals randomly chosen from each library and mapped using BWA (Li & Durbin, 2009) to references generated for c = 0.8, K1 = 2 – 10, and K2 = 1 – 10. The chosen values maximized the number of reads mapped as a proper pair and minimized the number of reads whose forward and reverse mates mapped to different contigs. The constructed reduced-representation reference encompassed a total of 10,874,990 base pairs across 37,872 fragments (mean 287 bp; mode 307 bp).
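The cutoff comparison described above amounts to a small grid search. The sketch below is illustrative only: the mapping counts are invented, the function name pick_cutoffs is hypothetical, and the ranking rule (most proper pairs, ties broken by fewest mates on different contigs) is an assumption, since the text does not say how the two criteria were weighed. The last line checks the reported mean fragment length from the stated totals.

```python
from itertools import product


def pick_cutoffs(stats):
    """Return the (K1, K2) pair with the best mapping statistics.

    stats maps (K1, K2) -> (proper_pairs, mates_on_different_contigs).
    Assumed ranking: most proper pairs first, then fewest discordant mates.
    """
    return max(stats, key=lambda k: (stats[k][0], -stats[k][1]))


# Full grid evaluated in the text: K1 = 2..10 and K2 = 1..10 at c = 0.8.
grid = list(product(range(2, 11), range(1, 11)))

# Invented counts for three of the grid points, for illustration only.
toy_stats = {
    (2, 1): (900_000, 40_000),
    (5, 2): (950_000, 12_000),
    (10, 10): (950_000, 30_000),
}
best = pick_cutoffs(toy_stats)

# Mean fragment length implied by the reported reference size.
mean_len = 10_874_990 / 37_872
```

With real data, toy_stats would hold one entry per grid point, filled from the per-reference mapping summaries.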

Reads were mapped to the reduced-representation reference using BWA (match = 1, mismatch penalty = 3, and gap penalty = 5; Li, 2013) and SNPs were called using freebayes (Garrison & Marth, 2012). The resulting data set was filtered to remove low-quality and artefactual SNPs, paralogs, and low-quality individuals using vcftools (Danecek et al., 2011) and custom scripts following O’Leary et al. (2018), allowing for the retention of SNPs with more than two alleles. Genotypes with quality < 20 and < 5 reads were coded as missing, retaining loci with quality > 20, genotype call rate > 90%, and mean depth 15 – 300. Loci were also filtered based on allelic balance (remove SNPs < 0.25 and > 0.75), mapping quality ratios (remove SNPs < 0.25 and > 1.75), strand balance (remove SNPs with > 100x more forward alternate reads than reverse alternate reads and > 100x more forward reverse reads than reverse alternate reads), paired status, depth/quality ratio (< 0.2), and excess heterozygosity (remove SNPs > 0.5 and that deviate significantly from the expectations of Hardy-Weinberg Equilibrium). Individuals with > 25% missing data were removed. Finally, rad_haplotyper (Willis et al., 2017) was used to merge SNPs on the same fragments into SNP-containing loci (hereafter microhaplotypes) by using a random sample of 20 reads per locus, recording all possible haplotypes, and then discarding haplotypes that are not possible given the SNPs present in the final data set. Loci are flagged as paralogs if too many haplotypes are called given the SNP genotypes, and genotyping error is flagged if an individual has too few haplotypes given the SNP genotypes. The resulting haplotyped data set was further filtered to remove loci haplotyped in < 90% of individuals, flagged as potential paralogs in > 4 individuals, or flagged as affected by genotyping error in > 10 individuals.
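Several of the thresholds above can be expressed as simple predicates. This is a sketch only, not the vcftools commands or custom scripts actually used: the function names are hypothetical, only a subset of the filters is shown, and treating a genotype as missing when either condition holds (quality < 20 or < 5 reads) is an assumption about how the stated rule was applied.

```python
def mask_genotype(genotype_quality, read_depth):
    """True if a genotype call should be recoded as missing.

    Assumption: failing either threshold is enough to mask the call.
    """
    return genotype_quality < 20 or read_depth < 5


def keep_locus(qual, call_rate, mean_depth, allele_balance):
    """Apply the locus-level thresholds stated in the text (subset only)."""
    return (
        qual > 20
        and call_rate > 0.90
        and 15 <= mean_depth <= 300
        and 0.25 <= allele_balance <= 0.75
    )


def keep_individual(calls):
    """Keep individuals with at most 25% missing calls (None = missing)."""
    missing = sum(call is None for call in calls) / len(calls)
    return missing <= 0.25


# Toy checks with invented values.
good_locus = keep_locus(qual=30, call_rate=0.95, mean_depth=40, allele_balance=0.5)
too_deep = keep_locus(qual=30, call_rate=0.95, mean_depth=400, allele_balance=0.5)
```

An unusually high mean depth, as in the second toy locus, is a common sign of collapsed paralogs, which is why the depth window has an upper bound as well as a lower one.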
Technical replicates were compared to assess genotyping error, and loci systematically affected by genotyping error or flagged as deviating significantly from the expectations of Hardy-Weinberg Equilibrium (HWE) in > 5 sites were removed.
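One way to compare technical replicates is to count loci at which two runs of the same individual disagree. The sketch below assumes genotypes are stored as unordered allele pairs keyed by locus name, with None for missing calls. The function name and the toy genotypes are illustrative, and the text does not specify how "systematically affected" loci were defined.

```python
def replicate_discordance(rep_a, rep_b):
    """Compare two replicate genotype sets for one individual.

    rep_a and rep_b map locus name -> (allele1, allele2) or None.
    Returns (mismatch_rate, sorted list of discordant loci), using only
    loci called in both replicates.
    """
    compared = mismatched = 0
    bad_loci = []
    for locus in rep_a.keys() & rep_b.keys():
        ga, gb = rep_a[locus], rep_b[locus]
        if ga is None or gb is None:
            continue  # skip loci missing in either replicate
        compared += 1
        if sorted(ga) != sorted(gb):  # allele order is not meaningful
            mismatched += 1
            bad_loci.append(locus)
    rate = mismatched / compared if compared else float("nan")
    return rate, sorted(bad_loci)


# Toy replicate pair with invented genotypes.
run_1 = {"L1": (1, 2), "L2": (3, 3), "L3": None, "L4": (1, 1)}
run_2 = {"L1": (2, 1), "L2": (3, 4), "L3": (1, 1), "L4": (1, 1)}
rate, discordant = replicate_discordance(run_1, run_2)
```

Tallying the discordant loci across all replicate pairs would show which loci fail repeatedly rather than once by chance.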
