Population structure and connectivity among coastal and freshwater Kelp Gull (Larus dominicanus) populations from Patagonia
Data files
Mar 13, 2024 version files (2.59 GB)
- Kelp_assembly.tar.gz (2.59 GB)
- README.md (3.38 KB)
Abstract
The genetic identification of significant evolutionary units and information on their connectivity can be used to design effective management and conservation plans. Despite having high dispersal capacity, several seabird species show population structure due to both abiotic and biotic barriers to gene flow. The Kelp Gull is the most abundant species of gull in the southern hemisphere. In Argentina, it reproduces in both marine and freshwater environments, with more than 100,000 pairs following a metapopulation dynamic across 140 colonies along the Atlantic coast of Patagonia. However, little is known about the demography and connectivity of inland populations. We aim to provide information on the connectivity of the largest freshwater colonies (those from Nahuel Huapi Lake) with the closest Pacific and Atlantic populations to evaluate whether these freshwater colonies are being subsidized by the larger coastal populations. We sampled three geographic regions (Nahuel Huapi Lake and the Atlantic and Pacific coasts) and employed a reduced-representation genomic approach to genotype individuals for single-nucleotide polymorphisms (SNPs). We found, using clustering and phylogenetic analyses, that there are three genetic groups, each corresponding to one of our sampled regions. Individuals from marine environments are more closely related to each other than to those from Nahuel Huapi Lake, indicating that the latter population constitutes the first freshwater Kelp Gull colony to be identified as a significant evolutionary unit in Patagonia.
README: Description of the files in Kelp_assembly.tar.gz
There are a total of 133 files in the folder. These belong to a ddRAD assembly generated with the software Stacks version 2.3e. The files can be divided into the categories shown below.
Aligned Sequences
There are 117 files derived from sequences from different individuals (one file per individual) aligned to the Herring Gull reference genome (GCA_013400295.1_ASM1340029v1_genomic.fna) as indicated in the paper. Each file is named with a number and a series of letters. The letters indicate both the species (starting with GC for the common Spanish name of the study species, "Gaviota Cocinera") and the geographic origin of the sample. The geographic origin is indicated first with letters denoting the specific locality and subsequently with the general region as follows:
Pacific coast (CP)
- Maiquillahue (VA)
- Isla Conejo (IC)
- Puñihuil (PU)
Nahuel Huapi Lake (LNH)
- Islets Gaviotero 1 and Gaviotero 2 (G1, G2)
- Isla el Roble (RO)
Atlantic coast (CA)
- Isla de Los Pájaros (IP)
- Punta León (PL)
Finally, the numbers give each individual a unique identifier (as there are several individuals sampled in the same geographic region).
This is an example of how each file is named:
- Kelp_assembly/100GCG1LNH.bam
- Kelp_assembly/101GCG1LNH.bam
- Kelp_assembly/102GCG1LNH.bam
- ...
- Kelp_assembly/98GCG1LNH.bam
- Kelp_assembly/99GCG1LNH.bam
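The naming scheme above can be decoded programmatically. The following is a minimal sketch, assuming every BAM file follows the pattern number + GC + locality code + region code exactly as in the examples. The names `parse_sample_name` and `Sample` are illustrative and are not part of the dataset.

```python
import re
from pathlib import PurePosixPath
from typing import NamedTuple

# Locality codes grouped by region, as listed in this README.
LOCALITIES_BY_REGION = {
    "CP": ("VA", "IC", "PU"),
    "LNH": ("G1", "G2", "RO"),
    "CA": ("IP", "PL"),
}

# Assumed pattern: <number>GC<locality><region>, e.g. 100GCG1LNH.
_NAME_RE = re.compile(r"^(\d+)GC([A-Z0-9]{2})(CP|LNH|CA)$")


class Sample(NamedTuple):
    individual: int  # unique identifier of the individual
    locality: str    # specific locality code, e.g. "G1"
    region: str      # general region code, e.g. "LNH"


def parse_sample_name(path: str) -> Sample:
    """Split a BAM file name into individual number, locality and region."""
    stem = PurePosixPath(path).stem  # drop the folder and the .bam suffix
    match = _NAME_RE.match(stem)
    if match is None:
        raise ValueError(f"unexpected sample name: {stem!r}")
    number, locality, region = match.groups()
    # Reject locality codes that do not belong to the stated region.
    if locality not in LOCALITIES_BY_REGION[region]:
        raise ValueError(f"locality {locality!r} is not listed under {region!r}")
    return Sample(int(number), locality, region)
```

For example, `parse_sample_name("Kelp_assembly/100GCG1LNH.bam")` yields individual 100, locality G1 and region LNH.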
STACKS Catalog and Associated Statistics
The following four files were generated by the gstacks module of the Stacks software. The catalog of RAD loci is generated from the collection of individuals aligned to the reference genome. The populations module of the software can be used to export different subsets of the dataset (e.g., after filtering).
- Kelp_assembly/catalog.calls
- Kelp_assembly/catalog.fa.gz
- Kelp_assembly/gstacks.log
- Kelp_assembly/gstacks.log.distribs
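The populations module accepts a population map: a tab-separated text file with one sample name and its population label per line. As a sketch, such a map could be rebuilt from the BAM file names. Numbering LNH, CA and CP as populations 1, 2 and 3 mirrors the labels used in the pairwise output files described in this README, but the map actually used by the authors is not included here, so this is an assumption. `build_popmap` and `region_of` are hypothetical helpers.

```python
from pathlib import PurePosixPath

# Assumed numbering, taken from the pairwise output file descriptions
# in this README: LNH = 1, CA = 2, CP = 3.
POP_ID = {"LNH": "1", "CA": "2", "CP": "3"}


def region_of(sample: str) -> str:
    """Return the region code that ends a sample name."""
    for code in POP_ID:
        if sample.endswith(code):
            return code
    raise ValueError(f"no known region code at the end of {sample!r}")


def build_popmap(bam_paths) -> str:
    """Build popmap text: one 'sample<TAB>population' line per BAM file."""
    lines = []
    for path in bam_paths:
        sample = PurePosixPath(path).stem  # file name without .bam
        lines.append(f"{sample}\t{POP_ID[region_of(sample)]}\n")
    return "".join(lines)
```

The returned text can be written to a file and passed to the populations module when exporting a filtered subset of the catalog.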
Population Module Output Files from STACKS
The following 12 files are derived from the ddRAD assembly (see details next to each file).
- Kelp_assembly/Kelp.structure #Genotypes for each individual in Structure format.
- Kelp_assembly/populations.fst_1-2.tsv #F-statistics between LNH (pop 1) and CA (pop 2).
- Kelp_assembly/populations.fst_1-3.tsv #F-statistics between LNH (pop 1) and CP (pop 3).
- Kelp_assembly/populations.fst_3-2.tsv #F-statistics between CA (pop 2) and CP (pop 3).
- Kelp_assembly/populations.fst_summary.tsv #Summary of F-statistics for the three populations.
- Kelp_assembly/populations.haps.radpainter #Haplotypes exported in RADpainter format for the software fineRADstructure.
- Kelp_assembly/populations.haps.vcf #Haplotypes exported in VCF format.
- Kelp_assembly/populations.phistats_1-2.tsv #Phi-statistics derived from haplotype frequencies (populations numbered as for F-statistics).
- Kelp_assembly/populations.phistats_1-3.tsv #Phi-statistics derived from haplotype frequencies (populations numbered as for F-statistics).
- Kelp_assembly/populations.phistats_3-2.tsv #Phi-statistics derived from haplotype frequencies (populations numbered as for F-statistics).
- Kelp_assembly/populations.phistats_summary.tsv #Summary of Phi-statistics for the three populations.
- Kelp_assembly/populations.sumstats_summary.tsv #Other per-population statistics (e.g., FIS, heterozygosity).
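The pairwise file names encode which two populations are compared. A minimal sketch of decoding them, using the numbering stated above (1 = LNH, 2 = CA, 3 = CP), is shown here. The function name `describe_pairwise_file` is illustrative only.

```python
import re

# Population numbering as stated in the file descriptions above.
POPULATIONS = {1: "LNH", 2: "CA", 3: "CP"}

# Matches e.g. populations.fst_1-2.tsv or populations.phistats_3-2.tsv.
_PAIR_RE = re.compile(r"populations\.(fst|phistats)_(\d+)-(\d+)\.tsv$")


def describe_pairwise_file(path: str):
    """Return (statistic, first region, second region) for a pairwise file."""
    match = _PAIR_RE.search(path)
    if match is None:
        raise ValueError(f"not a pairwise statistics file: {path!r}")
    statistic, first, second = match.groups()
    return statistic, POPULATIONS[int(first)], POPULATIONS[int(second)]
```

Summary files such as populations.fst_summary.tsv do not match the pattern and raise a `ValueError`, which keeps them from being mistaken for a pairwise comparison.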
Methods
These ddRAD data were generated following the methods described in this paper:
Thrasher DJ, Butcher BG, Campagna L, Webster MS, Lovette IJ. Double‐digest RAD sequencing outperforms microsatellite loci at assigning paternity and estimating relatedness: A proof of concept in a highly promiscuous bird. Mol Ecol Res. 2018; 18(5): 953-965.