Close relatives in population samples: Evaluation of the consequences for genetic stock identification

Cite this dataset

Östergren, Johan; Palm, Stefan; Gilbey, John; Dannewitz, Johan (2020). Close relatives in population samples: Evaluation of the consequences for genetic stock identification [Dataset]. Dryad.


Determining the origin of individuals in mixed population samples is key in many ecological, conservation and management contexts. Genetic data can be analyzed using Genetic Stock Identification (GSI), where the origin of single individuals is determined using Individual Assignment (IA) and population proportions are estimated with Mixed Stock Analysis (MSA). In such analyses, allele frequencies in a reference baseline are required. Unknown individuals or mixture proportions are assigned to source populations based on the likelihood that their multilocus genotypes occur in a particular baseline sample. Representative sampling of populations included in a baseline is important when designing and performing GSI. Here we investigate the effects of family sampling on GSI, using both simulated and empirical genotypes for Atlantic salmon (Salmo salar). We show that non-representative sampling leading to inclusion of close relatives in a reference baseline may introduce bias in estimated proportions of contributing populations in a mixed sample, and may increase the number of incorrectly assigned individual fish. Simulated data further show that the induced bias increases with increasing family structure, but that it can be partly mitigated by increased baseline population sample sizes. Results from standard accuracy tests of GSI (using only a reference baseline and/or self-assignment) gave a false and elevated indication of the baseline power and accuracy to identify stock proportions and individuals. These findings suggest that family structure in baseline population samples should be quantified and its consequences evaluated before carrying out GSI.
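To illustrate the likelihood-based assignment described above, the following is a minimal sketch of frequency-based Individual Assignment. It is not the software or method used by the authors. It assumes Hardy-Weinberg proportions and independent loci, and the population names, allele labels, allele frequencies and the `floor` parameter are all hypothetical values chosen for illustration.

```python
import math

def genotype_log_likelihood(genotype, allele_freqs, floor=1e-3):
    """Log-likelihood of a diploid multilocus genotype in one baseline population.

    genotype: list of (allele1, allele2) tuples, one per locus.
    allele_freqs: list of dicts (allele -> baseline frequency), one per locus.
    floor: hypothetical minimum frequency, so that alleles unseen in the
           baseline do not give a likelihood of exactly zero.
    """
    total = 0.0
    for (a1, a2), freqs in zip(genotype, allele_freqs):
        p = max(freqs.get(a1, 0.0), floor)
        q = max(freqs.get(a2, 0.0), floor)
        # Hardy-Weinberg genotype probability: p^2 for homozygotes, 2pq otherwise.
        prob = p * p if a1 == a2 else 2 * p * q
        total += math.log(prob)
    return total

def assign(genotype, baseline):
    """Return the baseline population with the highest log-likelihood, plus all scores."""
    scores = {pop: genotype_log_likelihood(genotype, freqs)
              for pop, freqs in baseline.items()}
    return max(scores, key=scores.get), scores

# Hypothetical two-locus baseline for two populations (illustrative numbers only).
baseline = {
    "river_A": [{"120": 0.7, "124": 0.3}, {"200": 0.6, "204": 0.4}],
    "river_B": [{"120": 0.2, "124": 0.8}, {"200": 0.1, "204": 0.9}],
}
fish = [("120", "120"), ("200", "204")]

best, scores = assign(fish, baseline)
print(best, scores)
```

Because the baseline allele frequencies are estimated from sampled individuals, a baseline sample dominated by a few families would shift these frequencies, which is one way to picture how the bias described above could arise.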


Tissue samples were collected from Atlantic salmon and consisted of fin clips from hatcheries, stored individually in labeled tubes with 95% ethanol. DNA was extracted, followed by PCR and genotyping of 17 polymorphic microsatellite markers (on average c. 10 alleles/locus).

Usage notes

Among the 1870 individuals in the original empirical baseline population samples, 96.5 % had complete genotypes at all 17 microsatellites; one individual had missing data at three loci, five at two loci and 60 at one locus, resulting in overall 0.23 % missing genotypes. Repeat genotyping of a sub-set of individuals resulted in a repeatability of 100 %, and hence an estimated error rate of zero.
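The completeness figures above can be reproduced with a few lines of arithmetic. The sketch below uses only the counts stated in these notes; the variable names are illustrative.

```python
n_individuals = 1870
n_loci = 17

# Number of missing loci -> number of individuals with that many missing loci.
missing_counts = {3: 1, 2: 5, 1: 60}

incomplete = sum(missing_counts.values())
missing_genotypes = sum(k * n for k, n in missing_counts.items())

complete_pct = 100 * (n_individuals - incomplete) / n_individuals
missing_pct = 100 * missing_genotypes / (n_individuals * n_loci)

print(f"Individuals with complete genotypes: {complete_pct:.1f} %")  # 96.5 %
print(f"Missing single-locus genotypes: {missing_pct:.2f} %")        # 0.23 %
```

That is, 66 individuals have at least one missing locus, and 73 of the 31,790 single-locus genotypes are missing.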


Swedish Research Council Formas, Award: 2013-1288

Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning, Award: 2013-1288

Swedish Agency for Marine and Water Management