
Data from: Natal philopatry increases relatedness within groups of coral reef cardinalfish

Cite this dataset

Rueger, Theresa et al. (2020). Data from: Natal philopatry increases relatedness within groups of coral reef cardinalfish [Dataset]. Dryad.


A central issue in evolutionary ecology is how patterns of dispersal influence patterns of relatedness in populations. In terrestrial organisms, limited dispersal of offspring leads to groups of related individuals. In contrast, for most marine organisms, larval dispersal in open waters is thought to minimise kin associations within populations. However, recent molecular evidence and theoretical approaches have shown that limited dispersal, sibling cohesion, and/or differential reproductive success can lead to kin association and elevated relatedness. Here, we tested the hypothesis that limited dispersal explains small-scale patterns of relatedness in the pajama cardinalfish Sphaeramia nematoptera. We used 19 microsatellite markers to assess parentage of 233 juveniles and pairwise relatedness among 527 individuals from 41 groups in Kimbe Bay, Papua New Guinea. Our findings support three predictions of the limited dispersal hypothesis: 1) elevated relatedness within groups compared to among groups, and elevated relatedness within reefs compared to among reefs; 2) a weak negative correlation of relatedness with distance; 3) more juveniles than would be expected by chance in the same group and on the same reef as their parents. We provide the first example of natal philopatry at the group level causing small-scale patterns of genetic relatedness in a marine fish.


Study location and sample collection

The study was conducted on inshore reefs near Mahonia Na Dari Research and Conservation Centre, Kimbe Bay, Papua New Guinea (5°30’S, 150°05’E), from October 2012 to September 2014. A total of 41 social groups of pajama cardinalfish, Sphaeramia nematoptera, were comprehensively sampled from nearshore reefs and along a fringing reef. Sampling spanned two years and more than 500 hours of surveying, and covered the population as completely as the available means allowed.

A total of 527 S. nematoptera were caught using hand nets and diluted clove oil as a mild anaesthetic (Munday & Wilson 1997). Each fish was measured underwater (Standard Length SL) and a fin clip was taken from the caudal fin. Tissue samples were preserved in 99% ethanol for genetic analysis. All fish were categorized as either adult (≥38mm SL), subadult (33-37mm SL) or juvenile (<33mm SL), with categories assessed by gonad histology (Rueger et al. 2016).  

Genetic analyses and locus characteristics

Genomic DNA was extracted from ~2 mm² of fin tissue collected from each individual and screened at 23 microsatellite markers in four multiplexes (Rueger et al. 2015). DNA extractions were performed following procedures described in the Nucleospin-96 Tissue kit (Macherey-Nagel, Germany). Selected primer pairs were combined in a primer premix for in-reaction concentrations ranging from 0.02 to 0.06 μM, adjusted for even amplification. All four multiplex reactions were performed using the QIAGEN Microsatellite Type-it kit (QIAGEN, Germany) in a total volume of 10 μl containing 5 μl of QIAGEN Multiplex Master Mix (2x), 1 μl QIAGEN Q-solution, 1 μl of distilled water, 2 μl of primer premix, and 1 μl template DNA. PCR products were screened on an ABI 3730xl DNA Analyzer (Applied Biosystems) with the GeneScan 500 LIZ (Applied Biosystems) internal size standard following a 1:15 dilution. Individual genotypes were scored in GeneMapper v4.0 and unique alleles were distinguished using marker-specific binsets in MsatAllele (Alberto 2009).

Allele frequencies, linkage disequilibrium and deviation from Hardy-Weinberg equilibrium were estimated with Genepop (Raymond and Rousset 1995), and the data were checked for the presence of null alleles with Micro-Checker (van Oosterhout et al. 2004). Genotyping error was assessed using repeat samples from 43 individuals and calculated as the ratio of mismatched alleles to the number of replicated alleles (Pompanon et al. 2005). For further analysis, we used the 19 markers with the lowest genotyping error (<6%). Marker-specific summary statistics are provided in Supplemental Material Table 1.
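The error rate described above (mismatched alleles divided by replicated alleles) can be illustrated with a short Python sketch. This is not the authors' code: the function name, the nested-dictionary input layout and the example genotypes are all assumptions made for illustration. Mismatches are counted as alleles not shared between the two genotype calls, ignoring allele order.

```python
from collections import Counter


def per_locus_error_rate(original, repeat):
    """Allelic error rate per locus: mismatched alleles / replicated alleles.

    Both arguments map sample ID -> {locus: (allele1, allele2)}.
    This input layout is an assumption for illustration only.
    """
    mismatches = Counter()
    compared = Counter()
    for sample_id, loci in repeat.items():
        for locus, rep_alleles in loci.items():
            orig_alleles = original.get(sample_id, {}).get(locus)
            if orig_alleles is None:
                continue  # locus not genotyped in the first run
            # Alleles shared between the two calls, ignoring order
            shared = sum((Counter(orig_alleles) & Counter(rep_alleles)).values())
            mismatches[locus] += len(orig_alleles) - shared
            compared[locus] += len(orig_alleles)
    return {locus: mismatches[locus] / compared[locus] for locus in compared}


# Hypothetical genotypes for two fish at two loci
first_run = {
    "F1": {"LocA": (150, 154), "LocB": (200, 200)},
    "F2": {"LocA": (150, 150), "LocB": (200, 204)},
}
second_run = {
    "F1": {"LocA": (150, 154), "LocB": (200, 202)},
    "F2": {"LocA": (150, 150), "LocB": (204, 200)},
}
rates = per_locus_error_rate(first_run, second_run)
```

Loci whose rate exceeds a chosen threshold (6% in the text above) could then be filtered out of `rates`.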

Pairwise relatedness at different spatial scales

The relatedness of any two sampled individuals was assessed using the moment-based relatedness estimator described by Queller & Goodnight (1989), calculated in COANCESTRY (Wang 2011). To test accuracy, we first simulated 1000 individuals from the estimated allele frequencies at each locus. The rate of missing alleles was set to 0.01 for all loci and locus-specific genotyping error rates were used. The correlation between the Queller & Goodnight (1989) estimates and the true values was high (Pearson’s r = 0.941, p < 0.001).

Relationship between pairwise relatedness and distance

Distance was calculated using the Cartesian coordinates of each social group (Grant et al. 2006). To determine whether pairwise relatedness estimates change with distance, an analysis of variance was performed with mean dyad relatedness as the response variable and distance, categorized into five bins (0m, 0-200m, 200-400m, 400-600m, >600m), as the predictor variable. To account for possible positive autocorrelation between dyad samples, we randomly sampled 10 dyads from each bin without replacement and calculated the mean (Kraemer et al. 2016). We replicated each set 100 times. All statistical analyses were performed using R version 3.6.1 (R Core Team 2019).
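The binning and resampling step can be sketched as follows. The original analysis was run in R; this Python version is only an illustration, and the function names, the treatment of bin edges (a distance of exactly 200 m is placed in the 0-200m bin) and the optional seed are assumptions not stated in the text.

```python
import math
import random
from statistics import mean

BIN_LABELS = ["0m", "0-200m", "200-400m", "400-600m", ">600m"]


def euclidean(p, q):
    """Straight-line distance between two groups given as (x, y) in metres."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def distance_bin(d):
    """Assign a distance in metres to a bin (edge handling is an assumption)."""
    if d == 0:
        return "0m"
    if d <= 200:
        return "0-200m"
    if d <= 400:
        return "200-400m"
    if d <= 600:
        return "400-600m"
    return ">600m"


def resampled_bin_means(dyads, n_per_draw=10, n_replicates=100, seed=None):
    """For each bin, draw n_per_draw dyads without replacement and take the
    mean relatedness; repeat n_replicates times.

    dyads: iterable of (distance_m, relatedness) pairs.
    """
    rng = random.Random(seed)
    by_bin = {label: [] for label in BIN_LABELS}
    for d, r in dyads:
        by_bin[distance_bin(d)].append(r)
    means = {}
    for label, values in by_bin.items():
        if len(values) < n_per_draw:
            continue  # not enough dyads to draw without replacement
        means[label] = [mean(rng.sample(values, n_per_draw))
                        for _ in range(n_replicates)]
    return means
```

The resulting per-bin lists of means are the kind of values that would feed an analysis of variance with distance bin as the predictor.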

Natal philopatry 

To determine whether natal philopatry occurred at the scale of the group or reef, we used the same 19 microsatellites (Rueger et al. 2015, Appendix D) to match juveniles and subadults to potential parents. Parentage was assessed with COLONY, using the following parameters: full likelihood, medium likelihood precision, long run. Following Harrison et al. (2014), these parameters were shown to yield very high overall accuracy for this marker set in identifying true parent-offspring pairs (99.9%; type‐I error 0.1%, type‐II error 0%) (see Rueger et al. 2019 for details of the simulation).

Usage notes

This dataset consists of:

  • individual genotypes (tab 'Genotypes'). These data can be used for GenAlEx or Popgen analyses.
    • First row: number of markers, number of individuals, number of populations, number of individuals in population 1
    • Third row: individual ID, population ID, codominant marker IDs
  • individual natural history data in combination with the genotypes (tab 'Genotypes with fish info') which combines all available data
    • Plate/Well: refer to stored samples
    • Site: Group ID
    • SL: standard length
    • TL: total length
  • relatedness values for each dyad in the population (tab 'Relatedness COANCESTRY'). These data can be used to look at differences in relatedness within/between groups.
    • TrioEst, WEst, LLEst, ...: different relatedness estimates. See COANCESTRY documentation for details
  • parentage data for all juveniles sampled (tab 'Parentage COLONY'). These data show recruitment patterns.
    • MotherID/ FatherID: biological parents identified by COLONY
    • Cluster index/ probability: COLONY estimates
    • X.GD: natural history info for genetic father (geno dad)
    • X.GM: natural history info for genetic mother
    • X.Off: natural history info for offspring (the juvenile tested)
  • Supplemental material: summary of marker-specific estimates and GenBank accession numbers.
  • DistanceMeans: mean values calculated for 'Relationship between pairwise relatedness and distance' analysis (see Methods)
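As a rough illustration of the 'Genotypes' tab layout described above, the following Python sketch reads the counts from the first row and the column labels from the third row. It assumes the tab has been exported as tab-delimited text and that data rows start on the fourth row; the function name, the returned dictionary keys and the example text are hypothetical and not part of the dataset.

```python
import csv
import io


def read_genotype_header(text, delimiter="\t"):
    """Parse the header rows of a GenAlEx-style genotype table.

    Row 1: number of markers, individuals, populations, then population size(s).
    Row 3: individual ID, population ID, marker IDs.
    Assumes tab-delimited text with data rows from row 4 onward.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    counts = [int(cell) for cell in rows[0] if cell.strip()]
    # Marker labels follow the ID and population columns; blank cells are skipped
    marker_ids = [cell for cell in rows[2][2:] if cell.strip()]
    data_rows = [row for row in rows[3:] if row and row[0].strip()]
    return {
        "n_markers": counts[0],
        "n_individuals": counts[1],
        "n_populations": counts[2],
        "population_sizes": counts[3:],
        "marker_ids": marker_ids,
        "data_rows": data_rows,
    }


# Hypothetical miniature example with two markers and three fish
example = (
    "2\t3\t1\t3\n"
    "Example\t\t\t\t\t\n"
    "ID\tPop\tLocA\t\tLocB\t\n"
    "F1\tG1\t150\t154\t200\t200\n"
    "F2\tG1\t150\t150\t200\t204\n"
    "F3\tG1\t0\t0\t202\t204\n"
)
info = read_genotype_header(example)
```

Comparing `info["n_individuals"]` with `len(info["data_rows"])` gives a quick check that the export was not truncated.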


Australian Research Council, Award: CE140100020