Data for: Comparative evaluation of phenotypic, pedigree, and family-based selection in insect breeding using stochastic simulation

Hansen, Laura Skrubbeltrang¹; Bouwman, Aniek C.²; Sahana, Goutam¹; Slagboom, Margot¹; Nielsen, Hanne Marie¹; Ellen, Esther E.²

Published Feb 08, 2025 on Dryad. https://doi.org/10.5061/dryad.h18931zx7

Data files

Feb 08, 2025 version files (4.96 GB in total)

data.zip (4.96 GB)

README.md (3.28 KB)

Abstract

Selective breeding in insects has predominantly relied on phenotypic selection without considering relatedness. Selection on estimated breeding values could potentially increase genetic gain, but the challenge of pedigree tracking complicates this. Family selection can be used as an alternative to individual selection, either using combined between- and within-family selection, or strict between-family selection with full-sib group records as a proxy for individual data. The effectiveness of family selection can however be compromised by the presence of unmitigated common environment effects. In this study, we employ stochastic simulations to explore expected genetic gain and rate of inbreeding in insect populations under four single-trait selection schemes: phenotypic selection, individual pedigree selection, combined selection using both family and individual breeding values for selection, and between-family selection using full-sib average phenotypes for breeding value estimation. These schemes are compared on genetic gain and rate of inbreeding across five trait heritabilities (0.05, 0.1, 0.2, 0.4 and 0.6), two variations in number of families in the population (60 or 200), and two offspring group structures for the family breeding schemes (1 or 3 sib groups per female) with a fixed common environment effect. Selection based on individual breeding values results in significantly higher genetic gain than phenotypic selection at low heritability (≤0.1), and similar gain at heritability >0.1. Phenotypic selection results in a lower rate of inbreeding (0.003-0.011) compared to other schemes (0.005-0.055) at low heritability (≤0.1), but this difference is reduced as heritability increases. Combined selection results in genetic gain between that of the phenotypic and individual pedigree schemes, depending on sib group structure and heritability. Using between-family selection reduces genetic gain (0.23-1.97) compared to other schemes (0.40-4.34). 
Establishing multiple sib-groups mitigates the confounding of genetic and common environment effects, and thus the reduction in genetic gain from family selection schemes. Increasing the number of families from 60 to 200 in the breeding population reduces inbreeding in all scenarios (ΔF is 0.009-0.055 at 60 families and 0.003-0.031 at 200 families). We conclude that selection on individual breeding values yields greater genetic gain than selection on family breeding values or on phenotypes. The between-family approach is an alternative when individual pedigrees are not feasible to maintain. Phenotypic selection results in both high genetic gain and generally low rates of inbreeding, but as heritability increases, so does the rate of inbreeding. Therefore, phenotypic selection should not be implemented without any inbreeding control in long-term selection.


Description of the data and file structure

All data is simulated.

Files and variables

Directory and file naming structure

4 directories:

CombSel: directory for combined selection scenarios

FamSel: directory for family selection scenarios

IndPedSel: directory for individual pedigree selection scenarios

PhenSel: directory for phenotypic selection scenarios

Each directory has two folders:

60_families: scenarios where the population size is 60 families

200_families: scenarios where the population size is 200 families

FamSel and CombSel have two extra directories in each of the 60_families and 200_families folders:

1_group: scenarios where offspring are not split in multiple sib-groups

3_groups: scenarios where offspring are split in three sib-groups
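The directory layout described above can be enumerated with a short Python sketch. The root folder name `data` is an assumption (the name the archive unpacks to is not stated here), and `leaf_folders` is a hypothetical helper, not part of the dataset.

```python
from pathlib import PurePosixPath

SCENARIOS = ["CombSel", "FamSel", "IndPedSel", "PhenSel"]
SIZES = ["60_families", "200_families"]
# Only the family-based schemes have an extra sib-group level.
GROUPED_SCENARIOS = {"CombSel", "FamSel"}
GROUPS = ["1_group", "3_groups"]


def leaf_folders(root="data"):
    """Return every lowest-level folder implied by the described layout.

    `root` is an assumed name for the unpacked archive.
    """
    root = PurePosixPath(root)
    leaves = []
    for scenario in SCENARIOS:
        for size in SIZES:
            base = root / scenario / size
            if scenario in GROUPED_SCENARIOS:
                leaves.extend(base / group for group in GROUPS)
            else:
                leaves.append(base)
    return leaves


folders = leaf_folders()
for folder in folders:
    print(folder)
```

Under this reading there are 12 lowest-level folders: 8 for CombSel and FamSel, and 4 for IndPedSel and PhenSel.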

Each lowest folder level contains 5 .txt files with the following name structure:

scenario+size_groups_replicates_total_data_heritability

scenario = CombSel, FamSel, IndPedSel, PhenSel

size = small (60 families), large (200 families)

groups = either nothing, 1G (1 group), 3G (3 groups)

replicates = Number of replicates of the scenario (always 20)

heritability = h2005 (0.05), h201 (0.1), h202 (0.2), h204 (0.4), h206 (0.6)
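The name components can be translated into scenario settings with lookup tables, as in the Python sketch below. It reads h201 as 0.1, consistent with the five heritabilities listed in the abstract. The function `describe_scenario` is a hypothetical helper. It takes the components separately rather than parsing a full file name, because the exact separators in the file names are not confirmed here.

```python
# Lookup tables for the file-name components described above.
HERITABILITY_CODES = {
    "h2005": 0.05,
    "h201": 0.1,  # read as 0.1, matching the heritabilities in the abstract
    "h202": 0.2,
    "h204": 0.4,
    "h206": 0.6,
}
SIZE_CODES = {"small": 60, "large": 200}
# An empty groups component means the scenario has no sib-group level.
GROUP_CODES = {"": None, "1G": 1, "3G": 3}
REPLICATES = 20  # stated as always 20


def describe_scenario(scenario, size, groups, heritability):
    """Translate name components into the settings they stand for."""
    return {
        "scenario": scenario,
        "families": SIZE_CODES[size],
        "sib_groups": GROUP_CODES[groups],
        "replicates": REPLICATES,
        "heritability": HERITABILITY_CODES[heritability],
    }


print(describe_scenario("FamSel", "small", "3G", "h2005"))
print(describe_scenario("PhenSel", "large", "", "h206"))
```

An unknown component raises a `KeyError`, which makes a misread file name visible instead of silently producing a wrong setting.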

File structure and variables

Each file contains 27 variables and 1,081,800 observations.

There are 15 generations of selection replicated 20 times in each data file.

Two traits are simulated but only one is part of the selection scheme (trait 1).

id: identification of individual

sire: identification of father

dam: identification of mother

sex: male (m) or female (f)

generation: generation 1-15

group: sib-group label, consisting of the id of the female followed by A, B or C

at1: additive genetic effect for trait 1 (the selection trait)

at2: additive genetic effect for trait 2 (correlated trait, not included in the study; the genetic correlation with trait 1 is -0.3 and the heritability of trait 2 is 0.4)

ct1: common environment effect for trait 1 (only in CombSel and FamSel)

ct2: common environment effect for trait 2 (only in CombSel and FamSel)

et1: residual effect for trait 1

et2: residual effect for trait 2

TBV_t1: deviation from trait mean for trait 1 (at1 + ct1 + et1)

TBV_t2: deviation from trait mean for trait 2 (at2 + ct2 + et2)

t1: trait value for trait 1

t2: trait value for trait 2

F: inbreeding coefficient

selected: selected as parent of next generation. 0 = no, 1 = yes.

indBV: always NA

mean_t1: family average value for trait 1

f: number of females in the family

m: number of males in the family

EBV_t1: estimated breeding value for trait 1

candM: minimum number of males in the family. Either 2 (the family has 2 or more males after filtering) or NA (the family has fewer than 2 males after filtering)

candF: minimum number of females in the family. Either 4 (the family has 4 or more females after filtering) or NA (the family has fewer than 4 females after filtering)

candidate: whether family is a candidate family or not, based on filtering criteria. Y = Yes, N = No.

rep: replicate (1-20)
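As an illustration of how the variables above might be used, the following Python sketch computes the mean additive genetic effect (at1) and mean inbreeding coefficient (F) per replicate and generation, and derives a rate of inbreeding with the standard formula ΔF = (F_t - F_{t-1}) / (1 - F_{t-1}). Several things here are assumptions: the tab delimiter, the made-up sample rows (which contain only a subset of the 27 columns), and the helper names `summarise` and `rate_of_inbreeding`. The definitions used in the study itself may differ.

```python
import csv
import io
from collections import defaultdict


def summarise(rows):
    """Return {(rep, generation): (mean at1, mean F)} from dict-like rows.

    Assumes at1 and F are numeric in every row (no NA values).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for row in rows:
        key = (int(row["rep"]), int(row["generation"]))
        acc = sums[key]
        acc[0] += float(row["at1"])
        acc[1] += float(row["F"])
        acc[2] += 1
    return {key: (a / n, f / n) for key, (a, f, n) in sorted(sums.items())}


def rate_of_inbreeding(f_prev, f_curr):
    """Standard per-generation rate: (F_t - F_{t-1}) / (1 - F_{t-1})."""
    return (f_curr - f_prev) / (1.0 - f_prev)


# Made-up rows for illustration only; the delimiter is an assumption.
SAMPLE = (
    "id\tgeneration\tat1\tF\trep\n"
    "1\t1\t0.0\t0.00\t1\n"
    "2\t1\t0.2\t0.00\t1\n"
    "3\t2\t0.4\t0.01\t1\n"
    "4\t2\t0.6\t0.03\t1\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
summary = summarise(reader)

gain = summary[(1, 2)][0] - summary[(1, 1)][0]  # change in mean at1
delta_f = rate_of_inbreeding(summary[(1, 1)][1], summary[(1, 2)][1])
print(summary)
print(gain, delta_f)
```

For a real file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle. Because `summarise` reads rows one at a time, a file with about a million observations does not need to be held in memory.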