Discrete genetic modules underlie divergent reproductive strategies in three-spined stickleback
Data files
Nov 07, 2025 version files 3.17 MB
- Behavior_visualization_code.R (22.12 KB)
- F2_phenotypes.csv (54.40 KB)
- Genes_within_QTL.csv (303.28 KB)
- Linkage_map.csv (873.98 KB)
- marker_linkage_physical_locations.csv (59.52 KB)
- marker_sequences.fa (1.79 MB)
- QTL_mapping.R (54.25 KB)
- QTL_table_1.csv (1.34 KB)
- README.md (10.76 KB)
Abstract
A central challenge in biology is to understand how complex behaviors evolve. Reproductive behaviors are frequently subject to strong selection and complex behavioral traits often evolve as an integrated package. However, it is unclear whether suites of traits evolve through a few pleiotropic genetic changes, each affecting many behaviors, or by accumulating several changes that, when combined, give rise to an entire package of correlated traits. Typically, three-spined stickleback exhibit paternal care, a behavior that characterizes the entire Gasterosteidae family. However, an unusual “white” three-spined stickleback ecotype exhibits a suite of traits associated with the evolutionary loss of paternal care. In the white ecotype, males disperse embryos from their nests rather than care for them, build loose nests, exhibit high rates of courtship, and are relatively small in body size. These differences are apparent in stickleback reared in a common garden environment, suggesting the differences have a heritable basis. In an F2 intercross (n=76-133), we show that these traits are genetically uncorrelated and map to different genomic regions, suggesting that components of the white reproductive strategy segregate independently and evolved through the addition of multiple genetic changes. Moreover, distinct sets of genes may be involved in regulating the same motor pattern across contexts. These results contribute to the growing body of evidence that behavioral diversity observed in nature may evolve by accumulating and combining alleles, each with modular effects, and show that this principle applies to a suite of behavioral traits that form an integrated strategy.
Study summary:
This dataset contains data on three-spined stickleback (Gasterosteus aculeatus) that originated from two Nova Scotian populations, referred to as "whites" and "commons". The populations were hybridized to generate F1 and F2 hybrids, and behavioral measurements were collected on the F2 hybrids to perform QTL mapping. We collected data on body size, territorial aggression, nest architecture, courtship, and parenting behaviors, and genotyped the F2 hybrids via RADseq. Significant QTL peaks and trait variation are reported. See the associated manuscript for more information.
Software information:
R is required to run the analyses; R version 4.0.2 was used. The following R packages were loaded: qtl, qtlcharts, ggplot2, dplyr, ggpubr, corrplot, gplots, multcompView, cowplot, vegan, plyr, heplots.
Data and File Overview:
The dataset includes five data files in .csv format, one sequence file in FASTA format, and two R scripts for visualizing behavioral variation and performing QTL mapping. The QTL_mapping.R script performs all QTL mapping analyses, while the Behavior_visualization_code.R script analyzes differences in behavior and plots them with ggplot2. All of the behavior data are contained in F2_phenotypes.csv, but note that the code and analyses also call upon a previously published dataset (Behrens, Colby, Meghan F. Maciejewski, Eric Arredondo, Anne C. Dalziel, Laura K. Weir, and Alison M. Bell. 2024. Divergence in Reproductive Behaviors Is Associated with the Evolutionary Loss of Parental Care. American Naturalist 203 (5): 590-603. https://doi.org/10.1086/729465.) to contrast the F2 results reported here with the F0 and F1 results reported previously. Genotype data are included in Linkage_map.csv, and QTL mapping summary results are in QTL_table_1.csv. Marker locations are based on the threespine stickleback v5 genome (https://stickleback.genetics.uga.edu/downloadData/).
-File 1: Linkage_map.csv - This file includes genotyping data formatted for use in R/qtl. The first row is the name of each molecular marker. The second row is the linkage group of the marker. The third row is the location of the marker, in cM, within its linkage group. The first column is the numerical ID of each individual, and each remaining cell gives the genotype (AA, AB, or BB) of that individual at that marker. Markers that were not genotyped are represented with "-".
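
For readers inspecting this file outside R, the layout described above can be parsed in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' workflow: the function name `parse_linkage_map` and the inline sample (marker names, positions, genotypes) are invented for illustration, and the first cell of rows two and three is assumed to be empty, as in the R/qtl CSV convention.

```python
import csv
import io


def parse_linkage_map(handle):
    """Parse a CSV with three header rows: marker names, linkage groups, cM positions."""
    rows = [row for row in csv.reader(handle) if row]
    markers = rows[0][1:]                          # row 1: marker names
    groups = rows[1][1:]                           # row 2: linkage group per marker
    positions = [float(x) for x in rows[2][1:]]    # row 3: position in cM
    marker_info = {m: (g, p) for m, g, p in zip(markers, groups, positions)}

    genotypes = {}
    for row in rows[3:]:
        # "-" marks a marker that was not genotyped; store it as None.
        calls = [None if c.strip() in ("-", "") else c.strip() for c in row[1:]]
        genotypes[row[0]] = dict(zip(markers, calls))
    return marker_info, genotypes


# Invented sample that mimics the described layout (not real data).
SAMPLE = """ID,m1,m2,m3
,1,1,2
,0.0,5.2,0.0
101,AA,AB,-
102,BB,-,AB
"""

marker_info, genotypes = parse_linkage_map(io.StringIO(SAMPLE))
print(marker_info["m2"])
print(genotypes["101"])
```

To try this on the real file, replace `io.StringIO(SAMPLE)` with `open("Linkage_map.csv", newline="")`.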
-File 2: F2_phenotypes.csv - This file includes data from F2 hybrids, including body size, aggression, nesting, courtship, and parenting. Each variable has a prefix (e.g. parent0 for the post-fertilization parenting stage, nest for the nesting stage, aggro for aggression trials) to describe the context of the trait. Variables are described below. Missing data code: NA
- sample_name: name of the focal fish
- ID: Number of the focal fish
- sex: sex of the individual. Only males ("1") are included.
- parent0_dispersal: egg dispersals during the post-fertilization parenting stage
- parent0_fan_bouts: fan bouts during the post-fertilization parenting stage
- parent0_glues: gluing behaviors during the post-fertilization parenting stage
- parent0_nest_bouts: nesting bouts during the post-fertilization parenting stage
- parent0_pokes: nest pokes during the post-fertilization parenting stage
- parent0_retrievals: egg retrievals during the post-fertilization parenting stage
- parent0_spits: sand spits during the post-fertilization parenting stage
- parent0_thru_nest: "through-nest" behaviors during the post-fertilization parenting stage
- parent0_fan_time: nest fanning time (ms) during the post-fertilization parenting stage
- parent0_nest_time: nesting time (ms) during the post-fertilization parenting stage
- parent0_fan_s: nest fanning time (seconds) during the post-fertilization parenting stage
- parent0_nest_min: nesting time (minutes) during the post-fertilization parenting stage
- nestFlatness: Score for nest flatness (described in paper)
- nestPopulation: Population of the focal fish
- nestOpening: Score for the nest opening (described in paper)
- nestSand: Score for the amount of sand on the nest (described in paper)
- nestAlgae_mass: Score (1 or 5) indicating whether the nest was located in the main algae mass (described in paper)
- nestscale_omsf: combined opening, flatness, sand, and algae_mass score
- nestscale_osf: combined opening, sand, and flatness score
- nestscale_sf: combined sand and flatness score
- aggro_bites_Trial_1: bites towards an intruder during the first aggression trial
- aggro_fan_bouts_Trial_1: fan bouts during the first aggression trial
- aggro_glue_bouts_Trial_1: glue bouts during the first aggression trial
- aggro_leads_Trial_1: leads during the first aggression trial
- aggro_nest_bouts_Trial_1: nesting bouts during the first aggression trial
- aggro_total_orients_Trial_1: orients towards the intruder flask during the first aggression trial
- aggro_pokes_Trial_1: nest pokes during the first aggression trial
- aggro_thru_nest_Trial_1: "through-nest" behaviors during the first aggression trial
- aggro_zigzag_Trial_1: zigzags during the first aggression trial
- aggro_fan_time_Trial_1: nest fanning time during the first aggression trial
- aggro_nest_time_Trial_1: nesting time during the first aggression trial
- aggro_orient_time_Trial_1: orienting time towards the intruder flask during the first aggression trial
- aggro_bites_Trial_2: bites towards an intruder during the second aggression trial
- aggro_fan_bouts_Trial_2: fan bouts during the second aggression trial
- aggro_glue_bouts_Trial_2: glue bouts during the second aggression trial
- aggro_leads_Trial_2: leads during the second aggression trial
- aggro_nest_bouts_Trial_2: nesting bouts during the second aggression trial
- aggro_total_orients_Trial_2: orients towards the intruder flask during the second aggression trial
- aggro_pokes_Trial_2: nest pokes during the second aggression trial
- aggro_thru_nest_Trial_2: "through-nest" behaviors during the second aggression trial
- aggro_zigzag_Trial_2: zigzags during the second aggression trial
- aggro_fan_time_Trial_2: nest fanning time during the second aggression trial
- aggro_nest_time_Trial_2: nesting time during the second aggression trial
- aggro_orient_time_Trial_2: orienting time towards the intruder flask during the second aggression trial
- sample_number: tube number of the individual
- Weight: weight of the fish in grams
- Length: standard length of the fish in mm
- EggsDay0_eaten: Number of eggs found in the stomach of the fish 30 minutes after fertilization
- EggsDay0_nest: Number of eggs found in the nest of the fish 30 minutes after fertilization
- EggsDay0_outside: Number of eggs found outside of the nest 30 minutes after fertilization
- EggsClutchweight: Clutch weight in grams, as determined by subtracting the post-weight from the pre-weight of a gravid female
- nest_min_zscore: Z score averaged nesting time (min) during a courtship trial
- nest_bouts_zscore: Z score averaged nesting bouts during a courtship trial
- fan_s_zscore: Z score averaged fanning time (s) during a courtship trial
- fan_bouts_zscore: Z score averaged fanning bouts during a courtship trial
- leads_zscore: Z score averaged leads during a courtship trial
- pokes_zscore: Z score averaged nesting pokes during a courtship trial
- ave_zigzags: Averaged zigzags during courtship trials
- ave_bites: Averaged bites during courtship trials
- bites_zscore: Z score averaged bites during courtship trials
- zigzags_zscore: Z score averaged zigzags during courtship trials
- finterest_1st_trial: The level of female interest (low or high) of the first courtship trial for a male
- nest_min_1st_trial: nesting time (min) of the first courtship trial
- nest_bouts_1st_trial: nesting bouts of the first courtship trial
- fan_s_1st_trial: fanning time (s) of the first courtship trial
- fan_bouts_1st_trial: fanning bouts of the first courtship trial
- leads_1st_trial: leads of the first courtship trial
- pokes_1st_trial: nest pokes of the first courtship trial
- zigzags_1st_trial: zigzags of the first courtship trial
- bites_1st_trial: bites of the first courtship trial
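
Two details of this file lend themselves to a quick check: the missing-data code NA, and the fact that fanning and nesting time during the parenting stage each appear in two units (ms and seconds, ms and minutes). The sketch below reads rows with NA mapped to `None` and flags rows where the paired columns disagree. It assumes the second column of each pair is a plain unit conversion of the first, which the descriptions above suggest but do not state. The helper names, the tolerance, and the sample values are invented for illustration.

```python
import csv
import io
import math


def read_phenotypes(handle):
    """Read rows as dicts; "NA" becomes None and numeric strings become floats."""
    records = []
    for row in csv.DictReader(handle):
        clean = {}
        for key, value in row.items():
            if value in ("NA", ""):
                clean[key] = None
                continue
            try:
                clean[key] = float(value)
            except ValueError:
                clean[key] = value  # keep non-numeric fields such as sample_name
        records.append(clean)
    return records


def unit_mismatches(records, ms_col, other_col, ms_per_unit, tol=0.01):
    """Return sample names where ms_col / ms_per_unit differs from other_col."""
    bad = []
    for rec in records:
        ms, other = rec.get(ms_col), rec.get(other_col)
        if ms is None or other is None:
            continue  # skip missing data
        if not math.isclose(ms / ms_per_unit, other, rel_tol=0.0, abs_tol=tol):
            bad.append(rec["sample_name"])
    return bad


# Invented sample rows (not real data); the third row is deliberately inconsistent.
SAMPLE = """sample_name,ID,parent0_fan_time,parent0_fan_s,parent0_nest_time,parent0_nest_min
F2_a,1,12500,12.5,90000,1.5
F2_b,2,NA,NA,30000,0.5
F2_c,3,4000,5,60000,1
"""

records = read_phenotypes(io.StringIO(SAMPLE))
fan_bad = unit_mismatches(records, "parent0_fan_time", "parent0_fan_s", 1000)
nest_bad = unit_mismatches(records, "parent0_nest_time", "parent0_nest_min", 60000)
print(fan_bad, nest_bad)
```

If the real columns were rounded before export, a looser `tol` may be needed.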
-File 3: QTL_table_1.csv - This file contains the location and significance of QTL peaks across the various traits. Variables are described below.
- Context: Name of the context of the trait of interest (e.g. courtship, parenting, ...)
- Trait: Name of the trait of interest
- Chr: Name of the chromosome
- cM: Location of the marker peak in cM
- Marker: Name of the marker peak
- LOD: LOD score of the peak
- pval: P value of the peak based on permutation testing
- PVE: Percent variance explained by the QTL
- Interval_H: Location of the high end of the QTL interval, in cM
- Interval_L: Location of the low end of the QTL interval, in cM
- PVE_untransformed: Percent variance explained by the QTL when using untransformed values.
- Ref_genome_L: Location of the low end of the QTL interval, in base pairs, on the reference genome.
- Ref_genome_H: Location of the high end of the QTL interval, in base pairs, on the reference genome.
- Interval_size_Mb: Size of the QTL interval as measured in megabases.
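
The relationships among these columns can be illustrated with a little arithmetic. The sketch below assumes that Interval_size_Mb is the difference between Ref_genome_H and Ref_genome_L divided by one million; the column descriptions imply this but do not state it. It also shows the common single-QTL approximation PVE = 1 - 10^(-2 LOD / n), where n is the number of phenotyped individuals; whether the PVE column was computed this way is not stated here. The function names and all numbers in the example row are invented.

```python
def interval_size_mb(ref_genome_l, ref_genome_h):
    """Physical size of an interval in megabases, from base-pair endpoints."""
    return (ref_genome_h - ref_genome_l) / 1_000_000


def pve_from_lod(lod, n):
    """Percent variance explained under the single-QTL approximation (an assumption)."""
    return 100 * (1 - 10 ** (-2 * lod / n))


# Invented example row using the column names of QTL_table_1.csv.
row = {
    "Interval_L": 12.0,
    "Interval_H": 30.5,
    "Ref_genome_L": 4_200_000,
    "Ref_genome_H": 9_700_000,
    "LOD": 4.0,
}

size_mb = interval_size_mb(row["Ref_genome_L"], row["Ref_genome_H"])
width_cm = row["Interval_H"] - row["Interval_L"]
pve = pve_from_lod(row["LOD"], n=100)  # n = 100 is a made-up sample size
print(size_mb, width_cm, round(pve, 2))
```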
-File 4: marker_linkage_physical_locations.csv - This file includes the names of the markers and their location (in base pairs) on the threespine stickleback v5 genome (https://stickleback.genetics.uga.edu/downloadData/). Variables are described below.
- marker: name of the marker
- linkage_map_chr: linkage group location of the marker in the linkage map
- cM: centimorgan location of the marker
- reference_assembly_chr: name of the chromosome the marker was located on based on the reference assembly
- reference_assembly_bp: location (in base pairs) of the marker based on the stickleback genome
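
Because this file pairs each marker's genetic position (cM) with its physical position (bp), it can be used to translate a cM coordinate into an approximate base-pair coordinate. The sketch below does this by linear interpolation between flanking markers. It is an illustration only: the function name `interpolate_bp` and the sample markers are invented, positions outside the mapped range are clamped to the nearest marker, and the approach assumes marker order is the same in the linkage map and the reference assembly, which real data may violate.

```python
import bisect


def interpolate_bp(markers, chrom, cm):
    """Estimate a base-pair position for a cM position on one linkage group."""
    points = sorted(
        (m["cM"], m["reference_assembly_bp"])
        for m in markers
        if m["linkage_map_chr"] == chrom
    )
    if not points:
        raise ValueError(f"no markers on linkage group {chrom!r}")
    cms = [p[0] for p in points]
    if cm <= cms[0]:
        return points[0][1]      # clamp below the first marker
    if cm >= cms[-1]:
        return points[-1][1]     # clamp above the last marker
    i = bisect.bisect_right(cms, cm)
    (c0, b0), (c1, b1) = points[i - 1], points[i]
    return round(b0 + (cm - c0) / (c1 - c0) * (b1 - b0))


# Invented markers using the column names of this file (not real data).
MARKERS = [
    {"marker": "m1", "linkage_map_chr": "4", "cM": 0.0, "reference_assembly_bp": 100_000},
    {"marker": "m2", "linkage_map_chr": "4", "cM": 10.0, "reference_assembly_bp": 1_100_000},
    {"marker": "m3", "linkage_map_chr": "4", "cM": 20.0, "reference_assembly_bp": 3_100_000},
    {"marker": "m4", "linkage_map_chr": "7", "cM": 3.0, "reference_assembly_bp": 50_000},
]

print(interpolate_bp(MARKERS, "4", 5.0))
print(interpolate_bp(MARKERS, "4", 15.0))
```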
-File 5: Genes_within_QTL.csv - This file includes all genes that fall within a QTL confidence interval identified in this study. Variables are described below.
- Trait: Which phenotypic trait the QTL is associated with
- Chr: Name of the chromosome
- Gene.Start.bp: Starting base pair of the gene based on the reference assembly
- Gene.End.bp: Ending base pair of the gene based on the reference assembly
- Ensembl.Gene.ID: Name of the gene in Ensembl format
- Associated.Gene.Name: Common name of the gene
- Ensembl.Protein.ID: Protein name in Ensembl format
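
A typical use of this table is to count genes per trait or to list the genes overlapping a particular physical interval. The sketch below shows both, treating a gene as overlapping an interval when its start is at or before the interval's high end and its end is at or after the interval's low end. The function name `genes_in_interval`, the trait labels, and the gene names are placeholders invented for illustration; only the column names come from the description above.

```python
from collections import Counter


def genes_in_interval(genes, chrom, low_bp, high_bp):
    """Return genes on `chrom` that overlap the closed interval [low_bp, high_bp]."""
    return [
        g for g in genes
        if g["Chr"] == chrom
        and g["Gene.Start.bp"] <= high_bp
        and g["Gene.End.bp"] >= low_bp
    ]


# Placeholder rows (not real genes or traits).
GENES = [
    {"Trait": "trait_x", "Chr": "4", "Gene.Start.bp": 1000, "Gene.End.bp": 5000,
     "Associated.Gene.Name": "gene_a"},
    {"Trait": "trait_x", "Chr": "4", "Gene.Start.bp": 9000, "Gene.End.bp": 12000,
     "Associated.Gene.Name": "gene_b"},
    {"Trait": "trait_y", "Chr": "7", "Gene.Start.bp": 2000, "Gene.End.bp": 3000,
     "Associated.Gene.Name": "gene_c"},
]

genes_per_trait = Counter(g["Trait"] for g in GENES)
hits = genes_in_interval(GENES, "4", 4000, 8000)
print(dict(genes_per_trait))
print([g["Associated.Gene.Name"] for g in hits])
```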
-File 6: marker_sequences.fa - This file includes the sequence of each locus in standard FASTA format.
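
A FASTA file of this kind can be read without extra libraries. The sketch below is a minimal parser that maps each record name (the first word after ">") to its sequence, joining sequences that wrap over several lines. The function name `read_fasta` and the sample records are invented. Whether the FASTA headers match the marker names used in Linkage_map.csv is not confirmed here and should be checked before joining the two.

```python
import io


def read_fasta(handle):
    """Return a dict mapping each record name to its full sequence."""
    parts = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""   # first word of the header line
            parts[name] = []
        elif name is not None:
            parts[name].append(line)             # sequence may span several lines
    return {n: "".join(chunks) for n, chunks in parts.items()}


# Invented sample records (not real marker sequences).
SAMPLE = """>m1 example description
ACGT
TTGA
>m2
GGCC
"""

sequences = read_fasta(io.StringIO(SAMPLE))
print(sequences)
print({name: len(seq) for name, seq in sequences.items()})
```

To read the real file, pass `open("marker_sequences.fa")` instead of the sample.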
