Do favored cuticular hydrocarbon profiles signal fertility? Analysis of Gryllus firmus and Gryllus pennsylvanicus
Data files
Jul 22, 2025 version; files total 1.20 MB
02_Fertility.R (4 KB)
03_Paternity.R (8.24 KB)
Allele_Frequencies.xlsx (136.62 KB)
CHCPCA2023.csv (243.10 KB)
COLONY_Output.xlsx (536.89 KB)
Fertility_Count.xlsx (18.78 KB)
MLLabeled.csv (3.38 KB)
Mother_Genotypes.xlsx (13.90 KB)
Offspring_Genotype_(1).xlsx (226.53 KB)
README.md (4.30 KB)
Abstract
Previous studies suggest that the cuticular hydrocarbon profiles of Gryllus firmus and G. pennsylvanicus are sexually dimorphic: males have a relatively homogeneous and simple profile, whereas females are more variable, and some individuals resemble the profile of males (“male-like” females). Previous studies have also shown that males in captivity prefer females with a more male-like profile, mating with them more promptly. Here, we used both species to test whether cuticular hydrocarbons serve as a signal of female fertility and whether females with a male-like cuticular hydrocarbon profile mate more often in the field. We report on the number of sires for field-caught females and demonstrate that male-like females are not more fertile and do not mate more often in the wild. We also show that the allocation of sperm does not seem to follow a fair raffle; a few males dominate the brood composition of wild-caught mated females.
This dataset contains offspring genotypes, cuticular hydrocarbon calls for all parents and supplementary individuals, allele frequencies used in the COLONY program, and results from the COLONY program.
Description of the data and file structure
There are five data spreadsheets (.xlsx) and one CSV file of cuticular hydrocarbon (CHC) peak data:
Allele_Frequencies.xlsx: These are the allele frequencies used in the COLONY analyses for each brood. The allele frequencies were calculated in CERVUS and used as input “known allele frequencies” to increase COLONY’s accuracy.
COLONY_Output.xlsx: Results from the COLONY runs. Each species was analyzed with 3 different error rates (each in a separate tab); fathers for each female's brood (Female ID) are designated with numbers. "Prob (Inc.)" is the probability that all individuals of a given full-sib family are full sibs; the lower the probability, the higher the chance that the full-sib family could be split. "Prob (exc.)" is the probability that no other individuals are full sibs with this family (see the COLONY documentation for more information). "#offspring/father" is the count of offspring assigned to a particular father. The columns that follow (AssignedO1...) list the offspring_ID values assigned to a particular father in each brood.
Offspring_Genotype_(1).xlsx: These are the microsatellite calls for each offspring. Each individual is represented by one row and each locus by two columns (allele 1 and allele 2). Locus names are in the first row. The color of each locus indicates the fluorescent dye used for genotyping (see paper for more information).
Mother_Genotypes.xlsx: These are the microsatellite calls for each brood mother. Each individual is represented by one row and each locus by two columns (allele 1 and allele 2). Locus names are in the first row. The color of each locus indicates the fluorescent dye used for genotyping (see paper for more information).
Fertility_Count.xlsx: A count of total offspring for each female. One tab for Gryllus firmus females and one tab for Gryllus pennsylvanicus females.
CHCPCA2023.csv: Results from MassHunter Qualitative Analysis Version B.10, showing the top 20 peak areas within each sample. Sample is the sample ID: GFF (and F) represent Gryllus firmus females, GFM Gryllus firmus males, GPF (and P) Gryllus pennsylvanicus females, and GPM Gryllus pennsylvanicus males. Peak refers to the peaks observed in an individual's chromatogram. RT is retention time (the time taken for a compound to pass through the GC column). Area is the total area under the peak, and Area % is the percentage of the total area attributed to a specific peak, indicating relative abundance. Height is the maximum intensity of the peak, reflecting the compound's concentration. Width is the width of the peak. Max Y is the highest y-axis value of the peak, related to its intensity. Symmetry is a measure of the peak's symmetry, which affects data quality. Tailing Factor is the ratio of the peak's trailing to leading edges, used to assess peak shape. FWHM (Full Width at Half Maximum) indicates how wide a peak is at half its maximum height. The Reference point indicates which peak is the docosane peak. Species and Sex are given in the final two columns.
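As a reading aid, the sample-ID prefixes and the Area % definition for CHCPCA2023.csv can be sketched in Python. This is a minimal sketch, not the authors' code: the function names and any example values are hypothetical. It assumes every sample ID begins with one of the listed prefixes and that the "total area" is the sum of the peak areas passed in.

```python
# Prefixes as described for CHCPCA2023.csv. Longer prefixes come first so
# that "GFF" is matched before the single-letter "F".
PREFIXES = [
    ("GFF", ("Gryllus firmus", "female")),
    ("GFM", ("Gryllus firmus", "male")),
    ("GPF", ("Gryllus pennsylvanicus", "female")),
    ("GPM", ("Gryllus pennsylvanicus", "male")),
    ("F", ("Gryllus firmus", "female")),
    ("P", ("Gryllus pennsylvanicus", "female")),
]


def decode_sample(sample_id):
    """Return (species, sex) for a sample ID (hypothetical helper)."""
    sid = sample_id.strip().upper()
    for prefix, meaning in PREFIXES:
        if sid.startswith(prefix):
            return meaning
    raise ValueError(f"unrecognised sample ID prefix: {sample_id!r}")


def area_percent(areas):
    """Express each peak area as a percentage of the summed areas.

    Assumption: the denominator is the sum of the areas supplied here.
    """
    total = sum(areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return [100.0 * a / total for a in areas]
```

For example, `decode_sample("GPM3")` returns the species and sex for a hypothetical male G. pennsylvanicus sample, and `area_percent` can be applied to the Area column of one sample's rows.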
The dataset also includes the R scripts used for the analyses and a classification of females based on those analyses:
01_CHC_Alignment_Manual.R: This is the R script used for the alignment and analyses of the CHC data (CHCPCA2023.csv).
02_Fertility.R: This is the R script used for the analyses of the fertility data (Fertility_Count.xlsx).
03_Paternity.R: This is the R script used for the analyses of the paternity data (COLONY_Output.xlsx).
MLLabeled.csv: List of females classified according to the PCA (see paper for details); females with a male-like CHC profile are listed with ML.
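The per-brood quantities described for COLONY_Output.xlsx (number of fathers and offspring per father) can be summarised as in the Python sketch below. It is illustrative only and does not reproduce 03_Paternity.R. The input format (one inferred father label per offspring), the function name, and the dictionary keys are hypothetical.

```python
from collections import Counter


def summarise_brood(father_of_offspring):
    """Summarise one brood from a list of inferred father labels.

    Hypothetical input: one label per genotyped offspring, e.g. ["1", "1", "2"].
    """
    counts = Counter(father_of_offspring)
    n_offspring = sum(counts.values())
    if n_offspring == 0:
        raise ValueError("brood has no assigned offspring")
    n_sires = len(counts)
    top_sire_share = max(counts.values()) / n_offspring
    return {
        "n_offspring": n_offspring,
        "n_sires": n_sires,
        "offspring_per_father": dict(counts),
        # Share of the brood fathered by the most successful male.
        "top_sire_share": top_sire_share,
        # Crude reference point: the share each male would have if all
        # sires contributed equally.
        "equal_share": 1 / n_sires,
    }
```

The `equal_share` value is only a rough yardstick: a fair raffle predicts paternity shares proportional to each male's sperm contribution, which this sketch does not model.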
Sharing/Access information
Data were derived from the following sources:
Microsatellite genotypes for 7 loci and cuticular hydrocarbon collections for individual crickets using GC-MS (see paper for details).
Code/Software (uploaded as "data")
The R code used in the analyses.
Previously mated females of two species are collected in the field and oviposit in the lab. Offspring are scored at 7 microsatellite loci to estimate the number of fathers in each brood (48 offspring per brood). We use COLONY (Jones & Wang, 2010) to estimate the number of fathers per brood.
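The genotype layout used in the offspring and mother spreadsheets (one row per individual, two allele columns per locus) can be turned into per-locus allele pairs as in the sketch below. Several things here are assumptions rather than facts about the files: that the individual ID is the first value in each row, that missing calls appear as empty cells or 0, and that alleles are integer fragment sizes. The function name is hypothetical.

```python
# Assumed encodings of a missing allele call (not confirmed by the dataset).
MISSING = ("", None, 0, "0")


def parse_genotype_row(locus_names, row):
    """Convert [id, a1_locus1, a2_locus1, a1_locus2, ...] into a dict.

    Returns (individual_id, {locus: (allele_a, allele_b) or None}).
    Allele pairs are sorted so that (148, 152) and (152, 148) compare equal.
    """
    individual_id, alleles = row[0], row[1:]
    if len(alleles) != 2 * len(locus_names):
        raise ValueError("expected two allele columns per locus")
    genotype = {}
    for i, locus in enumerate(locus_names):
        a1, a2 = alleles[2 * i], alleles[2 * i + 1]
        if a1 in MISSING or a2 in MISSING:
            genotype[locus] = None  # locus not scored for this individual
        else:
            genotype[locus] = tuple(sorted((int(a1), int(a2))))
    return individual_id, genotype
```

Applied to every row of a sheet exported to plain values, this gives one dictionary per individual that can be compared locus by locus between a mother and her offspring.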
We test whether females with different cuticular hydrocarbon profiles produce a different number of offspring (good genes) or whether certain females mate more often in the field.
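One simple way to illustrate such a comparison is a two-sided permutation test on the difference in mean offspring count between male-like (ML) and other females. This is a sketch under stated assumptions, not the analysis in 02_Fertility.R: the choice of test, the function name, and its parameters are hypothetical, and the inputs would need to be built by joining Fertility_Count.xlsx to MLLabeled.csv.

```python
import random


def permutation_test(group_a, group_b, n_perm=10000, seed=1):
    """Two-sided permutation test on the difference in group means.

    group_a, group_b: offspring counts for two groups of females.
    Returns (observed_difference, p_value).
    """
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)
```

A large p-value from such a test would be consistent with the reported result that male-like females are not more fertile, but the conclusions in the paper rest on the authors' own analyses.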