Data from: Quantifying the phenome-wide response to sex-specific selection in Drosophila melanogaster
Data files
Feb 19, 2025 version files 272.97 MB
- all_dmel_genes.csv (1.30 MB)
- all.dgrp.phenos_scaled.csv (17.73 MB)
- all.dgrp.phenos_unscaled.csv (15.74 MB)
- all.line.mean.phenos_metadata.csv (323 KB)
- all.line.mean.phenos.csv (5.29 MB)
- dgrp_phenos_calculated_from_raw_data_meta_data.csv (370.44 KB)
- dgrp_phenos_calculated_from_raw_data.csv (3.99 MB)
- dgrp.array.exp.female.txt (112.97 MB)
- dgrp.array.exp.male.txt (113.27 MB)
- gene_anntotations.csv (1.28 MB)
- meta_data_for_all_traits.csv (697.08 KB)
- README.md (4.99 KB)
- trait_names.rds (30.07 KB)
Abstract
In species with separate sexes, selection on males causes evolutionary change in female trait values (and vice versa) via genetic correlations, which has far-reaching consequences for adaptation. Here, we utilise a sex-specific form of Robertson’s Secondary Theorem of Natural Selection to estimate the expected response to selection for 474 organismal-level traits and ~28,000 gene expression traits measured in the Drosophila Genetic Reference Panel (DGRP). Across organismal-level traits, selection acting on males produced a larger predicted evolutionary response than did selection acting on females, even for female traits; while for transcriptome traits selection on each sex produced a roughly equal average evolutionary response. For most traits, selection on males and females was predicted to move average trait values in the same direction, though for some traits, selection on one sex increased trait values while selection on the other sex decreased them, implying intralocus sexual conflict. Our results provide support for the hypothesis that males experience stronger selection than females, potentially accelerating adaptation in females. Furthermore, sex-opposite responses to selection appear to exist for only a small proportion of traits, consistent with observations that the inter-sex genetic correlation for fitness is positive but less than one in most populations so far studied.
https://doi.org/10.5061/dryad.2v6wwpzzp
Description of the data and file structure
Data concern a vast range of quantitative traits measured across the Drosophila Genetic Reference Panel (DGRP). With these data you can run all analyses presented in the associated manuscript.
If you would like to use the dataset for your own analysis, download meta_data_for_all_traits.csv and either all.dgrp.phenos_unscaled.csv or all.dgrp.phenos_scaled.csv depending on whether you want to use traits expressed on their original scale or on a standardised scale (mean = 0, SD = 1). Read this to see how we collated the dataset and conducted quality control. Read this to see how we further cleaned this dataset in preparation for our specific analyses.
Files and variables
File: meta_data_for_all_traits.csv
Variables
- Trait: the name of the trait.
- Lines measured: the number of DGRP lines the trait was measured in.
- Sex: traits were measured in females, in males, or in pooled sexes (pooled-sex traits are removed for analysis).
- Life_stage: traits were measured in juveniles (larvae) or adults.
- Trait guild: a loose categorisation of the trait, used to aid visualisation.
- Trait description: a short summary of the trait and how it was measured.
- Reference: the study that collected the trait data.
File: all.dgrp.phenos_unscaled.csv
Variables
- line: the DGRP line number that was phenotyped.
- Trait: the trait that was measured.
- trait_value: the mean value of the trait, in raw units.
- Reference: the study that collected the trait data.
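The two files described above share the Trait column, so each phenotype row can be matched to its metadata. The following is a minimal sketch of that join using only the Python standard library. It assumes the CSV headers match the variable names listed above (check the actual headers before relying on this); the helper `attach_metadata` and the example rows are hypothetical and not part of the dataset.

```python
import csv
import io

def attach_metadata(pheno_rows, meta_rows, key="Trait"):
    """Left-join each phenotype row to its trait's metadata row on `key`."""
    meta_by_trait = {row[key]: row for row in meta_rows}
    joined = []
    for row in pheno_rows:
        meta = meta_by_trait.get(row[key], {})
        # Phenotype columns win if a name (e.g. Reference) appears in both tables.
        joined.append({**meta, **row})
    return joined

# Tiny in-memory stand-ins for the two CSV files (all values are invented).
pheno_csv = io.StringIO(
    "line,Trait,trait_value,Reference\n"
    "21,example_trait_f,1.5,Example et al.\n"
    "26,example_trait_f,2.5,Example et al.\n"
)
meta_csv = io.StringIO(
    "Trait,Lines measured,Sex,Life_stage\n"
    "example_trait_f,2,Female,adult\n"
)
pheno_rows = list(csv.DictReader(pheno_csv))
meta_rows = list(csv.DictReader(meta_csv))
joined = attach_metadata(pheno_rows, meta_rows)
print(joined[0]["Sex"], joined[0]["trait_value"])
```

To run this on the real files, replace each `io.StringIO(...)` with `open("all.dgrp.phenos_unscaled.csv", newline="")` and `open("meta_data_for_all_traits.csv", newline="")` respectively.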
File: all.dgrp.phenos_scaled.csv
Description: identical to all.dgrp.phenos_unscaled.csv, except that trait values are standardised to have mean = 0 and standard deviation = 1.
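As an illustration of what this standardisation means, the sketch below z-scores trait values within each trait. It makes two assumptions that are not confirmed by this README: that each Trait name is already sex-specific (so grouping by Trait is equivalent to grouping by trait and sex), and that the sample standard deviation (n - 1 denominator) was used. The function `standardise_by_trait` and the example rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def standardise_by_trait(rows):
    """Return copies of `rows` with trait_value z-scored within each Trait."""
    values = defaultdict(list)
    for row in rows:
        values[row["Trait"]].append(float(row["trait_value"]))
    # Sample SD (n - 1 denominator, an assumption); needs >= 2 lines per trait.
    stats = {trait: (mean(v), stdev(v)) for trait, v in values.items()}
    scaled = []
    for row in rows:
        mu, sd = stats[row["Trait"]]
        scaled.append({**row, "trait_value": (float(row["trait_value"]) - mu) / sd})
    return scaled

# Invented line means for one hypothetical trait.
rows = [
    {"line": "21", "Trait": "example_trait_f", "trait_value": "1.0"},
    {"line": "26", "Trait": "example_trait_f", "trait_value": "2.0"},
    {"line": "28", "Trait": "example_trait_f", "trait_value": "3.0"},
]
scaled = standardise_by_trait(rows)
print([r["trait_value"] for r in scaled])
```

A trait with identical values in every line has a standard deviation of zero and would raise a division error here, so such traits would need to be handled separately.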
File: dgrp_phenos_calculated_from_raw_data.csv
Description: trait values that we calculated from individual-level data when line means weren't available. All variables have been detailed above. See this page for all modelling details. If you want to re-run the models, the data can be found here.
File: trait_names.rds
Description: formatted, human-readable trait names, used in Figures and Tables.
File: dgrp_phenos_calculated_from_raw_data_meta_data.csv
Description: meta-data for the traits whose line means we calculated from raw data. All variables have been described above.
File: all.line.mean.phenos_metadata.csv
Description: all the meta-data from the studies that provide line means. Everything has already been described. Included so that the code runs.
File: gene_anntotations.csv
Description: gene annotation data downloaded from GenBank using the org.Dm.eg.db R package, a Bioconductor annotation package (installable with BiocManager).
Variables
- FBID: FlyBase ID.
- gene_name: the name of the gene.
- gene_symbol: an abbreviated form of the name.
- entrez_id: the ID given to the gene in the NCBI's database for gene-specific information. Entrez is the keyword-search system used to query it.
- chromosome: the chromosome the gene is found on.
File: all_dmel_genes.csv
Description: GenBank annotation of the D. melanogaster transcriptome. All variables are as per gene_anntotations.csv.
File: all.line.mean.phenos.csv
Description: all the line mean data. Everything has already been described. Included so that the code runs.
File: dgrp.array.exp.female.txt
Description: gene names, DGRP line IDs (each with a 1 or 2 prefix indicating whether it was the first or second measurement), and the mean whole-body expression for each gene. Measured in adult females. Data come from Huang et al. (2015).
File: dgrp.array.exp.male.txt
Description: gene names, DGRP line IDs (each with a 1 or 2 prefix indicating whether it was the first or second measurement), and the mean whole-body expression for each gene. Measured in adult males. Data come from Huang et al. (2015).
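For the manuscript analyses, expression traits were restricted to genes that match the GenBank annotation and are not on the Y chromosome. The sketch below shows one way such a filter could look. It assumes that genes in the expression files can be matched to the FBID column of the annotation files and that the Y chromosome is coded as "Y" in the chromosome column; neither is confirmed by this README. The function `keep_analysable_genes` and the placeholder gene IDs are hypothetical.

```python
def keep_analysable_genes(expression_genes, annotation_rows, excluded=("Y",)):
    """Keep genes present in the annotation and not on an excluded chromosome."""
    chromosome_of = {row["FBID"]: row["chromosome"] for row in annotation_rows}
    return [
        gene
        for gene in expression_genes
        if gene in chromosome_of and chromosome_of[gene] not in excluded
    ]

# Placeholder annotation rows and gene IDs (invented for illustration).
annotation_rows = [
    {"FBID": "FBgn_example_1", "chromosome": "2L"},
    {"FBID": "FBgn_example_2", "chromosome": "Y"},
]
expression_genes = ["FBgn_example_1", "FBgn_example_2", "FBgn_example_3"]
kept = keep_analysable_genes(expression_genes, annotation_rows)
print(kept)
```

In this toy example the second gene is dropped because it is on the Y chromosome and the third because it has no annotation entry.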
Code/software
All analyses were run in R. A detailed description of the code is available here: https://tomkeaney.github.io/Secondary_theorem_separate_sexes/About.html.
Access information
Other publicly accessible locations of the data:
We performed a forward citation search of Google Scholar for articles that had cited the original DGRP paper (Mackay et al., 2012; this study introduced the resource and is the common citation among those that use it) as of January 2022, to obtain line mean estimates and associated meta-data for quantitative traits that have been measured in the DGRP (Figure 1). We supplemented our search by including all articles cited in an influential review of the DGRP (Mackay & Huang, 2018) and all datasets included on the DGRP2 web application (http://dgrp2.gnets.ncsu.edu/; Mackay et al., 2012; Huang et al., 2014). In total, we identified 126 studies that reported line means or raw data for 38,411 phenotypic traits. 19,259 of these were female traits and 19,081 male traits (the remaining 71 were estimated from mixed sex groups). 36,280 of these were reported by a single study, which measured whole-body expression level for many of the known genes in D. melanogaster (Huang et al., 2015). We restricted our analyses to genes that can be matched to the GenBank annotation of the D. melanogaster transcriptome (Carlson, 2019) and excluded any genes on the Y chromosome, leaving 28,536 gene expression traits. Due to the size and unique nature of the transcriptome relative to the rest of the dataset, we separated our analysis of the phenome into a transcriptome component and an organismal-level phenotype component (sensu Huang et al., 2015). For analysis, we discarded traits measured in mixed sex groups or individuals of unknown sex, traits measured in fewer than 80 DGRP lines, and traits that were not measured in homozygous DGRP lines. 
Two studies that reported data for hundreds of traits related to the microbiome (Everett et al., 2020) and metabolome (Jin et al., 2020) were also removed from the dataset; whilst potentially interesting, we chose to focus on organismal-level and gene expression traits, as these have previously been implicated in sexual antagonism (Cox & Calsbeek, 2009; Innocenti & Morrow, 2010; Wong & Holman, 2023). We retained data from 76 studies, with line means for 28,536 gene expression traits (14,268 per sex) and 474 organismal-level traits, 232 and 242 of which were measured in females and males, respectively. For the organismal-level traits we include two versions of the dataset: one with trait values expressed in their original units and another where, for each trait and sex, line means were standardised to µ = 0 and σ = 1.
If you wish to use the combined DGRP dataset in your own research, we recommend using `all.dgrp.phenos_unscaled.csv` or `all.dgrp.phenos_scaled.csv`, in conjunction with `meta_data_for_all_traits.csv`. If you wish to download the dataset we used for our organismal trait analysis, you'll need to filter the data as described in the paragraphs above. See the "Selecting data appropriate for analysis" subheading in this report https://tomkeaney.github.io/Secondary_theorem_separate_sexes/Main_analysis.html for instructions and R code for pruning the dataset. Transcriptome data are also provided in this Dryad repository. All other datasets are only included so that all code can be run. If you want to recompute line means from the individual-level data, these data can be found in the `data/data_collation/input/Raw_data_files` folder here: https://github.com/tomkeaney/Secondary_theorem_separate_sexes/tree/main.
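Two of the filtering criteria described above (single-sex traits only, and traits measured in at least 80 DGRP lines) can be expressed directly against the metadata columns. The sketch below illustrates them in Python; it is not the authors' R code, which is in the linked report. It assumes the Sex column uses the labels "Female" and "Male" and that the column headers match the variable names listed earlier. The function `select_traits_for_analysis` and the example rows are hypothetical.

```python
def select_traits_for_analysis(meta_rows, min_lines=80, sexes=("Female", "Male")):
    """Return names of single-sex traits measured in at least `min_lines` lines."""
    return [
        row["Trait"]
        for row in meta_rows
        if row["Sex"] in sexes and int(row["Lines measured"]) >= min_lines
    ]

# Invented metadata rows standing in for meta_data_for_all_traits.csv.
meta_rows = [
    {"Trait": "trait_a_f", "Sex": "Female", "Lines measured": "150"},
    {"Trait": "trait_b_m", "Sex": "Male", "Lines measured": "40"},
    {"Trait": "trait_c", "Sex": "Pooled", "Lines measured": "200"},
]
selected = select_traits_for_analysis(meta_rows)
print(selected)
```

The remaining criteria (traits measured in homozygous DGRP lines, and exclusion of the microbiome and metabolome studies) are not covered by this sketch; the linked report remains the reference for the full pruning procedure.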
