Data from: Frequencies of house fly proto-Y chromosomes across populations are predicted by temperature heterogeneity within populations
Data files
Oct 24, 2024 version files (63.17 KB total)
- README.md (5.64 KB)
- SupCode_2024_09.tar.gz (57.52 KB)
Abstract
Sex chromosomes often differ between closely related species and can even be polymorphic within populations. Species with multifactorial sex determination segregate for multiple different sex determining loci within populations, making them uniquely informative of the selection pressures that drive the evolution of sex chromosomes. The house fly (Musca domestica) is a model species for studying multifactorial sex determination because male determining genes have been identified on all six of the chromosomes, which means that any chromosome can be a “proto-Y”. Natural populations of house fly also segregate for a recently derived female-determining locus, meaning house flies also have a proto-W chromosome. The different proto-Y chromosomes are distributed along latitudinal clines on multiple continents, their distributions can be explained by seasonality in temperature, and they have temperature-dependent effects on physiological and behavioral traits. It is not clear, however, how the clinal distributions interact with the effect of seasonality on the frequencies of house fly proto-Y and proto-W chromosomes across populations. To address this question, we measured the frequencies of house fly proto-Y and proto-W chromosomes across nine populations in the United States of America. We confirmed the clinal distribution along the eastern coast of North America, but it is limited to the eastern coast. In contrast, annual mean daily temperature range predicts proto-Y chromosome frequencies across the entire continent. Our results therefore suggest that temperature heterogeneity can explain the distributions of house fly proto-Y chromosomes in a way that does not depend on the cline.
https://doi.org/10.5061/dryad.15dv41p5x
Description of the data and file structure
PCR was used to genotype wild-caught house flies from nine populations across the USA for one proto-Y and one proto-W chromosome. Population genetic simulations were used to estimate the frequencies of proto-Y and proto-W chromosomes, along with multi-chromosomal genotype frequencies, in each population. Climate features within each population were compared across populations, and then related to the frequencies of the proto-sex chromosomes.
Files and variables
Population codes and weather stations where data were sampled
state | city | NOAA station |
---|---|---|
CA | Moreno Valley | March Air Force Base |
FL | Alachua | Gainesville Regional Airport |
GA | Gillsville | Gilmer Airport |
KS | Manhattan | Manhattan |
NC | Raleigh | Raleigh Durham International Airport |
NE | Lincoln | Lincoln Municipal Airport |
PA | State College | State College |
TN | Walland | Knoxville McGhee Tyson Airport |
TX | Bryan | College Station |
File: SupCode_2024_04.R
Description: R code to run population genetic simulations and analyze climate features.
File: female-freqs.tsv
Description: Counts from PCR assays to genotype females from each population. Empty cells represent counts of 0. NA values mark proportions that could not be calculated because the denominator was 0.
Column headers:
- Pop: state where population is located (see table above)
- tra: number of female flies without Md-traD
- traTotal: number of female flies without Md-traD
- traDMnotY: number of female flies with Md-traD and Mdmd, but not Y^M
- traDYM: number of female flies with Md-traD and Y^M
- traDnoM: number of female flies with Md-traD but not Mdmd
- traD: total number of females with Md-traD (sum of traDMnotY, traDYM, and traDnoM)
- Total: number of females assayed for Md-traD
- traDfreq: proportion of females with Md-traD
- Mfreq: proportion of females with Mdmd
- MfreqIntraD: proportion of females with Mdmd, conditioning on carrying Md-traD
- Myfreq: proportion of females with Y^M
- MYfreqIntraD: proportion of females with Y^M, conditioning on carrying Md-traD
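The relationship between the count columns and the proportion columns can be sketched as follows. This is a minimal Python illustration, not the authors' R code. The counts are hypothetical, and the formulas (`traDfreq` as `traD / Total`, `MYfreqIntraD` as `traDYM / traD`) are assumptions inferred from the column descriptions above.

```python
import math

def proportion(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is 0.

    NaN plays the role of the NA values described for this file.
    """
    if denominator == 0:
        return math.nan
    return numerator / denominator

# Hypothetical counts for one population (not taken from the data set).
row = {"traDMnotY": 2, "traDYM": 1, "traDnoM": 5, "Total": 40}

# traD is described as the sum of the three Md-traD genotype classes.
row["traD"] = row["traDMnotY"] + row["traDYM"] + row["traDnoM"]

# Assumed formula: proportion of all assayed females carrying Md-traD.
row["traDfreq"] = proportion(row["traD"], row["Total"])

# Assumed formula: proportion of Md-traD females that also carry Y^M.
row["MYfreqIntraD"] = proportion(row["traDYM"], row["traD"])
```

A population in which no female carries Md-traD would have `traD` equal to 0, so the conditional proportions would come out as NaN, which matches the divide-by-zero explanation given for NA values.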
File: male-freqs.tsv
Description: Counts from PCR assays to genotype males from each population. Empty cells are 0 values.
Column headers:
- Pop: state where population is located (see table above)
- NotYM: number of male flies genotyped as not carrying the Y^M chromosome
- YM: number of male flies genotyped as carrying the Y^M chromosome
- Total: total number of male flies genotyped
- YMfreq: proportion of males carrying the Y^M chromosome
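Because empty cells stand for counts of 0, a reader parsing this file needs to handle blanks explicitly. The sketch below is a Python illustration under stated assumptions, not the authors' R code: the inline table and the population codes `XX` and `ZZ` are hypothetical, and `YMfreq` is assumed to equal `YM / Total`.

```python
import csv
import io

# Hypothetical file contents; the empty YM cell stands for a count of 0.
tsv_text = "Pop\tNotYM\tYM\tTotal\nXX\t12\t\t12\nZZ\t3\t9\t12\n"

def to_count(cell):
    """Interpret an empty cell as 0, otherwise parse the integer count."""
    cell = cell.strip()
    return int(cell) if cell else 0

rows = []
for record in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
    ym = to_count(record["YM"])
    total = to_count(record["Total"])
    rows.append({
        "Pop": record["Pop"],
        "YM": ym,
        "Total": total,
        # Assumed formula; NaN if no males were genotyped.
        "YMfreq": ym / total if total else float("nan"),
    })
```

To read the real file, the `io.StringIO(tsv_text)` object could be replaced with an open file handle for `male-freqs.tsv`.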
Files: ./SupData/*-normals-annualseasonal-1991-2020-2023*.csv
Description: Annual climate data for each of the nine populations. Replace the first * with the population ID. The second * is a unique ID associated with each file.
Column headers: Only a subset of columns were used in our analysis (see headers listed below), but the full data file downloaded from the NOAA NCEI website is included. See National Centers for Environmental Information climate information website for the full list.
- ANN.TAVG.NORMAL: Annual mean average daily temperature (Tmean), in Fahrenheit but code converts to Celsius
- ANN.TMIN.NORMAL: Annual mean minimum daily temperature (Tmin), in Fahrenheit but code converts to Celsius
- ANN.TMAX.NORMAL: Annual mean maximum daily temperature (Tmax), in Fahrenheit but code converts to Celsius
- ANN.PRCP.NORMAL: Annual Precipitation (Precip), inches
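The unit conversion mentioned above can be sketched as follows. This is an illustrative Python snippet, not the authors' R code. The temperature values are hypothetical, and computing the annual mean daily temperature range as Tmax minus Tmin is an assumption based on the abstract's wording.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert an absolute temperature from Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0

def fahrenheit_interval_to_celsius(delta_f):
    """Convert a temperature difference; the 32-degree offset cancels."""
    return delta_f * 5.0 / 9.0

# Hypothetical annual normals in Fahrenheit (not taken from the data set).
tmax_f, tmin_f = 77.0, 50.0

tmax_c = fahrenheit_to_celsius(tmax_f)
tmin_c = fahrenheit_to_celsius(tmin_f)

# Assumed definition of the daily temperature range: Tmax - Tmin.
range_c = tmax_c - tmin_c

# The same range obtained by converting the Fahrenheit difference directly.
range_c_direct = fahrenheit_interval_to_celsius(tmax_f - tmin_f)
```

The two routes to the range agree because a difference between two temperatures only needs the 5/9 scale factor, not the offset.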
Files: ./SupData/*-normals-monthly-1991-2020*.csv
Description: Monthly climate data for each of the nine populations. Replace the first * with the population ID. The second * is a unique ID associated with each file.
Column headers: Only a subset of columns were used in our analysis (see headers listed below), but the full data file downloaded from the NOAA NCEI website is included. See National Centers for Environmental Information climate information website for the full list.
- MLY.TAVG.NORMAL: Mean monthly temperature, in Fahrenheit but code converts to Celsius
- MLY.TMAX.NORMAL: Maximum monthly temperature, in Fahrenheit but code converts to Celsius
- MLY.TMIN.NORMAL: Minimum monthly temperature, in Fahrenheit but code converts to Celsius
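The file-name patterns for the climate files can be turned into per-population matches as sketched below. This Python snippet is illustrative only: the file names in `names` are hypothetical, and using the two-letter state code as the population ID is an assumption based on the population table above.

```python
from fnmatch import fnmatch

def annual_pattern(pop_id):
    """Pattern for the annual/seasonal normals file of one population."""
    return f"{pop_id}-normals-annualseasonal-1991-2020-2023*.csv"

def monthly_pattern(pop_id):
    """Pattern for the monthly normals file of one population."""
    return f"{pop_id}-normals-monthly-1991-2020*.csv"

# Hypothetical file names; the real unique IDs are not reproduced here.
names = [
    "KS-normals-monthly-1991-2020-abc.csv",
    "KS-normals-annualseasonal-1991-2020-2023-xyz.csv",
    "TX-normals-monthly-1991-2020-def.csv",
]

ks_monthly = [n for n in names if fnmatch(n, monthly_pattern("KS"))]
ks_annual = [n for n in names if fnmatch(n, annual_pattern("KS"))]
```

With files on disk, the same patterns could be passed to `glob.glob` after joining them with the `./SupData/` directory.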
Code/software
Running the R code for this analysis requires installing R, along with the following packages: ggplot2, usmap, sf, sp, cowplot, and ggfortify.
Access information
Other publicly accessible locations of the data:
- NA
Data was derived from the following sources:
- Climate data from https://www.ncei.noaa.gov/access/us-climate-normals/
PCR was used to genotype wild-caught house flies from populations across the USA for a proto-Y and a proto-W chromosome. Population genetic simulations were used to estimate the frequencies of proto-Y and proto-W chromosomes, along with multi-chromosomal genotype frequencies, in each population. Climate features were also compared across populations, and statistical analyses were used to test whether climate features were correlated with proto-sex chromosome frequencies.