Data from: Natural selection drives emergent genetic homogeneity in a century-scale experiment with barley
Data files
Jul 11, 2024 version, 3.30 GB total
afs_for_sim.txt.zip (15.01 MB)
CCII_PARENTS_AND_EXOME.vcf.gz (1.40 GB)
FILTERED_PARENTAL_SNPS.vcf.gz (1.58 GB)
FINAL_AFS.txt.gz (158.59 MB)
GBS_PARENTS_PROGENY.vcf.gz (148.08 MB)
parents_for_sim.txt.zip (2.89 MB)
Parents_pheno.txt.gz (1.89 KB)
Prog_pheno.txt.gz (35.64 KB)
README.md (2.16 KB)
Abstract
Direct observation is central to our understanding of the process of adaptation, but evolution is rarely documented in a large, multicellular organism for more than a few generations. Here, we observe genetic and phenotypic evolution across a century-scale competition experiment, barley composite cross II (CCII). CCII was founded in 1929 with tens of thousands of unique genotypes and has been adapted to local conditions in Davis, CA, USA for 58 generations. We find that natural selection has massively reduced genetic diversity leading to a single clonal lineage constituting most of the population by generation F50. Selection favored alleles originating from similar climates to that of Davis, and targeted genes regulating reproductive development, including some of the most well-characterized barley diversification loci, Vrs1, HvCEN, and Ppd-H1. We chronicle the dynamic evolution of reproductive timing in the population and uncover how parallel molecular pathways are targeted by stabilizing selection to optimize this trait. Our findings point to selection as the predominant force shaping genomic variation in one of the world’s oldest ongoing biological experiments.
https://doi.org/10.5061/dryad.z34tmpgm8
Genetic datasets from the sequencing of composite cross II
Description of the data and file structure
-Genotype datasets
CCII_PARENTS_AND_EXOME.vcf.gz
Genotype file for the merged CCII parents and Exome sequencing panel (Russell et al. 2016, Nature Genetics)
FILTERED_PARENTAL_SNPS.vcf.gz
CCII parent SNP calls
FINAL_AFS.txt.gz
Allele counts in CCII progeny pools for each SNP. Columns are in the following order.
Chromosome
Position
Reference_allele
Alternate_allele
Reference_allele_count_Parents
Alternate_allele_count_Parents
Reference_allele_count_F18
Alternate_allele_count_F18
Reference_allele_count_F28
Alternate_allele_count_F28
Reference_allele_count_F58
Alternate_allele_count_F58
SNP_EFF_ANNOTATION
SNP_EFF_EFFECT
SNP_EFF_NEAREST_GENE
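As a minimal sketch of how the count columns above could be turned into allele frequencies, the following Python helper parses one data line and returns the alternate allele frequency for each pool. The column order is taken from the list above; the tab delimiter, the absence of a header row, and the function name `alt_allele_frequencies` are assumptions made for illustration, not details confirmed by the README.

```python
# Column order as listed in the README for FINAL_AFS.txt.gz.
COLUMNS = [
    "Chromosome", "Position", "Reference_allele", "Alternate_allele",
    "Reference_allele_count_Parents", "Alternate_allele_count_Parents",
    "Reference_allele_count_F18", "Alternate_allele_count_F18",
    "Reference_allele_count_F28", "Alternate_allele_count_F28",
    "Reference_allele_count_F58", "Alternate_allele_count_F58",
    "SNP_EFF_ANNOTATION", "SNP_EFF_EFFECT", "SNP_EFF_NEAREST_GENE",
]
POOLS = ["Parents", "F18", "F28", "F58"]


def alt_allele_frequencies(line, sep="\t"):
    """Return {pool: alternate allele frequency} for one data line.

    Assumption: fields are separated by `sep` (tab by default) and appear
    in the README order. A pool with zero total count yields None instead
    of raising a division error.
    """
    fields = dict(zip(COLUMNS, line.rstrip("\n").split(sep)))
    freqs = {}
    for pool in POOLS:
        ref = int(fields[f"Reference_allele_count_{pool}"])
        alt = int(fields[f"Alternate_allele_count_{pool}"])
        total = ref + alt
        freqs[pool] = alt / total if total else None
    return freqs
```

To apply this to the real file, one could open it with `gzip.open("FINAL_AFS.txt.gz", "rt")` and call the function on each line, skipping the first line if it turns out to be a header.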
GBS_PARENTS_PROGENY.vcf.gz
Individual GBS genotype calls for parents and progeny. CCII generations are indicated as F**, and progeny pedigree numbers are prefixed by generation as follows:
F18: 1_####
F28: 2_####
F58: 7_####
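The prefix scheme above can be sketched as a small lookup that recovers the generation from a sample identifier. This is an illustration only: the function name `generation_of` is hypothetical, and because the README does not say how parental samples are named, any identifier that does not match one of the three listed prefixes simply returns `None`.

```python
# Prefix-to-generation mapping taken from the README list above.
GENERATION_BY_PREFIX = {"1": "F18", "2": "F28", "7": "F58"}


def generation_of(sample_id):
    """Return the CCII generation for a progeny pedigree number.

    Assumption: progeny IDs look like '<prefix>_<number>'. Anything else
    (for example a parental accession name) returns None.
    """
    prefix, sep, _ = sample_id.partition("_")
    if not sep:
        return None
    return GENERATION_BY_PREFIX.get(prefix)
```

The same helper would also apply to Prog_pheno.txt.gz, whose pedigree numbers follow the GBS dataset.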
-Phenotypic datasets for the CCIIselect
Columns are as follows:
Genotype name
rows (Count of seed rows)
tiller_number (Count)
spikes mature_spikes (Count)
immature_spikes (Count)
X100_seed_mass (g)
seed_mass (g)
seed_estimate (Count)
days_to_awn_emergence (Count)
days_to_heading (Count)
plant_height (cm)
spike_length (cm)
spike_width (cm)
awn_length (cm)
days_to_heading_2017 (Count)
Parents_pheno.txt.gz
Trait measurements from greenhouse grow outs of the parents of CCII
Prog_pheno.txt.gz
Trait measurements from greenhouse grow outs of progeny selections of CCII. Pedigree numbers are the same as in the GBS dataset.
-The following datasets are derived from the whole genome datasets above and were the specific input files for the whole genome simulations.
afs_for_sim.txt.zip
Allele frequency data used as input for the simulations
parents_for_sim.txt.zip
Input data, restricted to sites with no missing data, used to simulate the CCII