Data from: Genetic diversity and environmental adaptation in Ethiopian tef
Data files
Dec 13, 2024 version files 870.19 KB
- Additional_Files_SuppInfo.xlsx (867.62 KB)
- README.md (2.57 KB)
Abstract
Orphan crops serve as essential resources for both nutrition and income in local communities and offer potential solutions to the challenges of food security and climate vulnerability. Tef [Eragrostis tef (Zucc.)], a small-grained, allotetraploid C4 cereal mainly cultivated in Ethiopia, stands out for its adaptability to marginal conditions and high nutritional value, and holds both local and global promise. Despite its significance, tef is considered an orphan crop due to limited genetic improvement efforts, reliance on subsistence farming, and its nutritional, economic, and cultural importance. Although pre-Semitic inhabitants of Ethiopia have cultivated tef for millennia (4000-1000 BCE), the genetic and environmental drivers of local adaptation remain poorly understood. To address this, we resequenced a diverse collection of traditional tef varieties to investigate their genetic structure and identify genomic regions under environmental selection using redundancy analysis (RDA), complemented by single-site differentiation and LD-based methods. We identified 145 loci associated with abiotic environmental factors, with minimal geographic influence observed in the genetic structure of the sample population. Overall, this work contributes to the broader understanding of local adaptation and its genetic basis in tef, providing insights that support efforts to develop elite germplasms with improved environmental resilience.
README: Data from: Genetic diversity and environmental adaptation in Ethiopian tef
https://doi.org/10.5061/dryad.8kprr4xw3
Description of the data and file structure
Additional_Files_SuppInfo.xlsx
S1.1: Detailed information on tef germplasm collection, DNA extraction, and plate coordinates for samples sequenced on the Illumina NovaSeq 6000 platform.
S1.2: Geographic and environmental (WorldClim version 2.1) data associated with each sample.
S1.3: Individual population genetic assignments for all samples, including axis loadings from discriminant analysis of principal components (LD1 - LD4), principal components analysis (PC1 and PC2), and K-means cluster assignments.
S1.4: Genetic distances estimated as pairwise FST, alongside geographical and environmental distance matrices, based on 39 individual samples from 16 distinct geographic locations and using 20 WorldClim environmental variables. Only sampling locations with two or more collected varieties were included.
S1.5: Redundancy analysis model output (without correction for population structure), including identified outlier loci and correlation coefficients to each environmental variable.
S1.6: Redundancy analysis model output (with correction for population structure), including identified outlier loci and correlation coefficients to each environmental variable.
S1.7: Detailed genotype information and associated metadata for each non-synonymous site identified by the redundancy analysis.
S1.8: Genome-wide sliding window iHH12 scores calculated using Selscan v2.0.0 (Szpiech, 2021; Szpiech & Hernandez, 2014).
S1.9: Genome-wide sliding window FST values calculated using VCFtools version 0.1.16 (Danecek et al., 2011; Weir & Cockerham, 1984).
S1.10: GO enrichment analysis results, including GO terms and descriptions of biological processes, for E. tef v3 genes associated with outlier loci identified through redundancy analysis.
S1.11: GO enrichment analysis results, including GO terms and descriptions of biological processes, and their associated environmental variables from redundancy analysis.
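As an illustration of the geographical and environmental distance matrices described for S1.4, the following is a minimal sketch of how such matrices could be built from site coordinates and environmental variables. It is not the authors' pipeline: the haversine great-circle distance, the z-score standardization, and the Euclidean environmental distance are assumptions made for this example, and all function names and input values are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def standardize(rows):
    """Z-score each column (population SD); constant columns are set to 0."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) for c, m in zip(cols, means)]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in rows
    ]


def pairwise(items, dist):
    """Full symmetric distance matrix as a list of lists."""
    n = len(items)
    return [[dist(items[i], items[j]) for j in range(n)] for i in range(n)]


def geographic_matrix(coords):
    """coords: list of (latitude, longitude) tuples, one per sampling location."""
    return pairwise(coords, lambda a, b: haversine_km(a[0], a[1], b[0], b[1]))


def environmental_matrix(env_rows):
    """env_rows: one list of environmental values per sampling location."""
    z = standardize(env_rows)
    return pairwise(z, lambda a, b: math.dist(a, b))


if __name__ == "__main__":
    # Hypothetical coordinates and values, for illustration only.
    sites = [(9.0, 38.7), (11.6, 37.4), (7.0, 38.5)]
    env = [[16.2, 1100.0], [19.5, 1400.0], [18.1, 950.0]]
    print(geographic_matrix(sites))
    print(environmental_matrix(env))
```

Standardizing before taking Euclidean distances keeps variables measured on large scales (for example, precipitation in mm) from dominating those on small scales (for example, temperature in °C).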
Sharing/Access information
Links to other publicly accessible locations of the data and scripts:
Genetic materials and passport information were derived from the following sources:
- Ethiopian Biodiversity Institute through a collaborative project with the Ethiopian Institute of Agricultural Research.
Methods
The Ethiopian Biodiversity Institute provided the genetic materials for this study and granted access to passport information for environmental data curation. Bioclimatic (bio1-bio19) and altitudinal data were extracted at the latitude and longitude of each collection site from WorldClim version 2.1 layers at 10-minute resolution, which represent long-term averages for 1970-2000 (Fick & Hijmans, 2017). Researchers at the Ethiopian Institute of Agricultural Research, including Tadelech Bizuneh, Adanech Teshome, and Doni Hinsene, extracted DNA from young leaf tissue of each greenhouse-grown tef accession. DNA concentration and quality were assessed using a 1.5% agarose gel and a NanoDrop spectrophotometer. The extracted DNA was sent to the BioFrontiers Sequencing Core at the University of Colorado Boulder for library preparation with 10 bp unique dual indexes. Genomic libraries of the Ethiopian tef accessions were then sent to the Genomics and Microarray Core at the University of Colorado Anschutz Medical Campus for paired-end whole genome sequencing (2 × 150 bp at ~10× coverage) using an Illumina NovaSeq 6000 platform. Details regarding the variant calling pipeline and downstream analyses are provided in the manuscript.
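To make the coordinate-based lookup of bioclimatic values more concrete, the sketch below shows the index arithmetic for locating a latitude/longitude pair in a global 10 arc-minute grid. It assumes a north-up raster spanning -180 to 180 degrees longitude and -90 to 90 degrees latitude with its origin at the top-left corner; the function and constant names are hypothetical. In practice a raster library would normally handle this step, and the tool actually used for this dataset is not stated here.

```python
CELLS_PER_DEGREE = 6             # 10 arc-minutes = 1/6 degree
N_COLS = 360 * CELLS_PER_DEGREE  # 2160 columns in a global grid
N_ROWS = 180 * CELLS_PER_DEGREE  # 1080 rows in a global grid

# The 19 bioclimatic variable labels referred to as bio1-bio19.
BIOCLIM_NAMES = [f"bio{i}" for i in range(1, 20)]


def cell_index(lat, lon):
    """Return (row, col) of the grid cell containing a point in decimal degrees.

    Row 0 is the northernmost row and column 0 the westernmost column
    (an assumption of this sketch about the raster layout).
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    # Clamp so that points exactly on the southern or eastern edge
    # fall into the last row or column rather than outside the grid.
    row = min(int((90.0 - lat) * CELLS_PER_DEGREE), N_ROWS - 1)
    col = min(int((lon + 180.0) * CELLS_PER_DEGREE), N_COLS - 1)
    return row, col


def sample_grid(grid, lat, lon):
    """Look up the value at a coordinate in a grid stored as a list of rows."""
    row, col = cell_index(lat, lon)
    return grid[row][col]


if __name__ == "__main__":
    # Hypothetical site coordinates, for illustration only.
    print(cell_index(9.0, 38.7))
```

At this resolution each cell spans roughly 18.5 km at the equator, so nearby collection sites can share identical environmental values.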