Data from: Genomic and transcriptomic analyses reveal polygenic architecture for ecologically-important functional traits in aspen (Populus tremuloides Michx.)
Data files
Sep 25, 2023 version files 49.80 MB
Abstract
Intraspecific genetic variation in foundation species such as aspen (Populus tremuloides Michx.) shapes their impact on forest structure and function. Identifying genes underlying ecologically important traits is key to understanding that impact. Previous studies, using single-locus genome-wide association (GWA) analyses to identify candidate genes, have identified fewer genes than anticipated for highly heritable quantitative traits. Mounting evidence suggests that polygenic control of quantitative traits is largely responsible for this “missing heritability” phenomenon. Our research characterized the genetic architecture of 30 ecologically important traits using a common garden of aspen through genomic and transcriptomic analyses. A multilocus association model revealed that most traits displayed a highly polygenic architecture, with most variation explained by loci with small effects (likely below the detection levels of single-locus GWA methods). Consistent with a polygenic architecture, our single-locus GWA analyses found only 38 significant SNPs in 22 genes across 15 traits. Next, we used differential expression analysis on a subset of aspen genets with divergent concentrations of salicinoid phenolic glycosides (key defense traits). This complementary method to traditional GWA discovered 1,243 differentially expressed genes for a polygenic trait. Soft clustering analysis revealed three gene clusters (241 candidate genes) involved in secondary metabolite biosynthesis and regulation. Our work reveals that ecologically important traits governing higher-order community- and ecosystem-level attributes of a foundation forest tree species have complex underlying genetic structures and will require methods beyond traditional GWA analyses to unravel.
README: WisAsp genomic and transcriptomic data from: Genomic and transcriptomic analyses reveal polygenic architecture for ecologically-important functional traits in aspen (Populus tremuloides Michx.)
This dataset includes input files containing phenotypic, genomic, and transcriptomic data from a common garden of Populus tremuloides genets. The data were used to perform genome-wide association (GWA) analyses in PLINK (single-locus GWA) and GEMMA (multilocus GWA) and differential expression analyses in DESeq2.
Description of the data and file structure
WisAsp_Tree_Trait_BLUPs
WisAsp_Tree_Trait_BLUPs.txt
Rank-transformed BLUPs (best linear unbiased predictors) of ecologically important tree traits, formatted for GWA analyses using PLINK and GEMMA.
The first two columns are sample sequence identifiers, followed by columns of BLUP values (one column per trait). See the metadata file for more information on the traits.
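The column layout described above (two identifier columns, then one BLUP column per trait) could be read with a short parser such as the sketch below. It assumes the file is whitespace-delimited, has a header row, and marks missing values as "NA"; none of these details is stated in this README, so check them against the file and the metadata before relying on it. The function name `read_blups` is hypothetical.

```python
import io


def read_blups(handle, has_header=True, missing="NA"):
    """Parse a phenotype table: two ID columns, then one BLUP column per trait.

    Assumptions (not confirmed by the README): whitespace-delimited fields,
    an optional header row, and `missing` as the missing-value code.
    """
    rows = [line.split() for line in handle if line.strip()]
    if has_header:
        header, rows = rows[0], rows[1:]
        trait_names = header[2:]
    else:
        # Fall back to positional names when the file has no header row.
        trait_names = [f"trait_{i + 1}" for i in range(len(rows[0]) - 2)]

    blups = {}
    for fields in rows:
        sample_id = (fields[0], fields[1])  # the two identifier columns
        values = [None if v == missing else float(v) for v in fields[2:]]
        blups[sample_id] = dict(zip(trait_names, values))
    return trait_names, blups


# With the real file (path assumed to be the working directory):
#     with open("WisAsp_Tree_Trait_BLUPs.txt") as fh:
#         trait_names, blups = read_blups(fh)
```

Keying each record on both identifier columns keeps the samples aligned with the rows of the .fam file described below.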
WisAsp_MAF005 bed file
WisAsp_MAF005.bed
WisAsp bed file for genome-wide association analyses (single-locus, multi-trait, and multilocus) using PLINK and GEMMA
WisAsp_MAF005 bim file
WisAsp_MAF005.bim
WisAsp bim file for genome-wide association analyses (single-locus, multi-trait, and multilocus) using PLINK and GEMMA
WisAsp_MAF005 fam file
WisAsp_MAF005.fam
WisAsp fam file for genome-wide association analyses (single-locus, multi-trait, and multilocus) using PLINK and GEMMA
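The .bed, .bim, and .fam files form one PLINK binary fileset and must stay together under the same prefix. The sketch below parses the two text files using the standard PLINK 1 column layout (six columns each) and computes the size a variant-major .bed file should have for a given number of samples and variants, which is a quick way to confirm the three files belong together. The function and record names are hypothetical.

```python
from collections import namedtuple

# Standard PLINK 1 column layouts for the two text files of a binary fileset.
FamRecord = namedtuple("FamRecord", "fid iid father mother sex phenotype")
BimRecord = namedtuple("BimRecord", "chrom variant_id cm_pos bp_pos allele1 allele2")


def parse_fam(lines):
    """Return one FamRecord per sample (all fields kept as strings)."""
    return [FamRecord(*line.split()) for line in lines if line.strip()]


def parse_bim(lines):
    """Return one BimRecord per variant, with numeric positions converted."""
    records = []
    for line in lines:
        if not line.strip():
            continue
        chrom, variant_id, cm_pos, bp_pos, allele1, allele2 = line.split()
        records.append(
            BimRecord(chrom, variant_id, float(cm_pos), int(bp_pos), allele1, allele2)
        )
    return records


def expected_bed_size(n_samples, n_variants):
    """Bytes in a variant-major .bed file: 3 magic bytes, then
    ceil(n_samples / 4) bytes per variant (2 bits per genotype)."""
    return 3 + n_variants * ((n_samples + 3) // 4)


# With the real files (paths assumed to be the working directory):
#     import os
#     with open("WisAsp_MAF005.fam") as fh:
#         samples = parse_fam(fh)
#     with open("WisAsp_MAF005.bim") as fh:
#         variants = parse_bim(fh)
#     size_ok = os.path.getsize("WisAsp_MAF005.bed") == expected_bed_size(
#         len(samples), len(variants))
```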
gemma_bslmm_MAF005 script file
gemma_bslmm_MAF005.sh
Shell script to run GEMMA's Bayesian sparse linear mixed model (BSLMM) on a high-throughput computing system
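For orientation only, the sketch below assembles a generic GEMMA BSLMM command line for one trait. It is not the content of gemma_bslmm_MAF005.sh, which should be consulted for the options the authors actually used. The flags `-bfile`, `-bslmm`, `-n`, and `-o` are standard GEMMA options; the choice of `-bslmm 1` (the linear BSLMM), the output prefix, and the trait column number are assumptions for illustration. How the BLUP values are supplied to GEMMA (for example, as phenotype columns in the .fam file) is also not stated here.

```python
import shlex


def bslmm_command(bfile_prefix, trait_column, out_prefix, gemma="gemma"):
    """Build an argument list for a GEMMA BSLMM run on one trait.

    Hypothetical helper: `-bslmm 1` selects the linear BSLMM and `-n`
    selects the phenotype column; both values are assumptions here.
    """
    return [
        gemma,
        "-bfile", bfile_prefix,   # prefix shared by the .bed/.bim/.fam files
        "-bslmm", "1",            # assumed: standard linear BSLMM
        "-n", str(trait_column),  # phenotype column to analyse
        "-o", out_prefix,         # prefix for GEMMA's output files
    ]


cmd = bslmm_command("WisAsp_MAF005", 3, "trait3_bslmm")
print(shlex.join(cmd))
# The list can be passed to subprocess.run(cmd, check=True) where GEMMA is installed.
```

Building the command as a list rather than a single string avoids shell-quoting problems when the same call is repeated across many trait columns.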
WisAsp_Expression_Counts input file
WisAsp_Expression_Counts.csv
Gene expression count data formatted for differential expression analysis using DESeq2
WisAsp_Expression_Sample_Table input reference file
WisAsp_Expression_Sample_Table.csv
Differential expression experimental design information formatted for differential expression analysis using DESeq2
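DESeq2 expects the columns of the count matrix and the rows of the sample table to refer to the same samples in the same order. The sketch below checks that before the files are loaded in R. It assumes that the first column of the counts file holds gene identifiers with sample names in the header row, and that the first column of the sample table holds sample names below a header row; these layouts are not described in this README and should be verified against the files. The function name is hypothetical.

```python
import csv
import io


def check_deseq2_inputs(counts_handle, samples_handle):
    """Compare sample names in the counts header with those in the sample table.

    Assumed layouts: counts CSV = gene ID column + one column per sample;
    sample table CSV = header row, then one row per sample with the sample
    name in the first column.
    """
    counts_header = next(csv.reader(counts_handle))
    count_samples = counts_header[1:]  # skip the gene ID column

    sample_rows = list(csv.reader(samples_handle))
    table_samples = [row[0] for row in sample_rows[1:] if row]  # skip header

    return {
        "same_set": set(count_samples) == set(table_samples),
        "same_order": count_samples == table_samples,
        "only_in_counts": sorted(set(count_samples) - set(table_samples)),
        "only_in_table": sorted(set(table_samples) - set(count_samples)),
    }


# With the real files (paths assumed to be the working directory):
#     with open("WisAsp_Expression_Counts.csv", newline="") as counts, \
#          open("WisAsp_Expression_Sample_Table.csv", newline="") as samples:
#         report = check_deseq2_inputs(counts, samples)
```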
Metadata file
Dryad_Lind-Riehl_etal_MolEcol_2023_metadata.xlsx
Metadata for WisAsp genomic and transcriptomic data files including specifics on how data were obtained; also includes a tab entitled "Tree trait file" that lists the trait column number assignment, category, names, abbreviations, and units
Sharing/Access information
Raw DNA sequence data for all samples included in the genome-wide association analyses are available through the European Nucleotide Archive under accession number PRJEB30919. Raw RNA sequence data for all samples included in the differential expression analysis are available through the National Center for Biotechnology Information’s Sequence Read Archive under accession number PRJNA851830.
Usage notes
WisAsp_Tree_Trait_BLUPs
Rank-transformed BLUPs (best linear unbiased predictors) of ecologically important tree traits, formatted for GWAS analyses
WisAsp_MAF005 bed file
WisAsp bed file for genome-wide association analyses (single-locus, multi-trait, and multilocus) using PLINK and GEMMA
WisAsp_MAF005 bim file
WisAsp bim file for genome-wide association analyses (single-locus, multi-trait, and multilocus) using PLINK and GEMMA
WisAsp_MAF005 fam file
WisAsp fam file for genome-wide association analyses (single-locus, multi-trait, and multilocus) using PLINK and GEMMA
gemma_bslmm_MAF005 script file
Shell script to run GEMMA's Bayesian sparse linear mixed model (BSLMM) on a high-throughput computing system
WisAsp_Expression_Counts.csv
Gene expression count data formatted for differential expression analysis
WisAsp_Expression_Sample_Table.csv
Differential expression experimental design information formatted for differential expression analysis
Dryad_Lind-Riehl_etal_MolEcol_2023_metadata.xlsx
Metadata for WisAsp genomic and transcriptomic data files