Genetic insight into a polygenic trait using a novel Genome Wide Association approach in a wild amphibian population
Data files
Mar 21, 2024 version files 159.35 MB
- females.vcf
- GRM_imputed.txt
- GWAS_traits.txt
- imputed.vcf
- phenotypic_traits.xlsx
- README.md
Abstract
Body size variation is central in the evolution of life history traits in amphibians, but the underlying genetic architecture of this complex trait is still largely unknown. Herein, we studied the genetic basis of body size and fecundity of the alternative morphotypes in a wild population of the Greek smooth newt (Lissotriton graecus). By combining a genome-wide association approach with linkage disequilibrium network analysis, we were able to identify clusters of highly correlated loci, thus maximizing sequence data for downstream analysis. The putatively associated variants explained 12.8% to 44.5% of the total phenotypic variation in body size and were mapped to genes with functional roles in the regulation of gene expression and cell cycle processes. Our study is the first to provide insights into the genetic basis of complex traits in newts and provides a useful tool to identify loci potentially involved in fitness-related traits in small data sets from natural populations of non-model species.
README: Genetic insight into a polygenic trait using a novel Genome Wide Association approach in a wild amphibian population
https://doi.org/10.5061/dryad.f4qrfj739
We used a ddRAD sequencing approach to explore the genetic basis of polygenic traits, such as body size and fecundity, in a single population of the Greek smooth newt, combining an association analysis in Tassel with linkage disequilibrium network analysis in LDna.
Description of the data and file structure
1. phenotypic_traits.xlsx: Excel file containing phenotypic measurements for the samples used in this study.
2. females.vcf: Variant Call Format file with SNP information for 102,403 sites and 38 individuals.
3. imputed.vcf: Variant Call Format file with imputed genotypic information based on the females.vcf file.
4. GRM_imputed.txt: A genetic relationship matrix generated using the imputed genotypic dataset (imputed.vcf).
5. GWAS_traits.txt: Transformed phenotypic measurements used in the association analysis and the LD (linkage disequilibrium) network analysis.
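As a quick sanity check on the VCF files, the sample and site counts can be read straight from the text. The sketch below is illustrative only: the function name `summarize_vcf` and the demo records are made up, and it assumes an uncompressed, tab-delimited VCF with the standard nine fixed columns before the sample columns.

```python
import io

def summarize_vcf(handle):
    """Return (sample_ids, n_sites) for an uncompressed VCF text stream."""
    samples = []
    n_sites = 0
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        if line.startswith("#CHROM"):
            # Columns 1-9 are CHROM..FORMAT; sample IDs start at column 10.
            samples = line.split("\t")[9:]
            continue
        n_sites += 1  # every remaining line is one variant record
    return samples, n_sites

# Tiny made-up VCF used only to demonstrate the function.
demo = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2\n"
    "1\t10\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1\n"
    "1\t25\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t./.\n"
)
samples, n_sites = summarize_vcf(demo)
print(len(samples), n_sites)
```

Applied to females.vcf via `open("females.vcf")`, the counts would be expected to match the 38 individuals and 102,403 sites stated above, provided the file follows the layout assumed here.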
DATA-SPECIFIC INFORMATION FOR: phenotypic_traits.xlsx
A. Number of individuals: 61
B. List of variables
-id: the id name of each individual
-sex: the sex of each individual, female or male
-morph: the morph of each individual, metamorphic or paedomorphic
-svl: snout to vent length in millimetres
-logsvl: log transformed snout to vent length
-weight: body weight in grams
-logw: log transformed weight
-body condition: Body weight was regressed on body size after log transformation, and the residual distances from the linear regression line were used to calculate Body Condition Index (BCI; Jakob et al., 1996).
-egg_laying: whether the individual laid eggs; zero for no eggs and one for egg laying (binary variable)
-eggs: the number of eggs deposited for each individual
-hatched: the number of successfully hatched eggs for each individual
-stage_larvae: number of larvae that developed limbs
-stage_juvenile: number of larvae that reached the juvenile stage
-fecundity1: square-root-transformed number of laid eggs
-fecundity2: square-root-transformed number of successfully hatched eggs
-fecundity3: square-root-transformed number of larvae that survived to the juvenile stage
C. Missing cells: None
D. Empty cells: Correspond to females that did not engage in oviposition (egg_laying is zero) in the present study.
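The derived columns described above (logsvl, logw, body condition, fecundity1) can be reproduced in outline as follows. This is a minimal sketch, not the authors' script: the function names and demo values are hypothetical, the natural logarithm is an assumption (the log base is not stated), and empty cells are represented as `None`.

```python
import math

def ols_residuals(x, y):
    """Residuals of y from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def derive_traits(svl_mm, weight_g, eggs):
    """Return (logsvl, logw, bci, fecundity1) from raw measurements."""
    logsvl = [math.log(v) for v in svl_mm]   # assumed natural log
    logw = [math.log(w) for w in weight_g]   # assumed natural log
    # Body Condition Index: residuals of log weight regressed on log SVL.
    bci = ols_residuals(logsvl, logw)
    # Square-root transform of egg counts; empty cells stay empty.
    fecundity1 = [None if e is None else math.sqrt(e) for e in eggs]
    return logsvl, logw, bci, fecundity1

# Made-up measurements for four individuals (not taken from the data set).
svl = [30.0, 35.0, 40.0, 45.0]
weight = [1.0, 1.6, 2.1, 3.0]
eggs = [16, None, 0, 25]
logsvl, logw, bci, fecundity1 = derive_traits(svl, weight, eggs)
print(fecundity1)
```

Positive residuals indicate individuals heavier than expected for their length, negative residuals lighter ones; by construction the residuals sum to zero across the sample.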
DATA-SPECIFIC INFORMATION FOR: GWAS_traits.txt
A. Number of individuals: 38
B. List of variables
-ID: the id name of each individual
-sex: the sex of each individual, female or male
-morph: the morph of each individual, meta (metamorphic) or paedo (paedomorphic)
-size: log transformed snout to vent length (mm)
-condition: Body weight was regressed on body size after log transformation, and the residual distances from the linear regression line were used to calculate Body Condition Index (BCI; Jakob et al., 1996).
-fecundity1: square-root-transformed number of laid eggs
-fecundity2: square-root-transformed number of successfully hatched eggs
-fecundity3: square-root-transformed number of larvae that survived to the juvenile stage
C. Missing values are denoted by NA (data not applicable).
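When loading this trait table for analysis, the NA entries need to be treated as missing rather than as text. The sketch below shows one way to do that; it assumes a tab-delimited file whose header row uses exactly the variable names listed above, neither of which is confirmed here, and the function name `read_traits` and the demo rows are invented for illustration.

```python
import csv
import io

# Columns assumed to hold numbers (names taken from the variable list).
NUMERIC = ("size", "condition", "fecundity1", "fecundity2", "fecundity3")

def read_traits(handle, delimiter="\t"):
    """Read the trait table, converting NA to None and numbers to float."""
    rows = []
    for rec in csv.DictReader(handle, delimiter=delimiter):
        for col in NUMERIC:
            value = rec[col]
            rec[col] = None if value == "NA" else float(value)
        rows.append(rec)
    return rows

# Two made-up rows, used only to demonstrate the parsing.
demo = io.StringIO(
    "ID\tsex\tmorph\tsize\tcondition\tfecundity1\tfecundity2\tfecundity3\n"
    "f01\tfemale\tmeta\t3.7\t0.05\t4.0\t3.0\t2.0\n"
    "f02\tfemale\tpaedo\t3.5\t-0.02\tNA\tNA\tNA\n"
)
rows = read_traits(demo)
print(rows[1]["fecundity1"])
```

If the file turns out to be space- or comma-delimited, passing a different `delimiter` is the only change needed.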
Methods
Phenotypic measurements were collected from female Greek smooth newts, including snout to vent length, body weight and reproductive components.
Genomic DNA was extracted from tissue samples. ddRAD libraries were produced using an IGATech custom protocol (IGA Technology Services, Udine, Italy), with minor modifications with respect to Peterson’s double digest restriction-site associated DNA preparation (Peterson et al., 2012), using SphI (5’GCATG 3’) and BamHI (5’GGATCC 3’) endonucleases. Library pools were selected on a BluePippin (Sage Science Inc. Beverly, MA, USA) setting the range to 380–500 bp. The resulting libraries were checked with both a Qubit 2.0 Fluorometer (Invitrogen, Carlsbad, CA) and by Bioanalyzer DNA Assay (Agilent Technologies, Santa Clara, CA). Libraries were sequenced with 150 cycles in paired end mode on a NovaSeq 6000 Sequencing System following the manufacturer’s instructions (Illumina, San Diego, CA).
Raw reads were demultiplexed and trimmed to remove adaptors using the process_radtags utility included in Stacks v.2.0 (Catchen et al., 2013). Short reads were de novo assembled, catalogued and matched using the ustacks, cstacks, sstacks and tsv2bam (for paired-end reads) utilities in Stacks. SNPs were called using gstacks, which assembles and genotypes contigs. SNP filtering was done with the populations component included in Stacks.
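The order of the Stacks utilities after demultiplexing can be sketched as a list of command lines. This is only an outline under stated assumptions: the directory names, population map path and sample names are hypothetical, only the basic input/output options are shown, and the assembly and filtering parameters actually used in the study are not given here, so none are included. The code builds the commands without running them.

```python
def stacks_commands(samples, reads_dir, stacks_dir, popmap):
    """Build (not run) the de novo Stacks command lines in pipeline order."""
    cmds = []
    # ustacks: assemble loci per sample from the forward reads.
    for i, name in enumerate(samples, start=1):
        cmds.append(["ustacks", "-f", f"{reads_dir}/{name}.1.fq.gz",
                     "-o", stacks_dir, "-i", str(i), "--name", name])
    # cstacks: build the catalogue of loci across samples.
    cmds.append(["cstacks", "-P", stacks_dir, "-M", popmap])
    # sstacks: match each sample against the catalogue.
    cmds.append(["sstacks", "-P", stacks_dir, "-M", popmap])
    # tsv2bam: reorganise by locus and pull in the paired-end reads.
    cmds.append(["tsv2bam", "-P", stacks_dir, "-M", popmap, "-R", reads_dir])
    # gstacks: assemble contigs and call SNPs/genotypes.
    cmds.append(["gstacks", "-P", stacks_dir, "-M", popmap])
    # populations: filter SNPs and export a VCF.
    cmds.append(["populations", "-P", stacks_dir, "-M", popmap, "--vcf"])
    return cmds

# Hypothetical inputs for illustration.
cmds = stacks_commands(["f01", "f02"], "cleaned", "stacks", "popmap.tsv")
for cmd in cmds:
    print(" ".join(cmd))
```

Each inner list could be passed to `subprocess.run` once the paths and any filtering options have been set to match the actual analysis.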
Here we provide the VCF files used in the study, along with the imputed genotype file and the genetic relationship matrix (GRM) that were generated. We also provide files containing the individual phenotypic measurements used for the association analysis in Tassel and LDna.