Data from: Genome-wide association analyses in the model rhizobium Ensifer meliloti
Data files
Oct 15, 2018 version files 1.93 GB
- assembly.zip
- bslmm.zip
- figs_tables.zip
- gwas.zip
- ld_groups.zip
- model_selection.zip
- ncbi.zip
- pav_calls.zip
- pheno_data.zip
- popgen.zip
- README_for_assembly.assembly
- README_for_bslmm.bslmm
- README_for_figs_tables.figs_tables
- README_for_gwas.gwas
- README_for_ld_groups.ld_groups
- README_for_model_selection.model_selection
- README_for_ncbi.ncbi
- README_for_pav_calls.pav_calls
- README_for_pheno_data.pheno_data
- README_for_popgen.popgen
- README_for_reference_genome.reference_genome
- README_for_snp_calls.snp_calls
- README_for_strain_info.strain_info
- README_for_variant_filtering_for_analysis.variant_filtering_for_analysis
- README.txt
- reference_genome.zip
- snp_calls.zip
- strain_info.zip
- variant_filtering_for_analysis.zip
Abstract
Genome-wide association studies (GWAS) can identify genetic variants responsible for naturally occurring and quantitative phenotypic variation and therefore provide a powerful complement to approaches that rely on de novo mutations for characterizing gene function. Although bacteria should be amenable to GWAS, few GWAS have been conducted on bacteria, and the extent to which non-independence among genomic variants (e.g., linkage disequilibrium, LD) and the genetic architecture of phenotypic traits will affect GWAS performance is unclear. We apply association analyses to identify candidate genes underlying variation in 20 biochemical, growth, and symbiotic phenotypes among 153 strains of Ensifer meliloti. For 10 traits we find genotype-phenotype associations that are stronger than expected by chance, with the candidates in relatively small linkage groups, indicating that LD does not preclude resolving association candidates to relatively small genomic regions. The significant candidates show an enrichment for nucleotide polymorphisms (SNPs) over gene presence-absence variation (PAV), and for five traits, candidates are enriched in large linkage groups, a possible signature of epistasis. Many of the variants most strongly associated with symbiosis phenotypes were in genes previously known to be involved in nitrogen fixation or nodulation. For other traits, apparently strong associations were not stronger than the range of associations detected in permuted data. In sum, our data show that GWAS in bacteria may be a powerful tool for characterizing genetic architecture and identifying genes responsible for phenotypic variation; however, careful evaluation of candidates is necessary to avoid false signals of association.
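The comparison against permuted data mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the use of a plain Pearson correlation as the association statistic, and the simple shuffling of phenotype values are all assumptions made for illustration. In particular, naive shuffling ignores population structure and relatedness among strains, which a real analysis would need to handle.

```python
import random


def correlation(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return sxy / (sxx * syy) ** 0.5


def max_abs_association(genotypes, phenotype):
    """Strongest |r| between the phenotype and any variant.

    `genotypes` is a list of variants, each a list of 0/1 calls per strain
    (hypothetical layout, not the format of the archived files).
    """
    return max(abs(correlation(g, phenotype)) for g in genotypes)


def permutation_p_value(genotypes, phenotype, n_perm=1000, seed=0):
    """Share of phenotype shuffles whose top association matches or beats
    the observed top association (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = max_abs_association(genotypes, phenotype)
    shuffled = list(phenotype)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # small tolerance guards against floating-point round-off in ties
        if max_abs_association(genotypes, shuffled) >= observed - 1e-12:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

A large permutation p-value for a trait corresponds to the situation described above, where an apparently strong top association falls within the range seen in permuted data.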