
Data from: Diversity and population structure of northern switchgrass as revealed through exome capture sequencing

Cite this dataset

Evans, Joseph et al. (2016). Data from: Diversity and population structure of northern switchgrass as revealed through exome capture sequencing [Dataset]. Dryad. https://doi.org/10.5061/dryad.nh8ph

Abstract

Switchgrass (Panicum virgatum L.) is a polyploid, perennial grass species that is native to North America and is being developed as a future biofuels feedstock crop. Switchgrass is present primarily in two ecotypes: a northern upland ecotype composed of tetraploid and octoploid accessions, and a southern lowland ecotype composed primarily of tetraploid accessions. We employed high-coverage exome capture sequencing (~2.4 Tb) to genotype 537 individuals from 45 upland and 21 lowland populations. From these data, we identified ~27 million single nucleotide polymorphisms (SNPs), of which 1,590,653 high-confidence SNPs were used in downstream analyses of diversity within and between the populations. From the 66 populations, we identified five primary population groups within the upland and lowland ecotypes, a result that was further supported through genetic distance analysis. We identified conserved, ecotype-restricted non-synonymous SNPs that are predicted to impact protein function in genes that encode CONSTANS (CO) and EARLY HEADING DATE 1 (EHD1), key genes involved in flowering, which may contribute to the phenotypic differences between the two ecotypes. We also identified, relative to the near-reference Kanlow population, 17,228 up-copy number variants (CNVs), 112,630 down-CNVs, and 14,430 presence/absence variants (PAVs) impacting a total of 9,979 genes, including two upland-specific CNV clusters. In total, 45,719 genes were impacted by a SNP, CNV, or PAV across the panel, providing a firm foundation to identify functional variation associated with phenotypic traits of interest for biofuel feedstock production.

Usage notes