Phenotypic variation and genome-wide association studies of main culm panicle node number, maximum node production rate, and degree-days to heading in rice
Cite this dataset
Sanchez, Darlene et al. (2022). Phenotypic variation and genome-wide association studies of main culm panicle node number, maximum node production rate, and degree-days to heading in rice [Dataset]. Dryad. https://doi.org/10.5061/dryad.4qrfj6qbs
Abstract
To understand the genetic basis of main culm panicle node number, maximum node production rate, and degree-days to heading in rice (Oryza sativa), we conducted genome-wide association studies using a diversity panel of 220 rice accessions and 854,832 single nucleotide polymorphism (SNP) markers generated by genotyping-by-sequencing (GBS) at 1X coverage. The raw genotype data were filtered to retain SNPs with less than 50% missing data and a minor allele frequency (MAF) greater than 5%. After this initial filtering, imputation was conducted on 1,075,302 SNP markers using BEAGLE V4.0. After imputation, the dataset was filtered a second time by removing SNPs with MAF below 5% and SNPs with more than 5% missing data. A total of 854,832 SNPs were used in the genome-wide association analyses. The genotype dataset of 854,832 SNP markers by 220 rice accessions is presented here.
Methods
The single nucleotide polymorphism (SNP) markers were generated by genotyping-by-sequencing (GBS) at 1X coverage. The raw genotype data were filtered to retain SNPs with less than 50% missing data and a minor allele frequency (MAF) greater than 5%. After this initial filtering, imputation was conducted on 1,075,302 SNP markers using BEAGLE V4.0. After imputation, the dataset was filtered a second time by removing SNPs with MAF below 5% and SNPs with more than 5% missing data. A total of 854,832 SNPs were used in the genome-wide association analyses.
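The two filtering passes described above can be illustrated with a minimal Python sketch. This is not the pipeline used to produce the dataset. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with NaN for missing calls, arranged as SNPs (rows) by accessions (columns). The function name `filter_snps`, the toy matrix, and the treatment of values exactly at a threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def filter_snps(genotypes, max_missing, min_maf):
    """Return a boolean mask of SNPs (rows) that pass both thresholds.

    Assumed coding: 0/1/2 alternate-allele counts, NaN = missing call.
    SNPs exactly at a threshold are kept here; that boundary handling
    is an assumption, not something the dataset description specifies.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    n_called = (~np.isnan(genotypes)).sum(axis=1)
    missing_rate = 1.0 - n_called / genotypes.shape[1]
    with np.errstate(invalid="ignore", divide="ignore"):
        # Alternate-allele frequency among called (diploid) genotypes.
        alt_freq = np.nansum(genotypes, axis=1) / (2.0 * n_called)
        # Minor allele frequency is the smaller of the two allele frequencies.
        maf = np.minimum(alt_freq, 1.0 - alt_freq)
        # SNPs with no calls have NaN MAF and fail the comparison.
        return (missing_rate <= max_missing) & (maf >= min_maf)

# Toy matrix: 4 SNPs by 4 accessions (illustrative values only).
geno = np.array([
    [0, 1, 2, 0],                 # no missing data, MAF 0.375
    [0, 0, 0, 0],                 # monomorphic, MAF 0
    [np.nan, np.nan, np.nan, 2],  # 75% missing
    [0, 2, np.nan, 0],            # 25% missing, MAF about 0.33
])

first_pass = filter_snps(geno, max_missing=0.50, min_maf=0.05)
second_pass = filter_snps(geno, max_missing=0.05, min_maf=0.05)
```

In this toy example the looser first-pass thresholds keep the first and fourth SNPs, while the stricter second-pass missing-data threshold keeps only the first. In the workflow described above, the second pass is applied to the imputed genotypes rather than to the raw calls.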
Funding
Texas A&M University
Texas Rice Research Foundation