Original genotype data of 159 wheat samples
Data files
Mar 06, 2024 version files 19.41 MB
- original_genotype_data.xlsb
- README.md
Abstract
This dataset contains the original 55K SNP genotype data for all 159 wheat samples used in the study, including SNP site IDs, chromosomes and positions, allele information, and the genotype of each material at each SNP site.
README: Original genotype data of 159 wheat samples
https://doi.org/10.5061/dryad.4xgxd25ht
This dataset contains genotype data for a total of 53063 SNP loci from 159 wheat samples, including 4003 SNP loci whose chromosome or position could not be determined.
Description of data and file structure
The original dataset contains the following columns: SNP site ID, chromosome, position on the chromosome, allele A, allele B, and one column per wheat material. For quality control, first remove the SNP sites whose chromosome or position cannot be determined, then rename the columns to: rs#, alleles, chrom, pos, strand, assembly#, center, protLSID, assayLSID, panelLSID, QCcode, followed by the wheat materials. Here, "rs#", "chrom", and "pos" hold the SNP site ID, chromosome, and position on the chromosome; the "alleles" column combines allele A and allele B, separated by "/"; the "strand" column is "+" for every site; the columns "assembly#", "center", "protLSID", "assayLSID", "panelLSID", and "QCcode" are all filled with "NA"; and the wheat material columns remain unchanged. After the original data have been modified in this way, high-quality loci can be obtained by minor allele frequency (MAF) filtering in VCFtools.
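The restructuring described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the script used in the study: it assumes the sheet has first been exported from `original_genotype_data.xlsb` to tab-delimited text, that the five annotation columns carry the hypothetical names `SNP_ID`, `Chr`, `Pos`, `AlleleA`, and `AlleleB`, that an undetermined chromosome or position appears as an empty cell, `NA`, `-`, or `Unknown`, and that genotype calls are two-letter strings such as `AA` or `AG`. The output column names use the spellings of the HapMap genotype format (`chrom`, `assayLSID`, `QCcode`), which the list above appears to follow. The `minor_allele_frequency` helper only illustrates the quantity that the MAF filter thresholds; it is not a reimplementation of VCFtools.

```python
import csv
import io
from collections import Counter

# Hypothetical names of the annotation columns in the exported table;
# the real header in original_genotype_data.xlsb may differ.
ID_COL, CHR_COL, POS_COL, A_COL, B_COL = "SNP_ID", "Chr", "Pos", "AlleleA", "AlleleB"

# Assumed markers for an undetermined chromosome or position.
UNKNOWN = {"", "NA", "-", "Unknown"}

# Columns that the README says are filled with "NA".
FIXED_NA = ["assembly#", "center", "protLSID", "assayLSID", "panelLSID", "QCcode"]


def to_renamed_rows(rows, sample_names):
    """Drop unplaced SNPs and yield rows in the renamed column layout."""
    for row in rows:
        chrom = row[CHR_COL].strip()
        pos = row[POS_COL].strip()
        if chrom in UNKNOWN or pos in UNKNOWN:
            continue  # chromosome or position not determined
        out = {
            "rs#": row[ID_COL],
            "alleles": f"{row[A_COL]}/{row[B_COL]}",  # allele A and B joined by "/"
            "chrom": chrom,
            "pos": pos,
            "strand": "+",
        }
        out.update({name: "NA" for name in FIXED_NA})
        out.update({s: row[s] for s in sample_names})  # genotypes unchanged
        yield out


def convert(src, dst):
    """Read the tab-delimited export from src and write the renamed table to dst."""
    reader = csv.DictReader(src, delimiter="\t")
    annotation = {ID_COL, CHR_COL, POS_COL, A_COL, B_COL}
    samples = [c for c in reader.fieldnames if c not in annotation]
    header = ["rs#", "alleles", "chrom", "pos", "strand", *FIXED_NA, *samples]
    writer = csv.DictWriter(dst, fieldnames=header, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(to_renamed_rows(reader, samples))


def minor_allele_frequency(calls):
    """Frequency of the second most common allele among two-letter calls.

    Calls that are not two letters from A, C, G, T are treated as missing
    (an assumption about how missing genotypes are encoded).
    """
    counts = Counter()
    for call in calls:
        call = call.upper()
        if len(call) == 2 and set(call) <= set("ACGT"):
            counts.update(call)  # adds one count per allele letter
    if len(counts) < 2:
        return 0.0  # monomorphic site or no usable calls
    return sorted(counts.values(), reverse=True)[1] / sum(counts.values())
```

To use the sketch, open the exported text file and an output file and pass both to `convert`. In VCFtools itself the MAF filter is applied with the `--maf` option on VCF input; the threshold used in the study is not stated here.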