Data from: A single domestication origin of adzuki bean in Japan and the evolution of domestication genes
Data files
Apr 04, 2025 version files 265.28 MB
- adzuki_GWAS_data.zip (257.17 MB)
- README.md (3.98 KB)
- selection_sweep_scanning.xlsx (8.11 MB)
Abstract
Adzuki is a significant legume representing East Asian culinary culture, yet the origin of its domestication remains debated. Using ~700 wild and cultigen accessions across Asia, we showed that the initial domestication happened 3-5 kya in central Japan during the Jomon period, followed by a range expansion into China and secondary hybridization with the Chinese wild population. We mapped, validated, and dated key genes associated with seed coat color evolution (VaPAP1 for loss of black and VaANR1 for gain of red colors). The frequency increase of seed-color-associated mutants substantially predated that of the yield-ensuring pod-non-shattering phenotype and the wild-cultigen divergence. Despite the moderate connections between VaANR1 and seed permeability, our results challenge the progressive view of domestication and support archaeobotanical evidence of early weak selection.
https://doi.org/10.5061/dryad.8w9ghx3xv
Description of the data and file structure
These data provide GWAS results (association data) for three adzuki bean traits: red seed coat color, mottled black seed coat color, and seed water permeability. Genotypic data were derived from sequencing reads mapped to the genome of either the cultivar Shumari or the wild accession Miyagi. The selection-sweep scanning was based on genome-wide π ratio, FST, XP-EHH, and XP-CLR values.
Files and variables
File: adzuki_GWAS_data.zip
Description:
- mapped_to_Miyagi_mottled_black_seedcoat.maf_0.05.assoc.txt: GWAS results for the mottled black seed coat color. Genotypic data are from the mapping results of the wild adzuki, Miyagi.
- mapped_to_Miyagi_red_seedcoat.maf_0.05.assoc.txt: GWAS results for the red seed coat color. Genotypic data are from the mapping results of the wild adzuki, Miyagi.
- mapped_to_Miyagi_water_permeability.maf_0.05.assoc.txt: GWAS results for the water permeability of seeds. Genotypic data are from the mapping results of the wild adzuki, Miyagi.
- mapped_to_Shumari_mottled_black_seedcoat.maf_0.05.assoc.txt: GWAS results for the mottled black seed coat color. Genotypic data are from the mapping results of the cultivated adzuki, Shumari.
- mapped_to_Shumari_red_seedcoat.maf_0.05.assoc.txt: GWAS results for the red seed coat color. Genotypic data are from the mapping results of the cultivated adzuki, Shumari.
- mapped_to_Shumari_water_permeability.maf_0.05.assoc.txt: GWAS results for the water permeability of seeds. Genotypic data are from the mapping results of the cultivated adzuki, Shumari.
These GWAS files contain the following columns:
- chr: Chromosome number where the SNP is located
- rs: SNP identifier (if available; otherwise it may be missing or a dot, ".")
- pos: Physical position of the SNP on the chromosome (in base pairs)
- n-miss: Number of missing genotype calls for this SNP
- allele1: The minor allele used for association testing
- allele0: The reference or major allele
- af: Allele frequency of allele1 in the sample population
- beta: Effect size of allele1, representing the estimated change in phenotype per allele
- se: Standard error of the beta estimate
- logl_h1: Log-likelihood of the alternative hypothesis (H1), which assumes a genetic effect
- l_remle: Restricted maximum likelihood (REML) estimate of lambda, the variance-component ratio in the GEMMA linear mixed model
- l_mle: Maximum likelihood (MLE) estimate of lambda
- p_wald: P-value from the Wald test, used to assess the significance of the SNP-phenotype association
- pval: P-value from the likelihood ratio test (LRT) for SNP significance
- p_score: P-value from the score test for SNP significance
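As an illustration of how these columns might be used, the following minimal sketch reads an association table and keeps SNPs that pass a Bonferroni cutoff. It assumes the files are tab-delimited with a header row that uses the column names listed above; the function name `significant_snps`, the choice of `p_wald`, and the 0.05 family-wise level are illustrative assumptions, not the thresholds used in the study.

```python
import csv
import io
import math


def significant_snps(handle, p_column="p_wald", alpha=0.05):
    """Return (threshold, hits) using a Bonferroni cutoff of alpha / number of SNPs.

    Assumes a tab-delimited table with a header containing 'chr', 'pos'
    and the chosen p-value column (hypothetical layout, see lead-in).
    """
    rows = list(csv.DictReader(handle, delimiter="\t"))
    if not rows:
        return None, []
    threshold = alpha / len(rows)
    hits = []
    for row in rows:
        p = float(row[p_column])
        if p <= threshold:  # NaN p-values fail this comparison and are skipped
            score = -math.log10(p) if p > 0 else math.inf
            hits.append((row["chr"], int(row["pos"]), p, score))
    hits.sort(key=lambda h: h[2])  # most significant first
    return threshold, hits


# Tiny made-up table, only to show the call; these are not real results.
demo = io.StringIO(
    "chr\trs\tpos\tp_wald\n"
    "1\t.\t1000\t0.5\n"
    "1\t.\t2000\t0.001\n"
    "2\t.\t3000\t0.2\n"
    "2\t.\t4000\t0.04\n"
)
threshold, hits = significant_snps(demo)
```

For a real file, the same function could be called on `open(path, newline="")` once the archive has been unpacked.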
File: selection_sweep_scanning.xlsx
Description:
- pi_ratio: π ratio of the genome between JP_W and North_C (JP_C, CN_Cn)
- Fst: FST values of the genome between JP_W and North_C (JP_C, CN_Cn)
- XP-EHH: XP-EHH values of the genome between JP_W and North_C (JP_C, CN_Cn)
- XP-CLR: XP-CLR values of the genome between JP_W and North_C (JP_C, CN_Cn)
These datasets contain the following columns:
- Population 1: name of the first population tested
- Population 2: name of the second population tested
- Chromosome: chromosome ID
- Position_start: start position of the region on the chromosome (in nucleotides)
- Position_end: end position of the region on the chromosome (in nucleotides)
- π ratio: π values from Population 1 divided by π values from Population 2
- FST: FST values of the region
- XP-EHH: XP-EHH values of the region
- XP-CLR: XP-CLR values of the region
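The π ratio column follows directly from its definition (π of Population 1 divided by π of Population 2). The sketch below recomputes it for a few regions and picks the highest-scoring ones. The per-population keys `pi1` and `pi2`, the helper names, and the top-fraction cutoff are hypothetical and only for illustration; the spreadsheet itself stores the ratio, not the two π values.

```python
def pi_ratio(pi_pop1, pi_pop2):
    """π of Population 1 divided by π of Population 2; None when undefined."""
    if pi_pop1 is None or pi_pop2 is None or pi_pop2 == 0:
        return None
    return pi_pop1 / pi_pop2


def top_windows(windows, key, fraction=0.05):
    """Return the windows whose `key` value lies in the top `fraction`.

    Windows with a missing value are ignored; at least one window is
    returned whenever any window has a value.
    """
    scored = [w for w in windows if w.get(key) is not None]
    scored.sort(key=lambda w: w[key], reverse=True)
    n_keep = max(1, int(len(scored) * fraction)) if scored else 0
    return scored[:n_keep]


# Made-up values, not taken from the dataset.
windows = [
    {"Chromosome": "chr1", "Position_start": 1, "Position_end": 10000, "pi1": 0.004, "pi2": 0.001},
    {"Chromosome": "chr1", "Position_start": 10001, "Position_end": 20000, "pi1": 0.002, "pi2": 0.002},
    {"Chromosome": "chr1", "Position_start": 20001, "Position_end": 30000, "pi1": 0.003, "pi2": 0.0},
    {"Chromosome": "chr1", "Position_start": 30001, "Position_end": 40000, "pi1": 0.001, "pi2": 0.002},
]
for w in windows:
    w["pi_ratio"] = pi_ratio(w["pi1"], w["pi2"])

top = top_windows(windows, "pi_ratio", fraction=0.25)
```

If Population 1 is the wild group, large ratios would point to regions where Population 2 has lost diversity, which is one common signature of a sweep.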
A total of 327 wild and cultigen adzuki accessions from across Asia were obtained from the National Agriculture and Food Research Organization (NARO) in Japan and the germplasm center of the Taiwan Agricultural Research Institute (TARI) in Taiwan. These accessions were re-sequenced by Illumina whole-genome sequencing (WGS), and the reads were mapped to the Shumari or Miyagi reference genome. GWAS was performed on this dataset with GEMMA. For the selection-sweep scanning, pure populations (JP_W, JP_C, CN_Cn) were extracted from the dataset, and a VCF containing invariant and biallelic sites was used in the analyses. π and FST values were calculated with pixy using a window size of 10,000 bp. XP-CLR was computed with an xpclr script, with the window size and step size set to 10 kb and 5 kb, respectively. For XP-EHH, the iHS values were first calculated for each population of interest with the R package rehh, using a window size of 10 kb. The ies2xpehh function was then used to calculate XP-EHH values in each window between the two groups of interest.
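To make the windowing concrete, here is a minimal sketch of how overlapping windows with a 10 kb size and 5 kb step could be laid out along a chromosome. The 1-based inclusive coordinates, the handling of trailing partial windows, and the function name `sliding_windows` are assumptions for illustration; the tools named above may define window boundaries differently.

```python
def sliding_windows(chrom_length, window=10_000, step=5_000):
    """Yield 1-based inclusive (start, end) windows along a chromosome.

    The final windows are truncated at the chromosome end rather than dropped.
    """
    start = 1
    while start <= chrom_length:
        end = min(start + window - 1, chrom_length)
        yield start, end
        start += step


# Example on a hypothetical 22 kb chromosome.
example = list(sliding_windows(22_000))
```

With a step equal to half the window size, each position away from the chromosome ends is covered by two windows, so neighbouring XP-CLR values are not independent.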