Data from: QTL mapping genotype and phenotype data, Vanilla x MCM5001
Data files
Aug 02, 2024 version files (2.37 MB)
- README.md (2.08 KB)
- Supplemental_data__EVM_population.csv (2.37 MB)
Abstract
Pod quality and yield traits in snap bean (Phaseolus vulgaris L.) influence consumer preferences, crop adoption by farmers, and the ability of the product to be commercially competitive locally and globally. The objective of the study was to identify the quantitative trait loci (QTL) for pod quality and yield traits in a snap × dry bean recombinant inbred line (RIL) population. A total of 184 F6 RILs derived from a cross between Vanilla (snap bean) and MCM5001 (dry bean) were grown in three field sites in Kenya and one greenhouse environment in Davis, CA, USA. They were genotyped at 5,951 single nucleotide polymorphisms (SNPs), and composite interval mapping was conducted to identify QTL for 16 pod quality and yield traits, including pod wall fiber, pod string, pod size, and harvest metrics. A combined total of 44 QTL were identified in field and greenhouse trials. The QTL for pod quality were identified on chromosomes Pv01, Pv02, Pv03, Pv04, Pv06, and Pv07, and for pod yield were identified on Pv08. Co-localization of QTL was observed for pod quality and yield traits. Some identified QTL overlapped with previously mapped QTL for pod quality and yield traits, with several others identified as novel. The identified QTL can be used in future marker-assisted selection in snap bean.
This is phenotypic and genotypic data on an F6 recombinant inbred population of Phaseolus vulgaris. The parent lines are Vanilla (a snap bean) and MCM5001 (a dry bean). Sixteen phenotypic traits were measured in the population, and the lines were genotyped using genotyping by sequencing.
Data is formatted to be read into R/qtl (Broman et al., 2003; 10.1093/bioinformatics/btg112). Briefly, phenotypic data is found to the left of the column with sample names, and genotype data is found to the right of this sample name column. "-" is used in the phenotype section for missing data and in the genotype section for missing or heterozygous calls. The two rows directly above the phenotype data and sample names are intentionally left blank, because R/qtl uses these rows only for the linkage group and linkage distance (cM) of the genotype data.
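The layout described above can be split programmatically. The following is a minimal Python sketch, not part of the dataset or of R/qtl, that separates phenotype columns from marker columns by looking at the linkage-group row, which is blank under phenotypes and sample names. The toy table, its column names (PL, id, m1, m2), the A/B genotype codes, and the helper name split_rqtl_csv are assumptions for illustration only.

```python
import csv
import io

MISSING = "-"  # missing-data symbol described in the README


def split_rqtl_csv(handle):
    """Split an R/qtl-style CSV into marker info and per-line records."""
    rows = list(csv.reader(handle))
    header, chrom_row, pos_row, data = rows[0], rows[1], rows[2], rows[3:]
    # A blank cell in the linkage-group row marks a phenotype or sample-name column.
    pheno_idx = [i for i, c in enumerate(chrom_row) if c.strip() == ""]
    marker_idx = [i for i, c in enumerate(chrom_row) if c.strip() != ""]
    markers = {header[i]: (chrom_row[i], float(pos_row[i])) for i in marker_idx}
    records = []
    for row in data:
        # Convert the missing-data symbol to None, keep everything else as text.
        clean = [None if cell.strip() == MISSING else cell.strip() for cell in row]
        records.append({
            "pheno": {header[i]: clean[i] for i in pheno_idx},
            "geno": {header[i]: clean[i] for i in marker_idx},
        })
    return markers, records


# Hypothetical four-column table in the same layout (one phenotype, two markers)
toy = io.StringIO(
    "PL,id,m1,m2\n"
    ",,1,1\n"
    ",,0.0,2.5\n"
    "12.4,RIL_001,A,B\n"
    "-,RIL_002,-,A\n"
)
markers, records = split_rqtl_csv(toy)
print(markers)
print(records[1])
```

The sketch assumes that both the linkage-group row and the position row are present, as stated above; the sample-name column ends up in the "pheno" group because its cell in the linkage-group row is also blank.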
Phenotype data explanation:
Pod shape (PS): 0 = round; 1 = flat
Pod length (PLF): Measured in cm
Pod suture string scale (PSSS): 0-10 scale. 0 = minimum string; 10 = maximum string
Pod fiber scale (PFS): 0-10 scale. 0 = minimum fiber; 10 = maximum fiber
Pod suture string (PSS): 0 = minimum string; 1 = maximum string
Pod wall fiber (PWF): 0 = no fiber; 1 = fiber
Pod shattering (PSH): 0 = no shattering; 1 = shattering
Pod weight per plant (PWPP): Measured in grams
Pods per plant (PPP): Count of pods per plant
Pod length (PL): Measured in cm
Pod diameter (PD): Measured in mm
Pod string fresh pods (PSFP): 0 = no string; 1 = string
Pod fiber fresh pods (PFFP): 0-2 scale. 0 = minimum fiber; 2 = maximum fiber
Pod shape (PSh): 0-1 scale. 0 = round; 1 = flat
Pod string dry pods (PSDP): 0-10 scale. 0 = minimum string; 10 = maximum string
Pod fiber dry pods (PFDP): 0-10 scale. 0 = minimum fiber; 10 = maximum fiber
See Njau et al. (2024), “QTL mapping for pod quality and yield traits in snap bean (Phaseolus vulgaris L.)”, Frontiers in Plant Science, for more information: https://doi.org/10.5061/dryad.s7h44j1g7
A biparental mapping population of 184 F6 recombinant inbred lines (RILs) derived from a cross between Vanilla (female parent) and MCM5001 was used for linkage mapping and QTL analysis. Vanilla snap bean is produced by the Vilmorin Company in France (https://www.vilmorinmikado.fr/haricots/vanilla) and cultivated in Kenya, mainly for export. The variety has white seeds and fine market class pods (6-9 mm in diameter), and is resistant to bean common mosaic virus (BCMV), halo blight (Pseudomonas syringae pv. phaseolicola), and rust (Uromyces appendiculatus). MCM5001 is a dry bush bean bred for resistance to BCMV and bean common mosaic necrosis virus (BCMNV) by the International Center for Tropical Agriculture (CIAT, Cali, Colombia). Its seeds are brown and cream speckled. The RILs were developed through single seed descent (SSD) in an insect-free greenhouse at the University of Embu (37° 27’ E, 0° 30’ S).
Plant growth conditions
The 184 RILs and their parents were evaluated in three field sites in Kenya: (i) Kutus farm in Kirinyaga County (37° 19’ E, 0° 33’ S; 1,279 masl), (ii) Don Bosco farm in Embu County (37° 29’ E, 0° 34’ S; 1,259 masl), and (iii) Mariira farm in Murang’a County (36° 56’ E, 0° 47’ S; 1,255 masl), as well as in a greenhouse at the University of California, Davis (121° 45’ W, 38° 32’ N). In the greenhouse, seeds of the parents and RILs were sown in pots filled with 5 kg of topsoil. The soils at the Kutus and Mariira farms were classified as Humic Nitisols, while those at Don Bosco were Nito-Rhodic Ferralsols. The three field trials were conducted during the short rain season of 2022 and supplemented with irrigation. The field trials used a randomized complete block design (RCBD) with three replications, while the greenhouse experiment used a completely randomized design (CRD) with a single replicate. In the field, each RIL and parent was planted in a single-row plot 2 m long, with 20 cm between plants and 50 cm between rows. The fields were plowed and harrowed to achieve a moderate tilth seedbed. Di-ammonium phosphate (18-46-0) fertilizer was applied at a rate of 200 kg ha⁻¹ and thoroughly mixed with the soil. During flowering, the plants were top-dressed with calcium ammonium nitrate (27-0-0) at a rate of 100 kg ha⁻¹. Cultural practices were carried out to keep the fields free of pests, diseases, and weeds.
Phenotypic data collection
Eight snap bean traits were evaluated (pod wall fiber, pod string, pod diameter, pod length, pod weight per plant, pod number per plant, pod shape, and pod shattering), although pod string and pod wall fiber were each evaluated using more than one criterion under both field and greenhouse conditions, as shown in Table 1. Unique codes in Table 1 distinguish the data gathered in the greenhouse from the data collected in the field. Data on pod wall fiber for dry pods were collected at the R9 stage (pod maturation; Fernández et al., 1985), while fresh pods were examined at the R8 stage (pod fill). Ten pods were sampled in the greenhouse; in the field, a total of 30 pods were sampled per site (10 from each replication). The dry pod fiber was evaluated first for the presence or absence of constrictions and then on a scale of 0 (no wall fiber) to 10 (full wall fiber). The fresh pods were snapped in the middle and scored for fiber on a scale of 0-2 (0 = no fiber, 1 = few fibers, 2 = many fibers).
The pod string was also evaluated for both fresh and dry pods. The fresh pods obtained from the field were boiled in a water bath for 30 minutes at 100°C. The pod strings were gently pulled from the calyx along the adaxial suture, and the length of the detached string was measured and recorded. The fresh pod string was then expressed as the ratio of pod suture string length to total pod length (Hagerty et al., 2016). Pod diameter (PD) was measured by passing the pods through the holes of a bean pod ruler manufactured by Royal Sluis®, which range in diameter from 5 mm to 9 mm. Pod length (PL) was measured from the end of the petiole to the tip of the pod. Pod weight per plant (PWPP) was computed by dividing the total weight of the pods by the number of plants, and pods per plant (PPP) by dividing the total number of pods by the number of plants.
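The derived traits described above are simple ratios. The following minimal Python sketch shows the arithmetic; the function names and the plot values are hypothetical and are not taken from the dataset.

```python
def string_ratio(string_length_cm, pod_length_cm):
    """Fresh pod string expressed as suture string length / total pod length."""
    if pod_length_cm <= 0:
        raise ValueError("pod length must be positive")
    return string_length_cm / pod_length_cm


def per_plant(plot_total, n_plants):
    """Plot total (pod weight in g, or pod count) divided by the number of plants."""
    if n_plants <= 0:
        raise ValueError("number of plants must be positive")
    return plot_total / n_plants


# Hypothetical values for one plot, for illustration only
ratio = string_ratio(6.0, 12.0)   # string runs along half of the pod
pwpp = per_plant(450.0, 10)       # grams of pods per plant
ppp = per_plant(120, 10)          # pods per plant
print(ratio, pwpp, ppp)
```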
Phenotypic data analysis
Statistical analyses on quantitative phenotypic data were conducted in SAS 9.3 (SAS Institute, 2011). Normality of the data was assessed with the Shapiro-Wilk test, and outliers were treated accordingly before proceeding with analysis of variance (ANOVA). A combined ANOVA for the three sites was then conducted using PROC GLM for the traits based on the following statistical model:
$$Y_{ijkl} = \mu + p_i + t_j + b_{k(i)} + (pt)_{ij} + e_{ijkl}$$
where: $Y_{ijkl}$ = response variable; $\mu$ = mean of the population; $p_i$ = fixed effect of the $i$th site; $t_j$ = fixed effect of the $j$th genotype (RILs); $b_{k(i)}$ = random effect of the $k$th replication within the $i$th site; $(pt)_{ij}$ = fixed effect due to the interaction between the $i$th site and the $j$th genotype; $e_{ijkl}$ = residual effect.
Quantitative traits were compared using Spearman correlation coefficients, and plots were generated in R (R Core Team, 2022) using the packages tidyverse (Wickham et al., 2019), corrplot (Wei and Simko, 2021), and psych (Revelle, 2024). Heritability and variance partitioning analyses were conducted on field data using the heritability R package (Kruijer et al., 2023). Furthermore, independent t-tests were conducted to compare quantitative traits between the classes of qualitative traits, while Fisher’s Exact tests were conducted to test the association between qualitative traits. These analyses were also conducted in R.
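For readers unfamiliar with the Spearman coefficient, it is the Pearson correlation computed on ranks, with tied values sharing the mean of their ranks. The study ran this analysis in R; the sketch below is a stand-in written in plain Python to show the calculation, and the trait values in it are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical pod length (cm) and pod diameter (mm) for five lines
pod_length = [10.1, 12.3, 11.0, 13.5, 9.8]
pod_diameter = [6.0, 7.5, 7.0, 8.0, 6.5]
rho = spearman(pod_length, pod_diameter)
print(round(rho, 3))
```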
Genotyping, linkage mapping, and QTL analysis
Single nucleotide polymorphism (SNP) genotyping was accomplished by genotyping-by-sequencing (Elshire et al., 2011; Ariani et al., 2016). DNA was extracted from greenhouse-grown seeds of the RILs and parents using a Qiagen DNeasy extraction kit (Qiagen, Hilden, Germany). DNA quality was checked with a NanoDrop spectrophotometer and agarose gel electrophoresis. Libraries were prepared with CviAII, and 150 bp paired-end sequencing was conducted on the prepared libraries at the University of California, Davis Genome Center. The reads were aligned to the v2.1 reference genome assembly of G19833 (Goodstein et al., 2013; Schmutz et al., 2014; https://phytozome-next.jgi.doe.gov/info/Pvulgaris_v2_1).
Read demultiplexing, alignment to the common bean reference genome (G19833 v2.1; Schmutz et al., 2014), and variant calling were conducted in NGSEP3 and NGSEP4 (Tello et al., 2019; Tello et al., 2024). Data curation was performed by removing SNPs with more than 20% missing or heterozygous calls. SNPs were kept only if they had a genotype quality (GQ) over 20, were biallelic, had a minor allele frequency (MAF) > 0.25, were at least 5 bp from any other SNP, were genotyped in at least 160 of the 184 population members, and were found in non-repetitive regions, as defined by Lobaton et al. (2018). Individuals were plotted by missing calls and by heterozygous calls, and outliers were eliminated from further analysis. Non-parental alleles were removed. Only SNPs that were polymorphic in both the parents and the RIL population and met the MAF threshold were used for linkage mapping. After the quality checks, 5,951 SNPs were retained for linkage map construction.
Linkage mapping was conducted in RStudio using the ASMap R package (Taylor and Butler, 2017), and genetic distances were calculated with the Kosambi mapping function. A total of 11 linkage groups corresponding to the 11 chromosomes were developed. QTL mapping was conducted using maximum likelihood through the EM algorithm of the R/qtl package in R (Lander and Botstein, 1989; Broman et al., 2003). A significant LOD score threshold for QTL (LOD = 3.413) was developed based on the 95th percentile of LOD scores of 1000 random permutations of the genotypic data. The coefficient of determination (R²) was used to estimate the proportion of variation explained by a QTL, calculated as R² = 1 - 10^(-(2/n) × LOD), where n is the number of individuals genotyped at the locus. All traits were initially considered as quantitative variables for QTL mapping.
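As a worked instance of the R² formula above, the following minimal Python sketch converts a LOD score into the proportion of variation explained. The function name r2_from_lod is hypothetical; the LOD threshold and population size are the values quoted in the text, and using n = 184 assumes that every line was genotyped at the locus.

```python
def r2_from_lod(lod, n):
    """Proportion of variation explained: R^2 = 1 - 10^(-(2/n) * LOD)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - 10.0 ** (-(2.0 / n) * lod)


# LOD threshold (3.413) and population size (184) as quoted in the text
r2_at_threshold = r2_from_lod(3.413, 184)
print(f"{r2_at_threshold:.3f}")
```

Under these assumptions, a QTL that just reaches the significance threshold explains roughly 8% of the variation. For a given LOD score, a locus genotyped in fewer individuals gives a larger R².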
For traits without a continuous distribution, the lines were grouped into phenotypic classes based on the parental phenotypes, and the SNP-trait associations were verified with Fisher’s Exact tests. Results were compared with gene models located between flanking SNPs in v2.1 of the common bean reference genome (Schmutz et al., 2014).
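The SNP-trait check for class-scored traits can be illustrated with a 2 x 2 Fisher's Exact test, where rows are the two parental alleles at a SNP and columns are the two phenotypic classes. The sketch below is a plain-Python stand-in, not the authors' R code; the function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = allele at the SNP, columns = string present / absent
p_value = fisher_exact_2x2(8, 2, 1, 9)
print(f"{p_value:.4f}")
```

A small p-value indicates that the phenotypic classes are not distributed independently of the SNP allele, which is the sense in which an association is verified.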