Pod indehiscence is a domestication and aridity resilience trait in common bean
Parker, Travis et al. (2019), Pod indehiscence is a domestication and aridity resilience trait in common bean, Dryad, Dataset, https://doi.org/10.25338/B8ZG68
- Plant domestication has strongly modified crop morphology and development. Nevertheless, many crops continue to display some atavistic characteristics that were advantageous to their wild ancestors, such as pod dehiscence (PD). Here, we provide the first comprehensive assessment of the inheritance of PD in a domesticated grain legume, common bean (Phaseolus vulgaris).
- We compared three methods to evaluate the PD phenotype and identified multiple, unlinked genetic regions controlling PD using a biparental population and two diversity panels, corresponding to the two major domestications of common bean. We subsequently assessed patterns of orthology between these loci and those controlling the trait in other species.
- Our results show that different genes were selected in each domestication and ecogeographic race. Foremost among these is a chromosome Pv03 dirigent-like gene involved in lignin biosynthesis, which shows a single base-pair substitution that is strongly associated with decreased PD and expansion of the Mesoamerican domesticates into northern Mexico, where the arid conditions promote PD.
- The base-pair mutated in certain PD-resistant lineages has been conserved across species during millions of years of plant evolution. This contrast is a consequence of the markedly different fitness landscape imposed by domestication. Environmental dependency and genetic redundancy explain the maintenance of atavistic traits under domestication. Even core domestication traits in well-domesticated species show considerable genetic variation, which can ultimately affect food security.
Materials and Methods
A recombinant inbred (RI) population (n = 238), developed from a cross between ICA Bunsi (domesticated, PD-susceptible, Middle American) and SXB 405 (domesticated, PD-resistant, Middle American), was used for QTL mapping (Assefa et al. 2013; Berny Mier y Teran et al. 2019). For association mapping, different panels were used, the individual members of which are listed in Tables S2. Two-hundred eight members of the Andean Diversity Panel (ADP, Cichy et al. 2015) and 278 members of the Middle American Diversity (MDP, Moghaddam et al. 2016) were grown and phenotyped. Sequencing was performed in a diverse panel of 90 varieties representing six species were acquired from the National Plant Germplasm System. Eighteen varieties commonly grown at UC Davis with known PD phenotypes were also genotyped. Stringless snap bean varieties were specifically excluded from the analysis to avoid the epistatic effect of the Stringless (St) locus on PD.
Pods of G12873 (wild, high dehiscence), ICA Bunsi (domesticated dry bean, dehiscence-susceptible) SXB 405 (domesticated dry bean, dehiscence-resistant), and Midas (domesticated snap bean, dehiscence-susceptible) were Vibratome-sectioned to identify morphological differences that might be associated with PD. All sectioned pods were greenhouse-grown and harvested when pods were at full size with seeds filled, at the onset of pod color change. All sections were 100 micrometers thick and made in a transverse plane perpendicular to the fibers of interest. All sections were treated with Auramine O for at least 20 minutes to stain lignified tissue (Ursache et al. 2018). Fluorescence was visualized using an Olympus microscope.
RI population cultivation and PD phenotyping
The ICA Bunsi/SXB 405 (IxS) RI population of 238 RILs was field-grown during the spring and summer of 2014. The spring planting was an un-replicated trial conducted at Coachella, California. At maturity, plots were visually evaluated for the presence or absence of PD, and the data were used as a phenotype for QTL mapping. During the summer of 2014, the RI population was grown in a replicated field trial in Davis, California. At maturity, dried non-dehiscing pods from 191 RILs were harvested from each plot; these were evaluated for susceptibility to PD by two methods. First, all pods were desiccated at 65°C for seven days, and then returned to room temperature for a minimum of seven additional days. The proportion of dehiscing pods after this process was recorded for each plot. Second, the amount of force required to induce pod fracture was measured using an Imada force measurement gauge (method modified from Dong et al., 2014). Force measurements were taken on pods that had not dehisced during the desiccation treatment. A bit mounted to the gauge was used to press the ventral side of each pod at the most apical seed, and the peak force required to cause fracture at the apical end of the pod beak was recorded. Force required for PD was normalized to account for small but significant differences between note-takers, and the standardized score was used for QTL mapping. Pods that failed to produce seeds were excluded from all phenotyping analyses.
Genomic DNA was extracted from parents and RILs of the IxS population using a modified CTAB protocol. DNA quality was confirmed using a NanoDrop spectrophotometer. The IxS population was genotyped using the Illumina Infinium II BARCBean6K_3 BeadChip (Song et al. 2015); 382 segregating SNPs were identified in the population. Primers spanning the transcribed sequence of Phvul.003G252100, also known as Phaseolus vulgaris Pod Dehiscence 1 (PvPdh1), a candidate gene underlying a major QTL identified in this study, were developed using the NCBI Primer-BLAST tool. Several differences in the genomic sequence exist between the Middle American and Andean gene pools, so a mixture of two forward primers was introduced into each PCR with a common reverse: PvPDH1 ALL Middle American Forward: CATCTCCCCCATTTTCCCCC; PvPDH1 ALL Andean Forward: CATCTCTCCCATTTTCTCCT; PvPDH1 ALL common Reverse: AACACGTGGAAGAGGAGGATT. PCR conditions for this amplification included an initial denaturation at 95°C for 180s, 38 cycles of 95°C for 30s, 51°C for 30s, and 68°C for 60s, and a final elongation step of 68°C for 300s. Another set of primers was developed to specifically improve the amplification and sequencing of Andean common beans, with the sequences: PvPDH1 Andes Forward: TTTTTCTTGTGAGCAAAATTGAGTT; PvPDH1 Andes Reverse: GCAGAGGAAAAACACGTGGA. This primer set was amplified with an initial denaturation at 95°C for 300s, 34 cycles of 95°C for 30s, 46°C for 30s, and 72°C for 70s, and a final elongation step of 72°C for 300s. PCR products were cleaned using a GeneJET PCR Purification Kit and sequenced at the UC DNA Sequencing Facility by Sanger sequencing.
Composite interval mapping was conducted using the R package R/qtl (Broman et al. 2003). Field dehiscence score, proportion dehiscing in a desiccator, and force measurements were separately used to identify PD QTLs marked by SNPs. The maximum LOD score of 1000 randomized permutations of the data was used as a significance threshold. Single QTL scans were performed using the scanone function. Multiple QTL mapping was conducted using the scantwo function in R/qtl and by running the analysis with RILs subset by genotype at the most significant marker near PvPdh1 on Pv03. QTL mapping results were based on maximum likelihood via the EM algorithm (Lander and Botstein 1989).
Validation of QTL mapping results using association mapping
Two hundred and eight accessions of the ADP were grown in Davis, CA during summer 2016. PD in the field, proportion dehiscing in a desiccator, and force required for fracture were recorded. Principal component analysis was conducted on SNP data for the population, and the results were used as covariates to account for population structure. Two hundred seventy-eight members of the MDP were phenotyped for PD by desiccation in 2017. Association mapping was conducted using GLMs in TASSEL via SNiPlay (Bradbury et al., 2007; Dereeper et al., 2011). A minor allele frequency of 0.1 was used as a threshold for SNPs, and these SNPs were evaluated for significance based on a Bonferroni-corrected alpha of 0.05. QTL regions of significance were determined as the area between the first and last significant SNP on a chromosome arm. Individual significant SNPs without significant neighbors in the same population or others were not given further consideration, as these are likely All results were visualized using the qqman R package (Turner, 2018), including the Bonferroni-corrected significance thresholds at alpha=0.05 and 0.01 were shown, along with the positions of major candidate genes.
Expression and synteny mapping
Gene expression information from a variety of tissues and developmental stages was extracted from published data (O’Rourke et al. 2014) and visualized independently using R base graphics (R Core Team, 2013). Candidate genes related to PD were identified in significant QTL intervals based on definition line terms for gene families related to PD, which were downloaded with the PhytoMine interface of Phytozome 12 (Goodstein et al. 2012). Subsequent comparisons were made using the Basic Local Alignment Search Tool (BLAST) function with known amino acid sequences from related species. Synteny comparisons between common bean and soybean (Glycine max) were made using the Legume Information System 2.0 (Rice et al. 2015); these were verified using available literature (McClean et al. 2010, Schmutz et al. 2014). The CoGe SynMap (Lyons et al. 2008) and LegumeIP 2.0 (Li et al., 2016) synteny tools were used to compare syntenic regions between Arabidopsis (Col-0, TAIR10), common bean (G19833, Pvulgaris_V1.0_218; Schmutz et al. 2014), and soybean (Williams 82, Release 1.1; Schmutz et al., 2010). A neighbor-joining tree was produced to determine the pattern of homology between a common bean candidate gene (PvPdh1), a related soybean gene (GmPDH1), and other members of the dirigent gene family in these two species. The amino acid sequence of these proteins was BLASTed against the G. max and P. vulgaris proteomes identify closely related genes. These were then compared using a multiple BLASTP to develop a distance tree based on a Grishin protein distance matrix. A fast-minimum evolution tree (Desper & Gascuel 2004) was generated based on a maximum sequence difference of 0.85.
Amino acid conservation analyses
The complete amino acid sequence of PvPdh1 from accession G19833 was compared via BLASTP against the NCBI proteome database, using a BLOSUM62 matrix for comparison and existence and extension costs of 11 and 1, respectively (Altschul et al. 2005). The COnstraint-Based multiple ALignment Tool (COBALT; Papadopoulos & Agarwala, 2007, https://www.ncbi.nlm.nih.gov/tools/cobalt/re_cobalt.cgi) was used to align the most similar proteins known among several plant taxa and identify conserved residues based on the BLASTP results. The Protein Variation Effect Analyzer (PROVEAN; Choi & Chan, 2015) v1.1.3 software tool was used to estimate the effect of mutations of interest using default settings, including a cutoff threshold of -2.5 for identifying deleterious alleles.
Validation of the role of PvPdh1 in a wider population
Genomic DNA was extracted using a modified CTAB method; amplification and Sanger sequencing of PvPdh1 were conducted as described previously. Genotypes were separated into the Andean or Middle American gene pool based on an indel in the 3’ UTR of PvPdh1. This indel consistently predicted the gene pool in varieties of known ancestry. After sequencing, Middle American varieties were divided into groups based on amino acid at position 162 of PvPdh1. The degree of dehiscence between these groups was evaluated by Student’s t-test. Pod shatter phenotype data from the Germplasm Resource Information Network (GRIN) was compared with our sequencing data for varieties acquired from NPGS.
Precipitation across the native range of Middle American beans was mapped in QGIS 2.18.19 using data from worldclim2 (Fick & Hijmans, 2017). National boundaries and coastlines were added using shapefiles available through Natural Earth (Kelso & Patterson, 2010). USGS topographical global raster data grids were also used to improve the visualization of coastlines (https://topotools.cr.usgs.gov/gmted_viewer/gmted2010_global_grids.php). Landraces genotyped by Kwak and Gepts (2009) were filtered by their ecogeographic race, those with values of 0.5 in STRUCTURE groups K6 (race Mesoamerica) and K9 (race Durango/Jalisco) were filtered for subsequent analysis. Delimited text layers were added in QGIS for varieties with latitude and longitude data that belonged to one of the ecogeographic races of interest. The average annual precipitation and elevation of the region where each landrace was collected using the “add raster values to points” function in QGIS, and the values between ecogeographic races were compared by student’s t-test.
Clif Bar Family Foundation and Lundberg Family Farms