Data from: Precise colocalization of sorghum’s major chilling tolerance locus with Tannin1 due to tight linkage drag rather than antagonistic pleiotropy
Data files
Feb 05, 2025 version files (363.72 MB total)
- Dryad_data.zip (363.71 MB)
- README.md (11.35 KB)
Abstract
Chilling tolerance in crops can increase resilience through longer growing seasons, drought escape, and nitrogen use efficiency. In sorghum (Sorghum bicolor [L.] Moench), breeding for chilling tolerance has been stymied by coinheritance of the largest-effect chilling tolerance locus, qSbCT04.62, with the major gene underlying undesirable grain proanthocyanidins, WD40 transcriptional regulator Tannin1. To test if this coinheritance is due to antagonistic pleiotropy of Tannin1, we developed and studied near-isogenic lines (NILs) carrying chilling tolerant haplotypes at qCT04.62. Whole-genome sequencing of the NILs revealed introgressions spanning part of the qCT04.62 confidence interval, including the Tannin1 gene and an ortholog of Arabidopsis cold regulator CBF/DREB1G. Segregation pattern of grain tannin in NILs confirmed the presence of wildtype Tannin1 and the reconstitution of a functional MYB-bHLH-WD40 regulatory complex. Low-temperature germination did not differ between NILs, suggesting that Tannin1 does not modulate this component of chilling tolerance. Similarly, NILs did not differ in seedling growth rate under either of two contrasting controlled environment chilling scenarios. Finally, while the chilling tolerant parent line had notably different photosynthetic responses from the susceptible parent line – including greater non-photochemical quenching before, during, and after chilling – the NIL responses match the susceptible parent. Thus, our findings suggest that tight linkage drag, not pleiotropy, underlies the precise colocalization of Tan1 with qCT04.62 and the qCT04.62 quantitative trait nucleotide lies outside the NIL introgressions. Breaking linkage at this locus should advance chilling tolerance breeding in sorghum and the identification of a novel chilling tolerance regulator.
README: Data from: Precise colocalization of sorghum's major chilling tolerance locus with Tannin1 due to tight linkage drag rather than antagonistic pleiotropy
https://doi.org/10.5061/dryad.z34tmpgms
Summary
All data and scripts are provided in a single zipped folder. The folder structure is set up so that each script runs within its given folder.
- Figure 1A:
- Data as well as R script used for plotting and analysis are included. Data for Figure 1A contains all known chilling tolerance QTL and was downloaded from the Sorghum QTL Atlas. An R script was used to filter for relevant experiments and plot locations.
- Figure 2:
- Data as well as R script used for plotting and analysis are included. Data for Figure 2 contains SNPs for chilling tolerance NILs (progeny) as well as Kaoliangs and BTx623 (parents), which were used for sliding window haplotype analysis.
- Figure 3:
- See Figure 2
- Figure 5:
- Data as well as R script used for plotting and analysis are included. Data for Figure 5 contains germination data for chilling tolerance NILs, Kaoliangs, BTx623, and industry hybrid control DKS38-16.
- Figure 6A:
- Data as well as R script used for plotting and analysis are included. Data for Figure 6A includes dry weights from the short chilling experiment for NILs, HKZ, and BTx623.
- Figure 6B:
- Data as well as R script used for plotting and analysis are included. Data for Figure 6B includes dry weights from the long chilling experiment for NILs, HKZ, and BTx623.
- Figure 7:
- Data as well as R script used for plotting and analysis are included. Data for Figure 7 includes daily photosynthetic measurements collected using a MultispeQ over a 9-day chilling experiment for NILs, HKZ, and BTx623.
- Figure S1:
- Data as well as R script used for plotting and analysis are included. Data for Figure S1 (the same SNP data used for Figure 2) contains SNPs for chilling tolerance NILs (progeny) as well as Kaoliangs and BTx623 (parents). Data were used for sliding window SNP density analysis.
Description of the data and file structure
- Data ID reference file
- A reference file containing data ID equivalencies
- Filetype: .csv
- Location: “Dryad_data/seedid_ref.csv”
- Number of Variables (columns): 6
- Col1: Genetic line
- Col2: Seed Source
- Col3: Line ID
- Col4: Line Group
- Col5: General Group ID
- Col6: General Group
- Figure 1A:
- Locations of published QTL
- Filetype: .csv
- Location: “Dryad_data/Figure_1A/QTL_Locations_CT.csv”.
- Rows are reported QTL associated with chilling tolerance
- Number of Variables (columns): 6
- “QTL Id”: QTL identification ID as named by original researcher
- “Publication”: citation of publication for QTL discovery
- “Population”: Mapping population where QTL was discovered
- “Trait Description”: Description of trait regulated at QTL
- “LG:Start-End”: Linkage group (chromosome) of the QTL, with its start and end positions in base pairs
- “Genes Under QTL (v3.0)”: Number of genes contained in the QTL in the Sorghum BTx623 v3 genome.
- Figure 2:
- Variant genotypes for parent lines and NILs
- Filetype: .vcf
- Location: “Dryad_data/Figure_1_2_3_S1/vcf_files/all.snps.filtered.final.vcf”
- Rows are SNPs specified using BTx623 v3 genome coordinates
- Number of Variables (columns): 33
- “#CHROM”: Chromosome where specific SNP is located
- “POS”: Chromosomal position in BP where the SNP is located
- “ID”: data is unavailable
- “REF”: SNP allele in the reference genome
- “ALT”: Alternate SNP allele
- “QUAL”: Quality of the SNP call as assigned by GATK gVCF
- “FILTER”: Whether the SNP call passed the filtering threshold for high-quality biallelic SNPs
- “FORMAT”: GT - Genotype
- “Col9:Col33”: Genotype at SNP for sequenced individual (Line group - replicate)
- Figure 3:
- See Figure 2
- Figure 5:
- Germination phenotypes
- Filetype: .csv
- Location: “Dryad_data/Figure_5/germ.data.csv”
- Row is germination data for a particular seed
- Experimental design: 12 seeds per plate, 4 plates per genotype/temperature replicate, 3 replicates for each genotype/temperature level
- Number of Variables (columns): 14
- “Plate”: Plate number (by genotype/replicate/temp) of the seed
- “Genotype”: Genotype (Line group) of the particular seed
- “Rep”: Replicate (genotype/temperature)
- “GermDay”: Day after planting on which the seed germinated (NA means the seed remained ungerminated throughout the experimental period)
- “CLn_day1_mm”: length of coleoptile in mm on first day after planting, blank means ungerminated
- “RLn_day1_mm”: length of root in mm on first day after planting, blank means ungerminated
- “CLn_day2_mm”: length of coleoptile in mm on second day after planting, blank means ungerminated
- “RLn_day2_mm”: length of root in mm on second day after planting, blank means ungerminated
- “CLn_day3_mm”: length of coleoptile in mm on third day after planting, blank means ungerminated
- “RLn_day3_mm”: length of root in mm on third day after planting, blank means ungerminated
- “CLn_day4_mm”: length of coleoptile in mm on fourth day after planting, blank means ungerminated
- “RLn_day4_mm”: length of root in mm on fourth day after planting, blank means ungerminated
- Figure 6A:
- Short chilling treatment dry weight phenotypes
- Filetype: .csv
- Location: “Dryad_data/Figure_6/short.chill.data.csv”
- Row is a specific pot in an experiment
- Number of Variables (columns): 6
- “Pot_ID”: ID # of pot in experiment
- “Genotype”: Genotype (Genetic Line) of plant in pot
- “Tray_ID”: Experimental ID of tray containing pot
- “Treatment”: Experimental treatment pot received
- “TotalWeight”: Dry weight (g) of shoot and drying bag of plant from the specific pot; NA indicates a pot in which the plant did not germinate
- “BagWeight”: Weight of drying bag (g) containing plant from specific pot
- Figure 6B:
- Long chilling treatment dry weight phenotypes
- Filetype: .csv
- Location: “Dryad_data/Figure_6/long.chill.data.csv”
- Row is a specific pot in an experiment
- Number of Variables (columns): 10
- “Pot_ID”: ID # of pot in experiment
- “Line”: Genotype (Genetic Line) of plant in pot
- “Tray_Barcodes”: Experimental ID of tray containing pot
- “Treatment”: Experimental treatment pot received
- “PlantingDate”: Date pot was planted
- “EmergenceDate”: Data unavailable
- “DaysToEmergence”: Data unavailable
- “TotalWeight”: Dry weight (g) of shoot and drying bag of plant from the specific pot; NA indicates a pot in which the plant did not germinate
- “DryWeight”: Dry weight of the shoot (TotalWeight - drying bag weight) (g) of the plant
- Figure 7:
- Photosynthetic phenotypes
- Filetype: .txt
- Location: “Dryad_data/Figure_7/photosynth.data.txt”
- Row is a specific measurement in the experiment
- Rows corresponding to pot #s 1, 10, 14, 28, 24, 33, 40, 52, 54, 55, 58 must be filtered out because plants did not germinate and a blank piece of paper was used to make the reading. This is done automatically by the included analysis script (Figure7.R).
- Number of Variables (columns): 36
- MultispeQ automatically takes many measurements, but experimental conditions were only optimized for the measurements analyzed in the experiment (PhiNPQ, PhiNO, Phi2). For this reason, do not rely on the accuracy of other measurements contained in the dataset. Accordingly, explanations are given only for the nine variables used in the experiment (Columns 13, 14, 15, 24, 25, 26, 28, 29, 30)
- “Phi2”: Phi2 is the realized steady state efficiency of photosystem II, and is the fraction of total light energy successfully used for photosynthesis
- “PhiNO”: PhiNO is the fraction of total light energy dissipated in a non-regulated fashion
- “PhiNPQ”: PhiNPQ reflects the fraction of total light energy dissipated through the non-photochemical quenching pathway
- “Day”: Day of the experimental time course when measurement was taken
- “Genotype”: Genotype (Genetic Line) of the plant on which the measurement was taken
- “Measurement”: Phase of the experiment in which the measurement was taken, relative to the treatment
- “Temperature”: The temperature condition of the plant when the measurement was taken
- “Tray”: The ID of the tray which contained the plant the specific measurement was taken on
- “Pot_ID”: The ID of the pot which contained the plant on which the measurement was taken
- Figure S1:
- See Figure 2
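The README notes that readings from pots where no plant germinated must be dropped from the Figure 7 data before analysis. The snippet below is a minimal Python sketch of that filtering step, not the authors' Figure7.R script. It assumes the file is tab-delimited and that `Pot_ID` holds plain integers; the sample rows are made up for illustration.

```python
import csv
import io

# Pot numbers the README lists as blank-paper readings (plants did not germinate).
BLANK_POTS = {1, 10, 14, 24, 28, 33, 40, 52, 54, 55, 58}


def drop_blank_pots(rows, pot_column="Pot_ID"):
    """Keep only measurement rows that come from pots with a germinated plant."""
    return [row for row in rows if int(row[pot_column]) not in BLANK_POTS]


# Made-up rows standing in for photosynth.data.txt (tab delimiter is an assumption).
sample = "Pot_ID\tDay\tPhi2\n1\t1\t0.10\n2\t1\t0.55\n10\t1\t0.12\n3\t1\t0.60\n"
rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))
kept = drop_blank_pots(rows)
print([row["Pot_ID"] for row in kept])  # ['2', '3']
```

To apply the same filter to the real file, replace the `io.StringIO(sample)` object with an opened handle to `Dryad_data/Figure_7/photosynth.data.txt` and adjust the delimiter if needed.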
Sharing/Access information
- Figure 1A: Data were derived from the Sorghum QTL Atlas (Mace et al. 2019)
Code/Software
- All scripts were written using R v4.1.2
- Upon opening, scripts can be run within the provided file structure without the need for editing
- Figure 1A:
- R code to filter data and create plot
- Location: “Dryad_data/Figure_1A/CT_Location.R”
- Required Libraries: tidyr, dplyr, ggplot2
- Requires import of datafile “Dryad_data/Figure_1A/QTL_Locations_CT.csv” to function
- Figure 2:
- R code to filter, analyze and create plot
- Location: “Dryad_data/Figure_1_2_3_S1/Figure_2/Figure2_Analysis.R”
- Required Libraries: data.table, stringr, tidyverse, dplyr
- Requires import of datafile “Dryad_data/Figure_1_2_3_S1/vcf_files/all.snps.filtered.final.vcf” to function
- Figure 3:
- R code to filter, analyze and create plot
- Location: “Dryad_data/Figure_1_2_3_S1/Figure_3/Figure3_Analysis.R”
- Required Libraries: data.table, stringr, tidyverse, dplyr
- Requires import of datafile “Dryad_data/Figure_1_2_3_S1/vcf_files/all.snps.filtered.final.vcf” to function
- Figure 5:
- R code used to analyze data and create plot
- Location: “Dryad_data/Figure_5/Figure5_Analysis.R”
- Required Libraries: dplyr, ggplot2
- Requires import of datafile “Dryad_data/Figure_5/germ.data.csv” to function
- Figure 6A:
- R code used to analyze data and create plot
- Location: “Dryad_data/Figure_6/Figure6A.R”
- Required Libraries: lsmeans, multcomp, car, ggplot2
- Requires import of datafile “Dryad_data/Figure_6/short.chill.data.csv” to function
- Figure 6B:
- R code used to analyze data and create plot
- Location: “Dryad_data/Figure_6/Figure6B.R”
- Required Libraries: lsmeans, multcomp, car, ggplot2
- Requires import of datafile “Dryad_data/Figure_6/long.chill.data.csv” to function
- Figure 7:
- R code used to analyze data and create plot
- Location: “Dryad_data/Figure_7/Figure7.R”
- Required Libraries: dplyr, ggpubr, ggplot2
- Requires import of datafile “Dryad_data/Figure_7/photosynth.data.txt” to function
- Figure S1:
- R code to filter, analyze and create plot
- Location: “Dryad_data/Figure_1_2_3_S1/Figure_S1/FigureS1.R”
- Required Libraries: data.table, stringr, tidyverse, dplyr
- Requires import of datafile “Dryad_data/Figure_1_2_3_S1/vcf_files/all.snps.filtered.final.vcf” to function
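Because every script depends on a data file at a fixed relative path, it can help to confirm that the archive was unzipped intact before running anything. The following is a small Python sketch of such a check. The file list is copied from this README, while the function name `missing_files` and the temporary-folder demo are illustrative assumptions.

```python
from pathlib import Path
import tempfile

# Data files named in this README, relative to the folder that contains Dryad_data/.
DATA_FILES = [
    "Dryad_data/seedid_ref.csv",
    "Dryad_data/Figure_1A/QTL_Locations_CT.csv",
    "Dryad_data/Figure_1_2_3_S1/vcf_files/all.snps.filtered.final.vcf",
    "Dryad_data/Figure_5/germ.data.csv",
    "Dryad_data/Figure_6/short.chill.data.csv",
    "Dryad_data/Figure_6/long.chill.data.csv",
    "Dryad_data/Figure_7/photosynth.data.txt",
]


def missing_files(root):
    """Return the README-listed data files that are absent under `root`."""
    root = Path(root)
    return [rel for rel in DATA_FILES if not (root / rel).is_file()]


# Demo in a temporary folder that holds only one of the expected files.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "Dryad_data" / "Figure_5" / "germ.data.csv"
    target.parent.mkdir(parents=True)
    target.write_text("Plate,Genotype\n")
    demo_missing = missing_files(tmp)

print(len(demo_missing))  # 6
```

Calling `missing_files(".")` from the folder where the archive was extracted should return an empty list if the structure is complete.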
Methods
Genetic analyses and plant materials
Data on published QTL were downloaded from the Sorghum QTL Atlas (Mace et al. 2019). QTL were filtered for biparental and NAM mapping studies and plotted by genomic location using custom R v4.1.2 scripts (R Core Team 2021). Three RILs from the chilling tolerant NAM BTx623 × Hong Ke Zi (PI 567946) family were used as starting material to reduce subsequent backcrossing effort (Marla et al. 2019). The RILs were then crossed to BTx623. F1 progeny were selected on two criteria: heterozygosity at the QTL of interest, assessed with a KASP marker system, and visual resemblance to BTx623, the recurrent parent. Selected progeny were then backcrossed to BTx623. Selection and backcrossing were repeated four times. Four suitable BC4F1 lines were then selected and selfed. From the segregating progeny, homozygotes for both alleles of the QTL of interest were selected, making eight total BC4F2 lines. Those eight lines were then advanced to the BC4F5 generation through single seed descent, generating four pairs of NIL siblings (Marla et al. 2023).
Genomic analyses
For whole-genome resequencing of NILs, leaf tissue was collected from two-week-old seedlings and frozen at -80°C until DNA extraction. DNA extractions were performed using the Quick-DNA Plant/Seed Miniprep Kit (ZYMO, D6020) following the manufacturer's instructions. DNA was quantified using a Thermo Scientific NanoDrop 2000/2000c Spectrophotometer. Library preparation and DNA sequencing were performed by the Kansas State University Integrated Genomics Facility (https://www.k-state.edu/igenomics/index.html). DNA was sequenced to ~1x depth on an Illumina NextSeq 500 using 300 cycles and 151 paired-end chemistry.
Low-quality read sequences were trimmed using Trimmomatic v0.32 (Bolger et al. 2014), and the remaining reads were mapped to the BTx623 v3.1.1 reference genome (McCormick et al. 2018) using BWA-MEM (Li 2013). Picard v2.26 MarkDuplicates was then used to merge bam files from common read groups and flag duplicate reads (2019). SNPs were then called using the GATK v4.2.5.0 suite of tools, including HaplotypeCaller to create gVCF files, GenomicsDBImport to create a gVCF database, and GenotypeGVCFs to create the final VCF (Van der Auwera and O’Connor 2020). BCFtools v1.15.1 was then used to sort variants and filter for high-quality biallelic SNPs (Danecek et al. 2021). A custom script was written using R v4.1.2 to analyze genome-wide sliding windows and plot alternate allele frequencies using 10000 kb windows (R Core Team 2021). Two biological replicates were analyzed independently. In the plots, red marks windows where alternate_x/alternate_HKZ >= 0.2, blue marks windows where alternate_x/alternate_HKZ < 0.2, and yellow marks windows where the color call differs between biological replicates.
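The window-level color call described above can be sketched in Python as follows. This is one possible reading, not the authors' R script: it assumes alternate_x and alternate_HKZ are counts of alternate-allele genotype calls per window for a given line and for HKZ, and it skips windows in which HKZ has no alternate calls. The sample names, positions, and the 1000 bp demo window are hypothetical.

```python
from collections import defaultdict


def alt_counts(records, sample, window_bp):
    """Count alternate-allele calls per (chromosome, window index) for one sample."""
    counts = defaultdict(int)
    for rec in records:
        genotype = rec[sample].split(":")[0]  # e.g. "1/1", "0/0" or "./."
        if "1" in genotype:  # assumption: any call carrying the ALT allele counts
            counts[(rec["chrom"], rec["pos"] // window_bp)] += 1
    return counts


def window_colors(records, line, donor, window_bp, threshold=0.2):
    """Call each window red (ratio >= threshold) or blue (ratio < threshold)."""
    line_alt = alt_counts(records, line, window_bp)
    donor_alt = alt_counts(records, donor, window_bp)
    colors = {}
    for window, n_donor in donor_alt.items():  # windows without donor ALT calls are skipped
        ratio = line_alt.get(window, 0) / n_donor
        colors[window] = "red" if ratio >= threshold else "blue"
    return colors


def merge_replicates(colors_a, colors_b):
    """Keep a call when both replicates agree; otherwise mark the window yellow."""
    return {w: (c if colors_b.get(w) == c else "yellow") for w, c in colors_a.items()}


# Hypothetical genotype records (a list, since it is scanned more than once).
records = [
    {"chrom": "Chr04", "pos": 100, "HKZ": "1/1", "NIL_rep1": "1/1", "NIL_rep2": "1/1"},
    {"chrom": "Chr04", "pos": 200, "HKZ": "1/1", "NIL_rep1": "0/0", "NIL_rep2": "0/0"},
    {"chrom": "Chr04", "pos": 1100, "HKZ": "1/1", "NIL_rep1": "0/0", "NIL_rep2": "1/1"},
    {"chrom": "Chr04", "pos": 2100, "HKZ": "1/1", "NIL_rep1": "0/0", "NIL_rep2": "0/0"},
]
rep1 = window_colors(records, "NIL_rep1", "HKZ", window_bp=1000)
rep2 = window_colors(records, "NIL_rep2", "HKZ", window_bp=1000)
merged = merge_replicates(rep1, rep2)
print(merged)
```

In the demo, the first window is red in both replicates, the second disagrees between replicates and becomes yellow, and the third is blue in both.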
Grain tannin assays
The bleach test was performed as previously described (Waniska et al. 1992; Marla et al. 2019). Briefly, fifteen seeds from each genotype were placed in a 50 mL centrifuge tube. One mL of bleach/sodium hydroxide solution was added (3.75% NaOCl and 5% NaOH) to the seeds and left for 30 minutes. Seeds containing proanthocyanidins became dark, while non-proanthocyanidin seeds became white.
Germination assays
Four temperature treatments were used to measure the genotypic effect on low-temperature germination, increasing from 10°C to 25°C in 5°C increments, with three replicates per temperature. For each replicate, twelve seeds from each genotype were placed in a 90-mm petri dish lined with filter paper and moistened with 2 mL distilled water. There were three petri dishes per genotype, totaling 36 seeds per replicate. Dishes were sealed with parafilm and placed in a dark growth chamber at the treatment temperature. Each day for four days, petri dishes were opened, visually inspected, and then documented with a photo. Photos were then scored for germination (Schneider et al. 2012) and analyzed using R v4.1.2 (R Core Team 2021). Graphs were created using the ggplot2 v3.4.2 R package (Wickham 2016).
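As an illustration of how the per-seed scores can be summarized, the Python sketch below computes the cumulative fraction of seeds germinated by a given day for each genotype. It is not the authors' Figure5_Analysis.R script. The `Genotype` and `GermDay` column names follow the README; the genotype labels and values in the sample are made up.

```python
import csv
import io
from collections import defaultdict


def germination_fraction(rows, day):
    """Fraction of seeds per genotype that had germinated on or before `day`."""
    total = defaultdict(int)
    germinated = defaultdict(int)
    for row in rows:
        total[row["Genotype"]] += 1
        germ_day = row["GermDay"]
        # "NA" (or a blank) marks a seed that never germinated during the experiment.
        if germ_day not in ("NA", "") and int(germ_day) <= day:
            germinated[row["Genotype"]] += 1
    return {genotype: germinated[genotype] / total[genotype] for genotype in total}


# Made-up rows standing in for germ.data.csv.
sample = """Plate,Genotype,Rep,GermDay
1,BTx623,1,2
1,BTx623,1,NA
1,BTx623,1,4
1,BTx623,1,3
2,HKZ,1,1
2,HKZ,1,2
"""
rows = list(csv.DictReader(io.StringIO(sample)))
fractions = germination_fraction(rows, day=3)
print(fractions)  # {'BTx623': 0.5, 'HKZ': 1.0}
```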
Growth assays
The experiments were carried out in controlled environment chambers (Conviron Model CMP6050, Manitoba, Canada) at the Plant Growth Facilities at Colorado State University in Fort Collins, CO. Experiment designs were created and randomized using a custom R v4.1.2 script (R Core Team 2021). Each genotype/treatment combination had six replicates. Two temperature treatments, chilling and control, were applied in parallel in separate growth chambers. For the long temperature treatment, control is defined as a 30°C/20°C day/night temperature treatment and chilling as 20°C/10°C. For the short temperature treatment, control is defined as a 28°C/25°C day/night temperature treatment and chilling as 10°C/4°C. A consistent 12 h photoperiod and 700 μmol m−2 s−1 light intensity were used in both treatments.
Plants were potted in 1.5-inch Cone-tainers using Lambert LM-HP potting soil and given 3 g Osmocote controlled-release fertilizer. Water was provided in excess using a bottom watering system. For the long treatment, all pots were germinated under control temperature conditions for five days. Following germination, conditions for control plants remained unchanged, while chilling conditions were applied to chilling plants. After six weeks under treatment conditions, plant shoots were harvested, dried, and analyzed for dry weight. For the short treatment, all pots were germinated under control temperature conditions and grown for approximately seven days, at which point chilling conditions were applied to chilling plants. After three days under treatment conditions, plants were again allowed to grow at control temperatures for seven more days. Plant shoots were then harvested, dried, and analyzed for dry weight.
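Shoot dry weight in the chilling datasets is the recorded total weight minus the drying bag weight. The Python sketch below shows that subtraction and a per-group mean, using the `TotalWeight`, `BagWeight`, `Genotype`, and `Treatment` column names from the README. The treatment labels and numbers are made up, and this is not the authors' Figure6A.R or Figure6B.R analysis.

```python
from collections import defaultdict
from statistics import mean


def shoot_dry_weight(total_weight, bag_weight):
    """Return shoot dry weight in g, or None when the pot has no plant (NA)."""
    if total_weight == "NA":
        return None
    return float(total_weight) - float(bag_weight)


def mean_dry_weight(rows):
    """Mean shoot dry weight for each (genotype, treatment) group, skipping NA pots."""
    groups = defaultdict(list)
    for row in rows:
        weight = shoot_dry_weight(row["TotalWeight"], row["BagWeight"])
        if weight is not None:
            groups[(row["Genotype"], row["Treatment"])].append(weight)
    return {group: mean(values) for group, values in groups.items()}


# Made-up pots standing in for short.chill.data.csv.
rows = [
    {"Genotype": "BTx623", "Treatment": "control", "TotalWeight": "2.5", "BagWeight": "2.0"},
    {"Genotype": "BTx623", "Treatment": "control", "TotalWeight": "3.0", "BagWeight": "2.0"},
    {"Genotype": "BTx623", "Treatment": "chilling", "TotalWeight": "2.25", "BagWeight": "2.0"},
    {"Genotype": "BTx623", "Treatment": "chilling", "TotalWeight": "NA", "BagWeight": "2.0"},
]
means = mean_dry_weight(rows)
print(means[("BTx623", "control")])   # 0.75
print(means[("BTx623", "chilling")])  # 0.25
```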
Photosynthetic assays
Experiment designs were created and randomized using a custom R v4.1.2 script (R Core Team 2021). Each genotype/treatment combination had six replicates. All plants were potted in 1.5-inch Cone-tainers using Lambert LM-HP potting soil and given 3 g Osmocote controlled-release fertilizer. Photoperiod was a 12 h day-night cycle with transitions at 6:00 am and 6:00 pm. Light intensity was 700 μmol m−2 s−1, and water was provided in excess using a bottom watering system. Seedlings were allowed to grow at an optimal temperature for approximately ten days, until large enough for accurate leaf measurements to be taken. Two temperature treatments were applied consecutively over a nine-day time course, optimal (28°C/25°C) and chilling (10°C/4°C) day/night. Throughout the time course, treatment changes occurred at 5:30 am on the scheduled day. The final day of the growth phase is day one for our time course analysis. Measurements were taken each day of the time course beginning at 10:00 am. On day two, seedlings were subjected to chilling treatment until day six. From day six through day nine, seedlings were again grown at optimal temperatures. Photosynthetic components were measured using MultispeQ (Kuhlgert et al. 2016) and analyzed using R v4.1.2 (R Core Team 2021). Graphs were constructed using the ggplot2 v3.4.2 R package (Wickham 2016).
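The three components analyzed here (Phi2, PhiNPQ, PhiNO) are each described as a fraction of total light energy, and in the standard energy-partitioning framework they sum to 1. A quick consistency check on each measurement can therefore flag suspect rows. The Python sketch below shows such a check; it is not part of the authors' analysis, and the 0.01 tolerance and the example values are assumptions.

```python
import math


def phi_sum_ok(phi2, phi_npq, phi_no, tol=0.01):
    """True when the three energy fractions add up to 1 within `tol`."""
    return math.isclose(phi2 + phi_npq + phi_no, 1.0, abs_tol=tol)


# Made-up readings: the first partitions the light energy fully, the second does not.
print(phi_sum_ok(0.55, 0.25, 0.20))  # True
print(phi_sum_ok(0.55, 0.25, 0.35))  # False
```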