Chromosome evolution and the genetic basis of agronomically important traits in greater yam

Bredeson, Jessen1; Lyons, Jessica1; Oniyinde, Ibukun2; Okereke, Nneka3; Kolade, Olufisayo2; Nnabue, Ikenna3; Nwadili, Christian3; Hribova, Eva4; Parker, Matthew5; Nwogha, Jeremiah3; Shu, Shengqiang6; Carlson, Joseph6; Kariba, Robert7; Muthemba, Samuel7; Knop, Katarzyna5; Barton, Geoffrey5; Sherwood, Anna5; Lopez-Montes, Antonio2; Asiedu, Robert2; Jamnadass, Ramni7; Muchugi, Alice7; Goodstein, David6; Egesi, Chiedozie3; Featherston, Jonathan8; Asfaw, Asrat2; Simpson, Gordon9; Dolezel, Jaroslav4; Hendre, Prasad7; Van Deynze, Allen10; Lava Kumar, Pullikanti2; Obidiegwu, Jude3; Bhattacharjee, Ranjana2; Rokhsar, Daniel1

Published Nov 19, 2021; Updated Jan 18, 2022 on Dryad. https://doi.org/10.6078/D1DQ54

Abstract

The nutrient-rich tubers of the greater yam, Dioscorea alata L., provide food and income security for millions of people around the world. Despite its global importance, however, greater yam remains an ‘orphan crop.’ Here we address this resource gap by presenting a highly contiguous chromosome-scale genome assembly of D. alata combined with a dense genetic map derived from African breeding populations. The genome sequence reveals an ancient allotetraploidization in the Dioscorea lineage, followed by extensive genome-wide reorganization. Using our new genomic tools, we find quantitative trait loci for resistance to anthracnose, a damaging fungal pathogen of yam, and for several tuber quality traits. Genomic analysis of breeding lines reveals both extensive inbreeding and regions of extensive heterozygosity that may represent interspecific introgression during domestication. These tools and insights will enable yam breeders to unlock the potential of this staple crop and take full advantage of its adaptability to varied environments.

Fresh leaf samples were collected from the field at both IITA and NRCRI. DNA was extracted using CTAB protocols and sent to DArT (Canberra, Australia) or to the Integrated Genotyping Service and Support (IGSS) at BecA-ILRI for high-density genotyping on the DArTseq platform.

Samples from IITA were collected on ice. A 100 mg leaf sample was placed in a 2.0 mL Eppendorf tube and ground with liquid nitrogen. To remove secondary metabolites, the ground tissue was washed 2–3 times by adding 1000 µl HEPES buffer [0.1 M HEPES, PVP, L-ascorbic acid, 2-mercaptoethanol, and sterile distilled water], mixing thoroughly, centrifuging at 13,000 rpm for 2 min, and decanting the supernatant. Next, 800 µl of freshly prepared CTAB buffer [1 M Tris-HCl pH 8, 0.5 EDTA pH 8, 5 M NaCl pH 8, 1% mercaptoethanol, 3% CTAB] was added and mixed well to ensure homogenization, and samples were incubated for 30 min at 65 °C in a water bath. Then 600 µl of chloroform-isoamyl alcohol (24:1) was added, and the samples were mixed gently and centrifuged at 13,000 rpm for 5 min. The aqueous phase was transferred into freshly labeled 1.2 mL tubes. DNA was precipitated by adding ⅔ vol ice-cold isopropanol, mixing by inversion, incubating at −20 °C for 1 h, and centrifuging at 13,000 rpm for 5 min. The supernatant was decanted carefully, and the DNA pellet was washed twice with 500 µl of cold 70% ethanol. The ethanol was drained completely and the pellet dried at 37 °C for 30 min. DNA was resuspended in 50 µl of low TE (1 mM Tris, 0.1 mM EDTA) with 3 µl RNase, incubated at 37 °C for 1 h, and stored at 4 °C.

Samples from NRCRI were lyophilized, ground to powder in a Qiagen TissueLyser LT for 1 min at 1500 strokes/min, and transferred to 2 mL microtubes. The ground tissue was homogenized in 800 µl of CTAB buffer (100 mM Tris-HCl pH 8.0, 20 mM EDTA pH 8.0, 1.4 M NaCl, 1% polyvinyl pyrrolidone, 2% 2-mercaptoethanol, 3% CTAB), then incubated for 30 min at 65 °C. Next, 600 µl of chloroform:isoamyl alcohol (24:1 vol/vol) was added to the tube, and the sample was centrifuged for 10 min at 13,000 rpm. The nucleic acid in the aqueous phase was precipitated with cold isopropanol, and the pellets were washed with 70% ethanol by centrifuging at 13,000 rpm. The pellets were then resuspended in 50 µl of sterile water and treated with 3 µl of RNase A (20 mg/mL) for 1 h at 37 °C. Finally, the samples were stored at −20 °C until use. The DNA samples were quantified using a NanoDrop 1000 (Thermo Scientific), and their integrity was assessed by agarose gel electrophoresis.

Phenotyping datasets

Yam anthracnose disease (YAD) severity scale:

1	0%, no symptoms (highly resistant)
2	1–25% (moderately resistant)
3	25–50% (resistant)
4	50–75% (susceptible)
5	>75% (highly susceptible)
-	Missing datum
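
As a minimal sketch, the scale above can be encoded as a lookup table. The names YAD_CLASS and describe_yad_score are illustrative and are not part of the dataset. The sketch assumes individual scores are stored as the strings "1" to "5", with "-" for a missing datum; averaged (fractional) scores are outside its scope.

```python
# Hypothetical helper: names are illustrative, not part of the dataset.
YAD_CLASS = {
    1: "highly resistant",
    2: "moderately resistant",
    3: "resistant",
    4: "susceptible",
    5: "highly susceptible",
}


def describe_yad_score(raw):
    """Return the resistance class for one raw score, or None if missing."""
    raw = raw.strip()
    if raw == "-":  # missing datum
        return None
    score = int(raw)
    if score not in YAD_CLASS:
        raise ValueError(f"score outside the 1-5 scale: {raw!r}")
    return YAD_CLASS[score]
```

Rejecting out-of-range values rather than clamping them makes data-entry errors visible early.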

YAD field assay:

Visual scoring three months after planting of TDa1401, TDa1402, TDa1403, TDa1419 and TDa1427 for years 2017 and 2018. Up to three plants per genotype scored and averaged per year. (Scaled phenotype measurements not used)
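
The per-year averaging described above might look like the following sketch. The function name is hypothetical, and it assumes that plants with a missing score ("-") are simply left out of the mean; the source does not say how missing plants were handled.

```python
from statistics import mean


def yearly_mean_score(plant_scores):
    """Average the available scores for one genotype in one year.

    Assumption: '-' marks a missing datum and is excluded from the mean.
    Returns None when no plant was scored.
    """
    values = [float(s) for s in plant_scores if s.strip() != "-"]
    return mean(values) if values else None
```

For example, `yearly_mean_score(["2", "3", "-"])` averages the two scored plants only.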

YAD detached leaf assay (DLA):

Leaf infection area measured for three ~3-month-old leaves per plant. Populations evaluated: TDa1401, TDa1402, TDa1403, TDa1419, TDa1427, TDa1506, and TDa1512. (Scaled phenotype measurements not used)

Tuber traits:

FreshWeightGrams	Tuber fresh weight (grams).
DryWeightGrams	Tuber weight after 16 h of drying at 105 °C (grams).
Oxy0Mins	Oxidative browning at 0 minutes after cutting (MAC).
Oxy30Mins	Oxidative browning at 30 MAC.
Oxy60Mins	Oxidative browning at 60 MAC.
Oxy180Mins	Oxidative browning at 180 MAC.
VisualColor	Qualitative color of tuber (white, cream, orange, purple).
L	CIELAB lightness reading. >0 = lighter; <0 = darker.
A	CIELAB red/green reading. >0 = redder; <0 = greener.
B	CIELAB yellow/blue reading. >0 = yellower; <0 = bluer.
H	Munsell (HVC) Hue reading (basic color degree: 0–100).
V	Munsell (HVC) Value reading. >0 = lighter; 0 = dark.
C	Munsell (HVC) Chroma reading. >0 = intense color; 0 = grey.
CORM	Presence or absence of corm. 0 = Absent; 1 = Present.
CORSEP	The ability of corm to separate. 0 = No; 1 = Yes.
CORTYP	Corm type. 1 = regular; 2 = transversally elongated; 3 = branched.
TBRS	Tuber shape. 1 = spherical/round; 2 = oval; 3 = cylindrical; 5 = irregular.
TBRSZ	Tuber size. 1 = small (less than 15 cm length); 2 = medium (between 15 and 25 cm in length); 3 = big (more than 25 cm in length).
TBRST	Tuber surface texture. 1 = smooth; 2 = rough.
RTBS	Roots on tuber. 0 = no roots; 2 = Few; 3 = Many.
PRTBS	Position of roots on tuber. 1 = Lower; 2 = Middle; 3 = Upper; 4 = Entire tuber.

Missing values encoded as "-".
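
A minimal sketch of reading such a table while honoring the "-" missing-value code is given below. It assumes a tab-delimited file with a header row; the `Genotype` column, the sample rows, and the `NUMERIC` subset are invented for illustration and are not taken from the dataset.

```python
import csv
import io

# Illustrative subset of columns to convert to float (an assumption).
NUMERIC = {"FreshWeightGrams", "DryWeightGrams", "L", "A", "B"}


def read_traits(handle, delimiter="\t"):
    """Read trait records, mapping the '-' missing code to None."""
    rows = []
    for record in csv.DictReader(handle, delimiter=delimiter):
        clean = {}
        for key, value in record.items():
            if value is None or value.strip() == "-":
                clean[key] = None  # missing datum
            elif key in NUMERIC:
                clean[key] = float(value)  # negative readings such as -3.2 still parse
            else:
                clean[key] = value.strip()
        rows.append(clean)
    return rows


# Made-up example rows, for illustration only.
example = io.StringIO(
    "Genotype\tFreshWeightGrams\tDryWeightGrams\tVisualColor\tA\n"
    "G1\t812.5\t240.1\tcream\t-3.2\n"
    "G2\t-\t-\tpurple\t4.0\n"
)
rows = read_traits(example)
```

Only a bare "-" is treated as missing, so negative CIELAB readings are not mistaken for missing values.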

DArTseq genotyping datasets

Metadata columns in the file:

AlleleID	Unique identifier for the sequence in which the SNP marker occurs.
AlleleSequence	In 1-row format: the sequence of the Reference allele. In 2-row format: the sequence of the Reference allele is in the Ref row, and the sequence of the SNP allele is in the SNP row.
AvgCountRef	The sum of the tag read counts for all samples, divided by the number of samples with non-zero tag read counts, for the Reference allele row.
AvgCountSnp	The sum of the tag read counts for all samples, divided by the number of samples with non-zero tag read counts, for the SNP allele row.
AvgPIC	The average of the polymorphism information content (PIC) of the Reference and SNP allele rows.
CallRate	The proportion of samples for which the genotype call is either "1" or "0", rather than "-".
FreqHets	The proportion of samples which score as heterozygous.
FreqHomRef	The proportion of samples which score as homozygous for the Reference allele.
FreqHomSnp	The proportion of samples which score as homozygous for the SNP allele.
OneRatioRef	The proportion of samples for which the genotype score is "1", in the Reference allele row.
OneRatioSnp	The proportion of samples for which the genotype score is "1", in the SNP allele row.
PICRef	The polymorphism information content (PIC) for the Reference allele row.
PICSnp	The polymorphism information content (PIC) for the SNP allele row.
RepAvg	The proportion of technical replicate assay pairs for which the marker score is consistent.
SNP	In 1-row format: contains the base position and base variant details. In 2-row format: this column is blank in the Reference row, and contains the base position and base variant details in the SNP row.
SnpPosition	The position (zero indexed) in the sequence tag at which the defined SNP variant base occurs.
TrimmedSequence	Same as the full sequence, but with removed adapters in short marker tags.
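
To illustrate the zero-indexed SnpPosition, the sketch below derives the SNP allele sequence from a Reference allele sequence. It assumes the SNP column is written as `position:ref>alt` (for example `2:G>T`); that layout is an assumption here and should be checked against the actual files. The function name is hypothetical.

```python
def apply_snp(allele_sequence, snp):
    """Return the SNP allele sequence for a reference tag sequence.

    Assumption: `snp` looks like '2:G>T', with a zero-indexed position.
    """
    position_text, change = snp.split(":")
    ref_base, alt_base = change.split(">")
    position = int(position_text)  # zero-indexed, per SnpPosition
    if allele_sequence[position] != ref_base:
        raise ValueError("reference base does not match the sequence")
    return allele_sequence[:position] + alt_base + allele_sequence[position + 1:]
```

Checking the reference base first catches off-by-one errors that arise when a position is mistakenly read as one-indexed.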

Blast columns (the * stands for the remainder of each column name):

AlnCnt_*	Total count of aligning markers / tags with selection criteria described below.
AlnEvalue_*	E value of the best alignment to an existing model genome.
ChromPos_*	Position(s) on contig(s) with the best alignment of marker / tag to an existing model genome.
Chrom_*	Contig(s) with the best alignment of marker / tag to an existing model genome.

Header rows:

1	Order number to which the sample belongs (important for multi-order reports).
2	DArT plate barcode.
3	Client plate barcode.
4	Well row position.
5	Well column position.
6	Sample comments.
7	Genotype name.

Genotyping calls (SNP 1-row format):

0	Reference allele homozygote.
1	SNP allele homozygote.
2	Heterozygote.
-	Double null/null allele homozygote (absence of fragment with SNP in genomic representation).
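
As a sketch, per-marker summaries analogous to CallRate, FreqHomRef, FreqHomSnp, and FreqHets can be recomputed from one marker's 1-row calls. The function name is hypothetical. The genotype frequencies are computed here over samples with a non-missing call, which is an assumption: the column descriptions do not state the denominator.

```python
from collections import Counter


def summarize_calls(calls):
    """Summarize one marker's 1-row calls ('0', '1', '2', or '-')."""
    counts = Counter(c.strip() for c in calls)
    total = sum(counts.values())
    called = counts["0"] + counts["1"] + counts["2"]
    if called == 0:
        return None  # nothing to summarize
    return {
        "CallRate": called / total,
        # Assumption: frequencies use called samples as the denominator.
        "FreqHomRef": counts["0"] / called,
        "FreqHomSnp": counts["1"] / called,
        "FreqHets": counts["2"] / called,
    }
```

Comparing these values with the metadata columns in the file is a quick way to confirm which denominator the report actually uses.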

Genotyping calls (SNP 2-row format):

Each allele scored in a binary fashion. Heterozygotes are therefore scored as 1/1 (presence for both alleles/both rows).

0	Allele absent.
1	Allele present.
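
The relationship between the two encodings can be sketched as a small conversion from a 2-row call pair to the 1-row code. The function name is hypothetical, and treating a pair with both alleles absent, or with any "-", as the 1-row "-" call is an assumption.

```python
# Maps (Reference-row call, SNP-row call) to the 1-row code.
_PAIR_TO_ONE_ROW = {
    ("1", "0"): "0",  # reference allele homozygote
    ("0", "1"): "1",  # SNP allele homozygote
    ("1", "1"): "2",  # heterozygote: both alleles present
}


def two_row_to_one_row(ref_call, snp_call):
    """Convert one sample's 2-row calls to the 1-row code.

    Assumption: any other pair (both absent, or '-') becomes '-'.
    """
    pair = (ref_call.strip(), snp_call.strip())
    return _PAIR_TO_ONE_ROW.get(pair, "-")
```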

Genetic linkage maps

Genetic linkage maps are in PLINK MAP format: https://www.cog-genomics.org/plink/1.9/formats#map

Columns in MAP file:

Chr	Name of genomic scaffold.
Marker	Genetic marker identifier.
Genetic position	Genetic linkage group position (centiMorgans).
Genomic position	Genomic scaffold position (bp).
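
A minimal sketch of parsing one line of such a MAP file follows. It assumes four whitespace-delimited columns in the order listed above and no header row; the record type, function name, and example values are hypothetical.

```python
from typing import NamedTuple


class MapRecord(NamedTuple):
    chrom: str   # name of genomic scaffold
    marker: str  # genetic marker identifier
    cm: float    # genetic position (centiMorgans)
    bp: int      # genomic position (bp)


def parse_map_line(line):
    """Parse one four-column MAP line into a MapRecord."""
    chrom, marker, cm, bp = line.split()
    return MapRecord(chrom, marker, float(cm), int(bp))


# Made-up example line, for illustration only.
record = parse_map_line("chr01\tmarker_001\t12.5\t345678\n")
```

Unpacking into exactly four names makes a malformed line fail loudly instead of being silently truncated.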