Long-read-based draft genome sequence of Indian black gram IPU-94-1 ‘Uttara’: Insights into disease resistance and seed storage protein genes
Data files
Jul 08, 2022 version files: 976.47 MB
- IPU-94-1_‘Uttara’_Asm_Gene_v1.gff (4.89 MB)
- IPU-94-1_‘Uttara’_denovo_v1.fasta (454.22 MB)
- IPU-94-1_‘Uttara’_genome_Ref_Asm_v1.fasta (463.52 MB)
- IPU-94-1_‘Uttara’_Proteins_v1.fasta (12.71 MB)
- IPU-94-1_‘Uttara’_Transcripts_v1.fasta (41.14 MB)
- README_file.txt (359 B)
Abstract
Black gram [Vigna mungo (L.) Hepper var. mungo] is a warm-season legume highly prized for its protein content and its significant folate and iron content. To expedite the genetic enhancement of black gram, a high-quality draft genome from the center of origin of the crop is indispensable. Here, we established a draft genome sequence of an Indian black gram cultivar, ‘Uttara’ (IPU 94-1), known for its high resistance to mungbean yellow mosaic virus. Using Pacific Biosciences of California, Inc. (PacBio) single-molecule real-time (SMRT) and Illumina sequencing, we produced a reference-guided draft assembly with a cumulative size of ~454.4 Mb, of which 444.4 Mb was anchored on 11 pseudomolecules corresponding to the 11 chromosomes. The Uttara assembly shows the features of a high-quality draft genome: a high N50 value (42.88 Mb), high gene completeness (benchmarking universal single-copy ortholog [BUSCO] score 94.17%), and a low percentage of ambiguous nucleotides (N; 0.0005%). Gene discovery using transcript evidence predicted 28,881 protein-coding genes, of which ~95% were functionally annotated. A global survey of genes associated with disease resistance revealed 119 nucleotide binding site–leucine rich repeat (NBS-LRR) proteins, and 23 genes encoding seed storage proteins (SSPs) were discovered in black gram. A large set of microsatellite loci was discovered for marker development in the crop. Our draft genome of an Indian black gram provides foundational genomic resources for the improvement of important agronomic traits and will ultimately help accelerate black gram breeding programs.
Methods
The draft genome sequence of Vigna mungo (L.) Hepper var. mungo was generated using the Pacific Biosciences Sequel II platform. The sequence data were assembled to produce the draft genome.
Usage notes
The .gff file is a tab-delimited text file that can be opened in MS Excel or a similar program. The FASTA files can be opened with a plain-text editor or any genome analysis program suite.
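For users who prefer to inspect the files programmatically, the following is a minimal sketch in Python (standard library only) that summarises a FASTA file (number of sequences, total length, N50, percentage of N) and counts feature types in a GFF file. The function names are illustrative and are not part of the dataset. The sketch assumes the GFF follows the usual nine-column tab-delimited layout, and it computes N50 over the sequences exactly as they appear in the file, which may not match how the values reported in the abstract were derived.

```python
from collections import Counter


def read_fasta_lengths(lines):
    """Yield (record_id, length, n_count) for each record in FASTA-formatted lines."""
    name, length, n_count = None, 0, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length, n_count
            parts = line[1:].split()
            name, length, n_count = (parts[0] if parts else ""), 0, 0
        elif name is not None:
            length += len(line)
            n_count += line.upper().count("N")
    if name is not None:
        yield name, length, n_count


def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def assembly_stats(fasta_path):
    """Summarise a FASTA file: sequence count, total bp, N50, and percent N."""
    with open(fasta_path, encoding="utf-8") as handle:
        records = list(read_fasta_lengths(handle))
    lengths = [length for _, length, _ in records]
    total = sum(lengths)
    n_total = sum(n for _, _, n in records)
    return {
        "sequences": len(records),
        "total_bp": total,
        "n50_bp": n50(lengths),
        "n_percent": 100.0 * n_total / total if total else 0.0,
    }


def count_gff_features(gff_path):
    """Count rows per feature type (column 3), skipping comments and blank lines."""
    counts = Counter()
    with open(gff_path, encoding="utf-8") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 3:
                counts[fields[2]] += 1
    return counts
```

For example, `assembly_stats("IPU-94-1_‘Uttara’_genome_Ref_Asm_v1.fasta")` returns a dictionary of summary values, and `count_gff_features("IPU-94-1_‘Uttara’_Asm_Gene_v1.gff")` returns a `Counter` keyed by feature type. The file is read line by line, so only per-record totals are held in memory.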