Genomic assembly of Botryococcus braunii race B (Metzger et al. 1988) Ayame strain

Cite this dataset

Moore, Robert; Ball, Andrew (2021). Genomic assembly of Botryococcus braunii race B (Metzger et al. 1988) Ayame strain [Dataset]. Dryad.


This dataset is a genomic assembly (v1.0) of DNA from a culture of Botryococcus braunii race B, Ayame strain. The strain was isolated in 1984 by Metzger et al. (Phytochemistry (1988), 27, 1383-1388). The site of isolation was the barrier lake of Ayame, Côte d'Ivoire, on 24 Feb 1984 (pH 5.7, water temp 30.8). The strain was then cultured in a lab for 25 years until it was selected for this de novo genome assembly project. The strain produces biofuels consisting of methylated triterpenoids termed "botryococcenes."

The project objectives and achievements were to selectively identify and assemble full-length triterpenoid synthase genes, annotate them, determine the number of paralogues in this set of genes, and investigate their evolution via a bioinformatic process. The annotations of the selected genes are housed elsewhere at NCBI as GenBank accessions KU248134, KU248135, KU248136 and KU248137, while the rest of the assembly is documented here.

Genes in the assembly are from across the entire genome, and commensals that were present in the culture are also deliberately represented in the dataset. The lab leading the publication (Ball lab) is a bioremediation lab, and the commensals are useful remediation organisms. Further, genes upstream in the triterpenoid synthesis pathway, other than the terminal genes that were targeted, are in the dataset and are of interest to the wider molecular microbiology community.


The sequencing centre was Illumina R&D, Hayward, CA, USA. The software used was SOAPdenovo 1.04. All other details of culturing and annotation are contained in the Frontiers in Microbiology (FIM) paper.

Briefly, detection of KU248134, KU248135, KU248136 and KU248137 (contained in this v1.0 dataset along with many other genomic loci) was via tBLASTn at NCBI, then positive hits were tested via BLASTx against PDB at NCBI. Mining was iterative, using each hit as a bait against the assembly, thus pulling out the entire set of paralogues. Other baits included green algal squalene synthases from B. braunii, Chlamydomonas reinhardtii, and Volvox carteri.

Annotation was manual, and intron-exon and exon-intron boundaries were identified manually as described in the FIM paper. Introns were crossed by means of 2 kb and 4 kb paired-end libraries sequenced at Illumina. The objective and achievement was to cross introns to link exons, rather than to sequence in full through introns. Rigorous assembly by SOAPdenovo with tight constraints confirmed that exonic regions had been reliably assembled into contigs.

Analysis was via phylogenetic tree building, using four programs that each gave solid bootstrap support for a gene branching order. These phylogenetic programs were TNT, FastME p-distance, UPGMA Poisson, and ProtPars. An evolutionary scenario or model was suggested for the order and drivers of the gene duplications involved, beginning from a squalene synthase gene. Processing of the genomic dataset as a whole is ongoing. The v1.0 assembly is also a suitable comparator dataset for assemblies of other B. braunii genomes, of which there is at least one other to date.
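
The iterative mining step described above (each hit reused as a bait until no new hits appear) can be illustrated with a small sketch. This is not the authors' pipeline: the function name mine_paralogues, the search callable, and the toy bait and contig names are all hypothetical, and the search function is a stand-in for whatever similarity search (for example tBLASTn) is actually run against the assembly.

```python
from collections import deque
from typing import Callable, Iterable, Set


def mine_paralogues(
    initial_baits: Iterable[str],
    search: Callable[[str], Iterable[str]],
) -> Set[str]:
    """Reuse every new hit as a bait until a round produces no new hits.

    `search` is a hypothetical stand-in for a similarity search against
    the assembly; it takes a bait identifier and returns hit identifiers.
    """
    found: Set[str] = set()
    queue = deque(initial_baits)
    seen_baits: Set[str] = set(queue)
    while queue:
        bait = queue.popleft()
        for hit in search(bait):
            if hit not in found:
                found.add(hit)
            if hit not in seen_baits:
                # A hit not yet used as a bait is queued for its own search.
                seen_baits.add(hit)
                queue.append(hit)
    return found


# Toy data (invented for illustration): each bait maps to the contigs it hits.
toy_hits = {
    "bait_SQS": ["contig_A", "contig_B"],
    "contig_A": ["contig_A", "contig_B", "contig_C"],
    "contig_B": ["contig_A"],
    "contig_C": ["contig_D"],
    "contig_D": [],
}


def toy_search(bait: str) -> Iterable[str]:
    """Look up the invented hits for a bait; unknown baits hit nothing."""
    return toy_hits.get(bait, [])


paralogues = mine_paralogues(["bait_SQS"], toy_search)
print(sorted(paralogues))
```

In this toy run a single starting bait reaches all four contigs, because contig_C and contig_D are only found once earlier hits are themselves used as baits. The loop stops when the queue of unused baits is empty.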

Usage notes

The dataset is intended as a repository for supplementary data pertaining to a manuscript submitted to Frontiers in Microbiology in Feb 2021:

Robert B Moore, Michael Barnathan, Brian Fristensky, Yan Li, Gregory Knowles, Paul Gardner-Stephen, Angelo Bueti, Peter Anderson, Jian G Qin, and Andrew S Ball (2021, submitted). Parsimony and distance approaches resolve a triterpenoid synthase gene tree in a green algal species tree of squalene synthases. Frontiers in Microbiology.

The text totals ~500 MB, which is too big for most word processors to open, so the user will want to know which reader to view it in.
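
A large text file can also be inspected from a script without loading it all into memory. The sketch below is only an illustration: the helper names preview and count_lines are hypothetical, and the small demo file it writes (with invented FASTA-like content) stands in for the real dataset, whose file name and exact format are not assumed here.

```python
import os
import tempfile
from itertools import islice


def preview(path, n_lines=20, max_chars=120):
    """Return the first n_lines of a text file, each cut to max_chars."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # islice stops after n_lines, so the rest of the file is never read.
        return [line.rstrip("\n")[:max_chars] for line in islice(handle, n_lines)]


def count_lines(path):
    """Count lines by streaming through the file one line at a time."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


# Demo on a tiny invented file; replace demo_path with the real file path.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo_assembly.txt")
with open(demo_path, "w", encoding="utf-8") as out:
    out.write(">demo_contig_1\nACGTACGT\n>demo_contig_2\nTTGACA\n")

for line in preview(demo_path, n_lines=3):
    print(line)
print("lines:", count_lines(demo_path))
```

Both helpers read line by line, so memory use depends on the longest line rather than on the file size. If a file stores a very long sequence on a single line, a chunked read would be the safer choice.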

We suggest viewing it in the text editor "EditPad Lite 8", available free from: