Data from: Mining for single nucleotide polymorphisms and insertions / deletions in expressed sequence tag libraries of oil palm
Riju, Aykkal; Chandrasekar, A.; Arunachalam, Vadivel (2008), Data from: Mining for single nucleotide polymorphisms and insertions / deletions in expressed sequence tag libraries of oil palm, Dryad, Dataset, https://doi.org/10.5061/dryad.157
The oil palm is a tropical oil bearing tree. Recently EST-derived SNPs and SSRs are a free by-product of the currently expanding EST (Expressed Sequence Tag) data bases. The development of high-throughput methods for the detection of SNPs (Single Nucleotide Polymorphism) and small indels (insertion / deletion) has led to a revolution in their use as molecular markers. Available (5452) Oil palm EST sequences were mined from dbEST of NCBI. CAP3 program was used to assemble EST sequences into contigs. Candidate SNPs and Indel polymorphisms were detected using the perl script auto_snip version 1.0 which has used 576 ESTs for detecting SNPs and Indel sites. We found 1180 SNP sites and 137 indel polymorphisms with frequency 1.36 SNPs / 100 bp. Among the six tissues from which the EST libraries had been generated, mesocarp had high frequency of 2.91 SNPs and indels per 100 bp whereas the zygotic embryos had lowest frequency of 0.15 per 100 bp. We also used the Shannon index to analyze the proportion of ten possible types of SNP/indels. ESTs from tissues of normal apex showed highest values of Shannon index (0.60) whereas abnormal apex had least value (0.02). The present report deals the use of Shannon index for comparing SNP/ indel frequencies mined from ESTlibraries and also confirm that the frequency of SNP occurrence in oil palm to use them as markers for genetic studies.