Dryad

Data from: Accuracy of allele frequency estimation using pooled RNA-Seq

Cite this dataset

Konczal, Mateusz et al. (2013). Data from: Accuracy of allele frequency estimation using pooled RNA-Seq [Dataset]. Dryad. https://doi.org/10.5061/dryad.bh23t

Abstract

For non-model organisms, genome-wide information that describes functionally relevant variation may be obtained by RNA-Seq following de novo transcriptome assembly. While sequencing has become relatively inexpensive, the preparation of a large number of sequencing libraries remains prohibitively expensive for population genetic analyses of non-model species. Pooling samples may then be an attractive alternative. To test whether pooled RNA-Seq accurately predicts true allele frequencies, we analyzed the liver transcriptomes of 10 bank voles. Each sample was sequenced both as an individually barcoded library and as part of a pool. Equal amounts of total RNA from each vole were pooled prior to mRNA selection and library construction. Reads were mapped onto the de novo assembled reference transcriptome. High-quality genotypes for individual voles, determined for 23,682 SNPs, provided information on “true” allele frequencies; allele frequencies estimated from the pool were then compared to these values. “True” frequencies and those estimated from the pool were highly correlated. Mean relative estimation error was 21% and did not depend on expression level. However, we also observed minor effects of inter-individual variation in gene expression and of allele-specific gene expression on the accuracy of allele frequency estimation. Moreover, we observed a strong negative relationship between minor allele frequency and relative estimation error. Our results indicate that pooled RNA-Seq exhibits accuracy comparable to pooled genome resequencing, but variation in expression level between individuals should be assessed and accounted for. This should help to take into account the difference in accuracy between conservatively expressed transcripts and those that are variable in expression level.
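The comparison described in the abstract can be illustrated with a minimal sketch. It assumes that the "true" frequency of a SNP is the share of alternative alleles among the diploid genotypes of the individually sequenced voles, that the pooled estimate is the share of reads carrying the alternative allele, and that relative estimation error is |estimated − true| / true. These definitions, the function names, and the two example SNPs are illustrative assumptions and are not taken from the dataset or from the authors' pipeline.

```python
from statistics import mean


def true_allele_frequency(genotypes):
    """Frequency of the alternative allele from diploid genotypes.

    genotypes: alternative-allele count per individual (0, 1 or 2).
    """
    return sum(genotypes) / (2 * len(genotypes))


def pooled_allele_frequency(alt_reads, total_reads):
    """Frequency estimate from a pool: share of reads with the alternative allele."""
    return alt_reads / total_reads


def relative_error(estimated, true):
    """Assumed definition: absolute difference scaled by the true frequency."""
    return abs(estimated - true) / true


# Hypothetical SNPs for 10 individuals (made-up numbers, not from the dataset).
snps = [
    {"genotypes": [0, 1, 1, 0, 2, 0, 0, 1, 0, 0], "alt_reads": 30, "total_reads": 100},
    {"genotypes": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], "alt_reads": 45, "total_reads": 100},
]

errors = []
for snp in snps:
    p_true = true_allele_frequency(snp["genotypes"])
    p_pool = pooled_allele_frequency(snp["alt_reads"], snp["total_reads"])
    errors.append(relative_error(p_pool, p_true))

mean_relative_error = mean(errors)
print(f"mean relative estimation error: {mean_relative_error:.1%}")
```

Under this assumed definition, the same absolute deviation produces a larger relative error at low allele frequencies, which is in line with the negative relationship between minor allele frequency and relative estimation error reported above.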

Usage notes

Location

Poland