Paired human macrophage RNA sequencing data
Cite this dataset
Fan, Jiaxin et al. (2020). Paired human macrophage RNA sequencing data [Dataset]. Dryad. https://doi.org/10.5061/dryad.866t1g1nb
Allele-specific expression (ASE) analysis, which quantifies the relative expression of two alleles in a diploid individual, is a powerful tool for identifying cis-regulated gene expression variations that underlie phenotypic differences among individuals. Existing methods for gene-level ASE detection analyze one individual at a time and therefore fail to account for shared information across individuals. Failure to accommodate such shared information not only reduces power, but also makes it difficult to interpret results across individuals. However, when only RNA sequencing (RNA-seq) data are available, ASE detection across individuals is challenging because the data often include individuals that are either heterozygous or homozygous for the unobserved cis-regulatory SNP. This leads to sample heterogeneity, as only the heterozygous individuals are informative for ASE, whereas the homozygous individuals have balanced expression. To simultaneously model multi-individual information and account for such heterogeneity, we developed ASEP, a mixture model with a subject-specific random effect to account for multi-SNP correlations within the same gene. ASEP only requires RNA-seq data, and is able to detect gene-level ASE under one condition and differential ASE between two conditions (e.g., pre- versus post-treatment). Extensive simulations demonstrated the convincing performance of ASEP under a wide range of scenarios. We applied ASEP to a human kidney RNA-seq dataset, identified ASE genes and validated our results with two published eQTL studies. We further applied ASEP to a human macrophage RNA-seq dataset, identified genes showing evidence of differential ASE between M0 and M1 macrophages, and confirmed our findings with results from cardiometabolic trait-relevant genome-wide association studies. To the best of our knowledge, ASEP is the first method for gene-level ASE detection at the population level that only requires the use of RNA-seq data.
With the growing adoption of RNA-seq, we believe ASEP will be well-suited for various ASE studies for human diseases.
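To make the quantity behind ASE concrete, the sketch below computes the allelic ratio at a single heterozygous SNP in one individual and tests it against balanced expression (a ratio of 0.5) with an exact two-sided binomial test. This illustrates the one-individual-at-a-time style of analysis that the description contrasts with ASEP; it is not the ASEP mixture model. The function names and the read counts in the usage lines are hypothetical.

```python
from math import comb


def allelic_ratio(ref_reads, alt_reads):
    """Fraction of reads carrying the reference allele at one SNP."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return ref_reads / total


def binomial_two_sided_p(ref_reads, alt_reads):
    """Exact two-sided binomial p-value for H0: allelic ratio = 0.5.

    Under H0 every outcome k has probability comb(n, k) / 2**n, so the
    p-value is the total probability of outcomes that are no more likely
    than the observed one.
    """
    n = ref_reads + alt_reads
    observed = comb(n, ref_reads)
    tail = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return tail / 2 ** n


# Hypothetical counts: 30 reference reads and 10 alternative reads.
ratio = allelic_ratio(30, 10)
p_value = binomial_two_sided_p(30, 10)
```

A test of this kind treats each SNP and each individual separately, so it cannot share information across individuals or across SNPs within the same gene.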
Raw RNA sequencing data were aligned to the hg19 human reference genome using STAR, and mapping bias was removed using WASP. Genetic variants were called using mpileup, and variants with a base quality score lower than 20 were removed. ANNOVAR was used to annotate the genetic variants, and only SNPs mapped to exonic regions were kept for analysis.
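The two filtering rules in the processing description (drop variants with a base quality score below 20, keep only exonic SNPs) can be sketched as follows. This is a minimal illustration that assumes each variant has already been parsed into a dictionary; the field names `base_quality` and `region` and the toy records are hypothetical and do not reflect the actual file layout produced by mpileup or ANNOVAR.

```python
MIN_BASE_QUALITY = 20  # threshold stated in the processing description


def keep_variant(variant):
    """Return True if a variant passes both filters.

    Assumes hypothetical keys: 'base_quality' (numeric score) and
    'region' (functional annotation such as 'exonic' or 'intronic').
    """
    passes_quality = variant["base_quality"] >= MIN_BASE_QUALITY
    is_exonic = variant["region"] == "exonic"
    return passes_quality and is_exonic


def filter_variants(variants):
    """Keep only variants that pass the quality and exonic filters."""
    return [v for v in variants if keep_variant(v)]


# Toy records for illustration only; not taken from the dataset.
toy_variants = [
    {"id": "snp_a", "base_quality": 35, "region": "exonic"},
    {"id": "snp_b", "base_quality": 12, "region": "exonic"},
    {"id": "snp_c", "base_quality": 40, "region": "intronic"},
    {"id": "snp_d", "base_quality": 20, "region": "exonic"},
]
kept_ids = [v["id"] for v in filter_variants(toy_variants)]
```

Because the description removes scores "lower than 20", a variant with a score of exactly 20 is retained in this sketch.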
National Institutes of Health Clinical Center, Award: R01HL113147