single cell TBLDA
BEEHIVE alumna Dr. Ari Gewirtz and @will_townes post preprint Expression QTLs in single-cell data that describes single-cell telescoping bimodal latent Dirichlet allocation (scTBLDA) applied to population-scale single-cell RNA-seq data to find expression QTLs.
These beautiful scRNA-seq and paired genotype data come from Jimmie Yee's Lab and include approximately 400,000 peripheral blood mononuclear cells (PBMCs) from 119 women with systemic lupus erythematosus (SLE).
We extend our method, telescoping bimodal latent Dirichlet allocation (TBLDA), that identifies covarying genotypes and gene expression values when the matching from samples to cells is not one-to-one in order to allow cell-type label agnostic discovery of eQTLs in noncomposite scRNA-seq data. In particular, we add GPU-compatibility, sparse priors, and amortization to enable fast inference on large-scale scRNA-seq data.
We find complex cell-type heterogeneity within specific immune cell-type labels.
We use linked genes and SNPs to identify 205 cis-eQTLS, 66 trans-eQTLs, and 53 cell type proportion QTLs. Our results demonstrate the ability of scTBLDA to identify genes involved in cell-type specific regulatory processes associated with SNPs in single-cell data.
We found broad replication of our findings in three scRNA-seq studies: [van der Wijst et al. 2018], [Vosa et al. 2021], and [Fairfax et al. 2012].
We found enrichment of our trans-eQTLs in enhancers using stratified LD score regression.
scTBLDA is an important step forward in methods for eQTL analysis in single-cell sequencing studies. Our approach shifts the paradigm of mapping QTLs from univariate analyses to an analysis that more accurately captures statistical complexities and biological models of regulation.
Code available for scTBLDA -- try it on your population-wide single-cell sequencing data! Feedback welcome!