Publications

Preprints

B Zhang, S Nyquist, A Jones, BE Engelhardt, D Li (2024). Contrastive linear regression. [arXiv]

S Jena*, A Verma*, BE Engelhardt (2023). Answering open questions in biology using spatial genomics and structured methods. [arXiv]

MM Zhang, GW Gundersen, BE Engelhardt (2023). Bayesian nonlinear latent variable modeling via random Fourier features. [arXiv]

A Mandyam, D Li, D Cai, A Jones, BE Engelhardt (2023). Kernel density Bayesian inverse reinforcement learning. [arXiv]

AD Gewirtz, FW Townes, BE Engelhardt (2022). Expression QTLs in single-cell sequencing data. [bioRxiv]

A Jones, G Gundersen, BE Engelhardt (2022). Linking histology and molecular state across human tissues. [bioRxiv]

D Li*, A Jones*, S Banerjee, BE Engelhardt. Multi-group Gaussian processes. [arXiv] [Code]

IN Grabski, R de Vito, BE Engelhardt. Bayesian ordinal quantile regression with a partially collapsed Gibbs sampler. [arXiv] [Code]

A Verma, BE Engelhardt. A Bayesian nonparametric semi-supervised model for integration of multiple single-cell experiments. [bioRxiv] [Code]

R de Vito, IN Grabski, D Aguiar, LM Schneper, A Verma, J Castillo Fernandez, C Mitchell, JT Bell, S McLanahan, DA Notterman, BE Engelhardt. Differentially methylated regions and methylation QTLs for teen depression and early puberty in the Fragile Families Child Wellbeing Study. [bioRxiv]

B Dumitrascu, K Feng, BE Engelhardt. GT-TS: Experimental design for maximizing cell type discovery in single-cell data. [bioRxiv] [Talk]

AJB Chaney, A Verma, Y Lee, BE Engelhardt. Nonparametric deconvolution models. [arXiv] [Code]

C Gao, CD Brown, BE Engelhardt. A latent factor model with a mixture of sparse and dense factors to model gene expression data affected by technical and biological covariates. [arXiv]

Publications

A Mandyam, M Joerke, BE Engelhardt, E Brunskill (2024). Adaptive interventions with user-defined goals for health behavior change (Conference on Health, Inference, and Learning, CHIL) [arXiv]

D Li*, A Jones*, BE Engelhardt. Probabilistic contrastive principal component analysis. Annals of Applied Statistics (AOAS; accepted). [arXiv] [Code]

A Jones, D Cai, D Li, BE Engelhardt (2023). Optimizing the design of spatial genomic studies. Nature Communications (accepted). [bioRxiv] [Code] [Post]

A Mandyam, J Yao, A Jones, K Laudanski, BE Engelhardt (2023). Compositional Q-learning for electrolyte repletion with imbalanced patient sub-populations. Machine Learning for Health (ML4H). Honorable Mention, Proceedings Track. [PDF]

J Yao, F Doshi-Velez, BE Engelhardt (2023). Inverse reinforcement learning with multiple planning horizons (NeurIPS Generalization in Planning Workshop). [PDF]

A Jones, FW Townes, D Li, BE Engelhardt (2023). Alignment of spatial genomics and histology data using deep Gaussian processes. Nature Methods doi.org/10.1038/s41592-023-01972-2 [PDF] [bioRxiv] [Code] [Post]

FW Townes, BE Engelhardt (2023). Nonnegative spatial factorization for spatial genomics data. Nature Methods doi.org/10.1038/s41592-022-01687-w [PDF] [arXiv] [Code]

T Fitzgerald, A Jones, BE Engelhardt (2022). A Poisson reduced-rank regression model for association mapping in sequencing data. BMC Bioinformatics 23(1):529. [PDF] [Code] [Post]

L Okamoto, A Jones, A Verma, BE Engelhardt (2022). Spatially-aware dimension reduction of transcriptomics data. NeurIPS Workshop on Learning Meaningful Representations of Life. [PDF]

AD Gewirtz, FW Townes, BE Engelhardt (2022). Telescoping bimodal latent Dirichlet allocation to identify expression QTLs across tissues. Life Science Alliance 5(12):e202101297. [PDF] [Code] [Post]

SG Jena, AG Goglia, BE Engelhardt (2022). Towards ‘end-to-end’ analysis and understanding of biological timecourse data. Biochemical Journal 479(11): 1257–1263. [PDF]

G Martinet, A Strzalkowski, BE Engelhardt (2022). Variance minimization in the Wasserstein space for invariant causal prediction. Artificial Intelligence and Statistics (AISTATS). [PDF] [Code] [Post]

A Jones, FW Townes, D Li, BE Engelhardt (2022). Contrastive latent variable modeling with application to case-control sequencing experiments. Annals of Applied Statistics (AOAS). [PDF] [arXiv] [Code]

S Cui, EC Yoo, D Li, K Laudanski, BE Engelhardt (2022). Hierarchical Gaussian processes and mixtures of experts to model Covid-19 patient trajectories. Proceedings of the Pacific Symposium on Biocomputing (PSB). [PDF] [Code]

J Ash*, G Darnell*, D Munro*, BE Engelhardt (2021). Joint analysis of gene expression levels and histological images identifies genes associated with tissue morphology. Nature Communications 12(1609). [PDF] [Code]

B Dumitrascu, S Villar, DG Mixon, BE Engelhardt (2021). Optimal marker gene selection for cell type discrimination in single cell analyses. Nature Communications 12(1186). [PDF] [Code]

J Lu, B Dumitrascu, IC McDowell, AK Barrera, SM Leichter, TE Reddy, BE Engelhardt (2021). Causal network inference from gene transcriptional time series response to glucocorticoids. PLoS Computational Biology 17(1):e1008223. [PDF] [Code]

GW Gundersen, D Cai, C Zhou, BE Engelhardt, RP Adams (2021). Active multi-fidelity Bayesian online changepoint detection. Uncertainty in Artificial Intelligence (UAI). [PDF] [Code]

A Wu, SA Nastase, CA Baldassano, NB Turk-Browne, KA Norman, BE Engelhardt, JW Pillow (2021). Brain kernel: A new spatial covariance function for fMRI data. NeuroImage 15(245):118580. [PDF]

A Verma, SG Jena, DR Isakov, K Aoki, JE Toettcher, BE Engelhardt (2021). A self-exciting point process to study multi-cellular spatial signaling patterns. Proceedings of the National Academy of Sciences (PNAS) 118:e2026123118. [PDF] [Code]

A Mandyam, EC Yoo, J Soules, K Laudanski, BE Engelhardt (2021). COP-E-CAT: Cleaning and organization pipeline for EHR computational and analytic tasks. Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (BCB). [PDF] [Code]

GW Gundersen, MM Zhang, BE Engelhardt (2021). Latent variable modeling with random features. Artificial Intelligence and Statistics (AISTATS). [PDF] [Code]

N Prasad, BE Engelhardt, F Doshi-Velez (2020). Defining admissible rewards for high confidence policy evaluation. ACM Conference on Health, Inference, and Learning (CHIL). [PDF]

L-F Cheng, B Dumitrascu, MM Zhang, C Chivers, ME Draugelis, K Li, BE Engelhardt (2020). Patient-specific effects of medication using latent force models with Gaussian processes. Artificial Intelligence and Statistics (AISTATS). [PDF]

M Salganik et al. (2020). Measuring the predictability of life outcomes with a scientific mass collaboration. Proceedings of the National Academy of Sciences (PNAS) 117(15):8398-8403. [PDF]

L-F Cheng, G Darnell, C Chivers, ME Draugelis, K Li, BE Engelhardt (2020). Sparse multi-output Gaussian processes for medical time series prediction. BMC Medical Informatics and Decision Making 20(152). [PDF] [Code]

F Camerlenghi*, B Dumitrascu*, F Ferrari, BE Engelhardt, S Favaro (2020). Nonparametric Bayesian multi-armed bandits for single cell experiment design. Annals of Applied Statistics (AOAS) 14(4):2003-2019. [PDF] [Code]

M Oliva, GTEx Consortium, et al. (2020). The impact of sex on gene expression across human tissues. Science 369(6509). [PDF]

GTEx Consortium. (2020). The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369:1318–1330. [PDF]

D Gill, M Arvanitis, P Carter, AHI Cordero, B Jo, V Karhunen, SC Larsson, X Li, SM Lockhart, A Mason, E Pashos, A Saha, VY Tan, V Zuber, Y Bossé, S Fahle, K Hao, T Jiang, P Joubert, AC Lunt†, WH Ouwehand, DJ Roberts, W Timens, M van den Berge, NA Watkins, A Battle, AS Butterworth, J Danesh, ED Angelantonio, BE Engelhardt, JE Peters, DD Sin, S Burgess (2020). ACE inhibition and cardiometabolic risk factors, lung ACE2 and TMPRSS2 gene expression, and plasma ACE2 levels: a Mendelian randomization study. Royal Society Open Science 7:200958. [PDF]

A Verma, BE Engelhardt (2020). A robust nonlinear low-dimensional manifold for single cell RNA-seq data. BMC Bioinformatics 21(324). [PDF] [Code]

R Elyanow, B Dumitrascu, BE Engelhardt, BE Raphael (2020). netNMF-sc: Leveraging gene-gene interactions for imputation and dimensionality reduction in single-cell expression analysis. Genome Research, 30(2):195–204. [PDF]

L-F Cheng, N Prasad, BE Engelhardt (2019). An optimal policy for patient laboratory tests in intensive care units. Proceedings of the Pacific Symposium on Biocomputing (PSB). [PDF]

G Guan, BE Engelhardt (2019). Predicting sick patient volume in a pediatric outpatient setting using time series analysis. Proceedings of Machine Learning for Health Care (MLHC). [PDF]

G Gundersen, B Dumitrascu, BE Engelhardt (2019). End-to-end training of deep probabilistic CCA on paired biomedical observations. Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI). [PDF]

IC McDowell, D Manandhar, CM Vockley, A Schmid, TE Reddy*, BE Engelhardt* (2018). Clustering gene expression time series data using an infinite Gaussian process mixture model. PLoS Computational Biology, 14(1):e1005896. [PDF] [Code]

B Dumitrascu, G Darnell, J Ayroles, BE Engelhardt (2018). A Bayesian test to identify variance effects. Bioinformatics, bty565. [PDF] [Code]

IC McDowell, A Barrera, AM D’Ippolito, CM Vockley, LK Hong, SM Leichter, LC Bartelt, WH Majoros, L Song, A Safi, DD Koçak, CA Gersbach, AJ Hartemink, GE Crawford, BE Engelhardt, TE Reddy (2018). Glucocorticoid receptor recruits to enhancers and drives activation by motif-directed binding. Genome Research, 28(9):1272-1284. [PDF]

D Aguiar, L Cheng, B Dumitrascu, F Mordelet, AA Pai, BE Engelhardt (2018). BIISQ: Bayesian nonparametric discovery of Isoforms and Individual Specific Quantification. Nature Communications 9(1681). [PDF] [Code]

AJB Chaney, BM Stewart, BE Engelhardt (2018). How algorithmic confounding in recommendation systems increases homogeneity and decreases utility. Proceedings of the 12th ACM Conference on Recommender Systems. [PDF]

B Dumitrascu*, K Feng*, BE Engelhardt (2018). PG-TS: Improved Thompson sampling for logistic contextual bandits. Proceedings of Neural Information Processing Systems (NeurIPS). [PDF]

JD Cohen, N Daw, BE Engelhardt, U Hasson, K Li, Y Niv, KA Norman, J Pillow, PJ Ramadge, NB Turk-Browne, TL Willke (2017). Computational approaches to fMRI analysis. Nature Neuroscience 20(3):304–313. [PDF]

S Srivastava, BE Engelhardt, DB Dunson (2017). Expandable factor analysis. Biometrika 104(3):649–663. [PDF]

GTEx Consortium, A Battle*, CD Brown*, BE Engelhardt*, SM Montgomery* (2017). Genetic effects on gene expression across human tissues. Nature 550: 204–213. [PDF]

S Zhao, BE Engelhardt, S Mukherjee, DB Dunson (2017). Fast moment estimation for generalized latent Dirichlet models. Journal of the American Statistical Association (JASA) 113(524):1528–1540. [PDF]

G Darnell, S Georgiev, S Mukherjee, BE Engelhardt (2017). Adaptive randomized dimension reduction on massive data. Journal of Machine Learning Research (JMLR) 18(140):1-30. [PDF]

A Saha, Y Kim*, ADH Gewirtz*, B Jo, C Gao, IC McDowell, GTEx Consortium, BE Engelhardt*, A Battle* (2017). Co-expression networks reveal the tissue-specific regulation of transcription and splicing. Genome Research 27(11):1843–1858. [PDF]

N Prasad, L-F Cheng, C Chivers, M Draugelis, BE Engelhardt (2017). A reinforcement learning approach to weaning of mechanical ventilation in intensive care units. Proceedings of Uncertainty in Artificial Intelligence (UAI). [PDF]

G Jerfel, ME Basbug, BE Engelhardt (2017). Dynamic collaborative filtering with compound Poisson factorization. Proceedings of Artificial Intelligence and Statistics (AISTATS) 738-747. [PDF]

ME Basbug, BE Engelhardt (2016). Hierarchical compound Poisson factorization. Proceedings of the International Conference on Machine Learning (ICML) 1795–1803. [PDF]

C Gao, S Zhao, IC McDowell, CD Brown, BE Engelhardt (2016). Context-specific and differential gene co-expression networks via Bayesian biclustering models. PLoS Computational Biology 12:e1004791. [PDF]

PD Tonner, CD Darnell, BE Engelhardt, A Schmid (2016). Detecting differential growth of microbial populations with Gaussian process regression. Genome Research 27:320-333. [PDF]

S Zhao, C Gao, S Mukherjee, BE Engelhardt (2016). Bayesian group latent factor analysis with structured sparse priors. Journal of Machine Learning Research (JMLR) 17(196):1–47. [PDF]

BE Engelhardt, CD Brown (2015). Diving deeper to predict noncoding sequence function. Nature Methods (News & Views; not peer-reviewed) 12(10):925–926. [PDF]

SM van Den Berg, MH de Moor, KJ Verweij, RF Krueger, M Luciano, AA Vasquez, LK Matteson, J Derringer, T Esko, N Amin, et al. (2015). Meta-analysis of genome-wide association studies for extraversion: Findings from the Genetics of Personality Consortium. Behavior Genetics 46(2):170–182. [PDF]

W Zhang, T Spector, P Deloukas, JT Bell, BE Engelhardt (2015). Predicting genome-wide DNA methylation using methylation marks, genomic position, and DNA regulatory elements. Genome Biology, 16(1):14. [PDF]

D Mimno, DM Blei, BE Engelhardt (2015). Posterior predictive checks to quantify lack-of-fit in admixture models of latent population structure. Proceedings of the National Academy of Sciences (PNAS), 112(26):E3341–50. [PDF] [Code]

Genetics of Personality Consortium. (2015). Meta-analysis of genome-wide association studies for neuroticism, and the polygenic association with major depressive disorder. JAMA Psychiatry, 72(7):642-650. [PDF]

AB Hart, ER Gamazon, BE Engelhardt, P Sklar, AK Kähler, CM Hultman, PF Sullivan, BM Neale, SV Faraone, H de Wit, NJ Cox, A Palmer (2014). Genetic variation associated with euphorigenic effects of d-amphetamine is associated with diminished risk for schizophrenia and attention deficit hyperactivity disorder. Proceedings of the National Academy of Sciences (PNAS), 111(16):5968–73. [PDF]

LM Mangravite*, BE Engelhardt*, MW Medina, JD Smith, CD Brown, DI Chasman, BH Mecham, B Howie, H Shim, D Naidoo, Q Feng, MJ Rieder, Y Chen, JI Rotter, PM Ridker, JC Hopewell, S Parish, J Armitage, R Collins, RA Wilke, DA Nickerson, M Stephens, RM Krauss (2013). A statin-dependent QTL for GATM expression is associated with statin-induced myopathy. Nature, 502(7471):377–80. [PDF]

F Mordelet, J Horton, AJ Hartemink, BE Engelhardt, R Gordân (2013). Stability selection for regression-based models of transcription factor-DNA binding specificity. Bioinformatics, 29(13):i117–25. [PDF]

CD Brown, LM Mangravite, BE Engelhardt (2013). Integrative modeling of eQTLs and cis-regulatory elements suggests mechanisms underlying cell type specificity of eQTLs. PLoS Genetics, 9(8):e1003649. [PDF]

KE Muratore, BE Engelhardt, JR Srouji, MI Jordan, SE Brenner, JF Kirsch (2013). Molecular function prediction for a family exhibiting evolutionary tendencies toward substrate specificity swapping: Recurrence of tyrosine aminotransferase activity in the Iα subfamily. Proteins: Structure, Function, and Bioinformatics, 81(9):1593–1609. [PDF]

AB Hart*, BE Engelhardt*, MC Wardle, G Sokoloff, M Stephens, H de Wit, A Palmer (2012). Genome-wide association study of d-amphetamine response in healthy volunteers identifies putative associations, including cadherin 13 (CDH13). PLoS One, 7(8):e42646. [PDF]

BE Engelhardt, MI Jordan, JR Srouji, SE Brenner (2011). Genome-scale phylogenetic function annotation of large and diverse protein families. Genome Research, 21(11):1969–80. [PDF]

BE Engelhardt, M Stephens (2010). Analysis of population structure: A unifying framework and novel methods based on sparse factor analysis. PLoS Genetics, 6(9):e1001117. [PDF]

JK Pickrell, JC Marioni, AA Pai, JF Degner, BE Engelhardt, E Nkadori, JB Veyrieras, M Stephens, Y Gilad, JK Pritchard (2010). Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature, 464(7289):768–772. [PDF]

BE Engelhardt, MI Jordan, SE Brenner (2006). A graphical model for predicting protein molecular function. Proceedings of the 23rd International Conference on Machine Learning (ICML), 297–304. [PDF]

BE Engelhardt, MI Jordan, KE Muratore, SE Brenner (2005). Protein molecular function prediction by Bayesian phylogenomics. PLoS Computational Biology, 1(5), e45. [PDF]