Title: “Stator: High order expression dependencies finely resolve cryptic states and subtypes in single cell data”
Abstract: Single cells are typically typed by clustering into discrete locations in reduced-dimensional transcriptome space. Here we introduce Stator, a data-driven method that identifies cell (sub)types and states without relying on cells’ local proximity in transcriptome space. Stator assigns multiple labels to the same single cell, not just by type and subtype, but also by state such as activation, maturity or cell cycle sub-phase, by deriving higher-order gene expression dependencies from a sparse gene-by-cell expression matrix. Stator’s finer resolution is clear from analyses of mouse embryonic brain and of healthy or diseased human liver. Rather than assigning only coarse-scale cell type labels, Stator resolves cell types into subtypes, these subtypes into stages of maturity and/or cell cycle phases, and these phases into finer portions. Among cryptically homogeneous embryonic cells, for example, Stator finds 34 distinct radial glia states whose gene expression forecasts their future GABAergic or glutamatergic neuronal fate. Further, Stator’s fine resolution of liver cancer states reveals expression programmes that predict patient survival. We provide Stator as a Nextflow pipeline and ShinyApp.
Bio: Ava Khamseh is a Reader (Associate Professor) in Biomedical AI at the School of Informatics, University of Edinburgh. She is also affiliated with the Institute of Genetics and Cancer, University of Edinburgh, and UC Berkeley’s Center for Targeted Machine Learning and Causal Inference. Her cross-disciplinary research interests are in the development and application of mathematical statistics and causal machine learning methodologies in molecular biology, genomic medicine and health informatics. She is the Deputy Director of the Centre for Doctoral Training in AI for Biomedical Innovation, University of Edinburgh.