Novel Model for Clustering Developmental Data (starting from scANVI)

Hi there, amazing work on developing such a useful model and spinning it into a cool suite of tools.

So I’ve been working on a generative model to cluster developmental data. The idea is essentially to assume that cell states are Gaussian in latent space, but also to allow linear transitions between the cluster centers, modelling a developmental trajectory.

I built the model starting from the amazing simplified scANVI Pyro example, replacing z1 and the nonlinear y. The latent space z is instead generated from Gaussians (y) and a point in [0, 1] between two cluster assignments (psi).
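
To make the idea concrete, here is a minimal sketch of how one latent point could be sampled, written in plain Python rather than Pyro. It is not the code from the notebooks: the function name `sample_latent`, the uniform draw for psi, the isotropic noise `scale`, and allowing both cluster assignments to coincide are all assumptions made for illustration.

```python
import random


def sample_latent(centers, scale=0.1, rng=None):
    """Draw one latent point z near the segment between two cluster centers.

    Hypothetical helper for illustration only; not taken from the notebooks.
    """
    rng = rng or random.Random()
    k = len(centers)
    # y: the cluster the cell belongs to; y_next: the cluster it may be moving toward.
    # Assumption: both are drawn uniformly and may be equal (a cell at rest in one state).
    y = rng.randrange(k)
    y_next = rng.randrange(k)
    # psi: position along the segment, 0 = at center y, 1 = at center y_next.
    # Assumption: uniform on [0, 1).
    psi = rng.random()
    # The mean is the linear interpolation between the two cluster centers.
    mean = [(1 - psi) * a + psi * b for a, b in zip(centers[y], centers[y_next])]
    # Isotropic Gaussian noise around the interpolated mean (assumed shared scale).
    z = [rng.gauss(m, scale) for m in mean]
    return y, y_next, psi, z


# Example: three hypothetical 2-D cluster centers.
centers = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
y, y_next, psi, z = sample_latent(centers, scale=0.05, rng=random.Random(0))
```

With psi close to 0 or 1 the point sits in one of the Gaussian cell states, and intermediate values of psi place it along the straight-line transition between the two centers.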

I’ve put a notebook with my model here:
https://github.com/mtvector/what-cells/blob/main/scANVI_GMMGP1.1.2_Enum.ipynb

Also I started from a basic GMM clustering model here:
https://github.com/mtvector/what-cells/blob/main/scANVI_GMMGP1.1.1_BasicGmm.ipynb

I guess I’m wondering if any of you have thought about similar things or are interested in helping/discussing? I’ve also run into some issues: the clustering is very close to, but not quite as good as, Leiden (based on my knowledge of the truth of the system from extensive previous analysis), so I need to improve it somehow. Also, it doesn’t learn the distribution of values for psi that I would expect…

Thanks so much!

Matthew Schmitz
PhD Student, UCSF
