I’m wondering if anyone has had success in predicting cell type labels from an in vivo reference dataset to an in vitro model of that system, which in theory should contain a subset of the cell types found in the reference, albeit with some non-trivial transcriptomic profile differences.
I’ve tried this a few times, but I have never seen the two datasets integrate, and my query (in vitro) cells always remain labelled as ‘Unknown’.
In this example, rep2 are two in vitro experiments, while D59_fetal is the reference, and all cells in the reference are labelled going in. All cells in rep2 were Unknown going in.
Can you add your code so that we can see how you ran the model?
It may simply be that your in vitro data is too different from the primary in vivo data for the cells to match up. In vitro models of human organs can be quite distinct from each other except for a few key pathways.
Imagine if you tried to transfer labels from PBMCs to the (I assume) retinal cells here. The cells would be too different from each other and wouldn't match up.
My recommendation would be to look at the expression of classical markers for retinal ganglion cells and photoreceptors with the .get_normalized_expression() method on your fitted model.
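As a rough sketch of that check, assuming the fitted model is an scvi-tools model (e.g. SCVI or SCANVI) and that the obs table has a column saying which dataset each cell came from: the helper name marker_summary, the column name "dataset", and the photoreceptor markers in the usage comment are my own placeholders, not something from your setup. Only SNCG is a marker I'd specifically suggest for RGCs.

```python
import pandas as pd


def marker_summary(model, obs: pd.DataFrame, genes, source_key: str = "dataset") -> pd.DataFrame:
    """Mean model-normalized expression of `genes` per group in obs[source_key].

    `model` is assumed to be a fitted scvi-tools model; `obs` is assumed to be
    the obs table of the AnnData the model was set up with.
    """
    # By default this returns a cells x genes DataFrame indexed by cell name.
    # library_size=1e4 rescales the output to roughly counts per 10k.
    expr = model.get_normalized_expression(gene_list=list(genes), library_size=1e4)
    # Align the dataset labels to the expression rows, then average per dataset.
    groups = obs.loc[expr.index, source_key].astype(str)
    return expr.groupby(groups).mean()


# Hypothetical usage (gene names other than SNCG are illustrative assumptions):
# summary = marker_summary(model, adata.obs, ["SNCG", "RCVRN", "CRX"], source_key="dataset")
# print(summary)
```

If the per-dataset means for a marker are in the same ballpark, the cells are at least comparable on that axis and it is worth digging into what else separates them.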
If you find that the expression levels of those few genes line up between the in vitro cells and the in vivo cells, you can learn why the model thinks the cells are different. Pick an expression threshold on a marker, e.g. SNCG for RGCs, to define SNCG+ cells. Then run the .differential_expression() method on only the SNCG+ cells, comparing in vitro vs in vivo. The genes that come out as differentially expressed should tell you why the model is failing to align in vitro cells with in vivo cells.
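Something along these lines is what I have in mind. Again this is a sketch that assumes an scvi-tools model; the helper name de_marker_positive, the "dataset" column, and the threshold value are placeholders I made up, and you would need to pick the threshold by looking at the distribution of the marker in your own data.

```python
import numpy as np
import pandas as pd


def de_marker_positive(model, obs: pd.DataFrame, marker: str, threshold: float,
                       source_key: str, in_vitro, in_vivo):
    """Compare marker-positive in vitro cells against marker-positive in vivo cells.

    `obs` is assumed to be the obs table of the AnnData the model was set up
    with, in the same row order. `in_vitro` / `in_vivo` are lists of values of
    obs[source_key] identifying each side of the comparison.
    """
    # Model-normalized expression of the marker, one row per cell.
    expr = model.get_normalized_expression(gene_list=[marker], library_size=1e4)
    positive = expr[marker].to_numpy() > threshold

    source = obs[source_key].astype(str).to_numpy()
    idx1 = np.flatnonzero(positive & np.isin(source, list(in_vitro)))
    idx2 = np.flatnonzero(positive & np.isin(source, list(in_vivo)))
    if len(idx1) == 0 or len(idx2) == 0:
        raise ValueError("No marker-positive cells on one side; adjust the threshold.")

    # idx1 vs idx2: genes driving the in vitro / in vivo difference
    # within the marker-positive population.
    return model.differential_expression(idx1=idx1, idx2=idx2, mode="change")


# Hypothetical usage (threshold of 1.0 is an arbitrary placeholder):
# de = de_marker_positive(model, adata.obs, "SNCG", 1.0, "dataset",
#                         in_vitro=["rep2"], in_vivo=["D59_fetal"])
# print(de.head(20))
```

If the top hits are mostly stress, culture, or cell cycle genes rather than cell identity genes, that would suggest the in vitro cells are the right type but sit in a different state from the fetal reference.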