What input datasets are appropriate for DestVI?

Hello DestVI devs,

I found DestVI referenced in some high-profile articles and thought it would be an interesting option for a large Visium dataset I am processing. This dataset consists of multiple tissue slices that differ in both time point and disease status. However, I lack a complete single-cell reference to accompany it (unfortunately, the single-cell data that I do have with matched time points and disease states is limited to one cell type).

So, I was wondering whether I could run DestVI with a healthy/steady-state scRNA-seq reference of the tissue, one that contains all the cell types I would expect in the Visium dataset, and still expect good performance. The DestVI vignette I saw gave me that impression (it seems a reference dataset was applied to different treatments), but I wanted to ask directly as well. Any advice on proper usage would be highly appreciated.

Sincerely,
Dillon Brownell

Generally, I would be careful with those cross-condition comparisons. Whether it is appropriate to use healthy tissue as a reference here depends solely on how strong the effect of the disease is.
If the disease effect is similar in size to cross-individual differences (like male vs. female), it will most likely still work. If, for the one cell type where you have the comparison, you see strong differences (meaning there are cell types, by your definition, that are condition-specific), it doesn’t make sense to run any deconvolution algorithm that relies on accompanying single-cell data. Current versions of Space Ranger offer some deconvolution. However, those decomposed factors generally don’t align well with cell-type information.
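As a rough way to act on the comparison above, here is a minimal sketch that checks whether the disease shift in the one matched cell type is larger than the donor-to-donor shift among healthy samples. Everything in it is an assumption for illustration: the pseudobulk profiles are made-up numbers, cosine distance is just one possible choice of metric, and none of this is part of DestVI or scvi-tools.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two expression profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


# Made-up pseudobulk profiles (mean log-expression per gene) for the single
# cell type that has matched healthy and diseased single-cell data.
healthy_donor_a = [5.0, 3.0, 0.5, 4.0]
healthy_donor_b = [5.2, 2.8, 0.6, 4.1]
diseased_donor = [4.8, 0.5, 3.5, 4.0]

# Baseline: how much two healthy individuals differ from each other.
baseline = cosine_distance(healthy_donor_a, healthy_donor_b)

# Disease shift: how much the diseased profile differs from a healthy one.
disease_shift = cosine_distance(healthy_donor_a, diseased_donor)

print(f"healthy vs healthy:  {baseline:.4f}")
print(f"healthy vs diseased: {disease_shift:.4f}")
if disease_shift > baseline:
    print("Disease shift exceeds cross-individual variation in this toy data.")
```

If the disease shift is on the order of the healthy-vs-healthy baseline, a healthy reference is more likely to be usable. If it is many times larger, as in this toy data, the reference probably no longer describes the cells in the diseased tissue.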

Hi cane11, thank you for your response.

So, let’s say the effect of the disease was strong. In that case, would you advise avoiding deconvolution altogether? Basically, I am wondering whether it is possible to perform cell-type-specific, cross-condition DE using a healthy reference. Is this just a bad idea altogether?

-Dillon

I would say in that case it’s altogether a bad idea. Single cell data for deconvolution needs to match the cells expected in a spatial specimen.
Let’s make a plausible example: CD4 T cells can express cytotoxic genes in, e.g., chronic infection. If those cells were not present in your healthy single-cell data, and cytotoxic genes were expressed only in CD8 T cells there, you would overestimate the number of CD8 T cells.
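The CD4/CD8 example can be made concrete with a toy calculation. The sketch below is not DestVI: it uses plain least squares on invented four-gene signatures (the gene order and all expression values are assumptions) to show how a spot containing cytotoxic CD4 cells gets misattributed when the reference only contains healthy CD4 and CD8 profiles.

```python
def least_squares(signatures, spot):
    """Solve min ||S w - spot||^2 via the normal equations (S^T S) w = S^T spot.

    signatures: one expression vector per cell type.
    spot: observed expression vector for a single spatial spot.
    """
    k = len(signatures)
    # Augmented normal-equation matrix [S^T S | S^T spot].
    aug = [
        [sum(a * b for a, b in zip(signatures[i], signatures[j])) for j in range(k)]
        + [sum(a * b for a, b in zip(signatures[i], spot))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Invented signatures; assumed gene order: [CD4, CD8A, cytotoxic gene, housekeeping].
cd4_healthy = [10, 0, 0, 5]
cd8 = [0, 10, 8, 5]
cd4_cytotoxic = [10, 0, 8, 5]  # disease-only state, absent from the healthy reference

# Simulated diseased spot: 8 cytotoxic CD4 cells and 2 CD8 cells.
spot = [8 * a + 2 * b for a, b in zip(cd4_cytotoxic, cd8)]
true_cd8_fraction = 2 / (8 + 2)

# Deconvolve with the healthy-only reference (no cytotoxic CD4 profile).
w_cd4, w_cd8 = least_squares([cd4_healthy, cd8], spot)
healthy_ref_cd8_fraction = w_cd8 / (w_cd4 + w_cd8)

# Deconvolve with a reference that also contains the disease-specific state.
matched = least_squares([cd4_healthy, cd8, cd4_cytotoxic], spot)
matched_ref_cd8_fraction = matched[1] / sum(matched)

print(f"true CD8 fraction:           {true_cd8_fraction:.2f}")
print(f"healthy-reference estimate:  {healthy_ref_cd8_fraction:.2f}")
print(f"matched-reference estimate:  {matched_ref_cd8_fraction:.2f}")
```

With the healthy-only reference, the cytotoxic signal can only be explained by CD8 cells, so the estimated CD8 fraction comes out at roughly twice the true value in this toy setup. Adding the disease-specific CD4 profile to the reference recovers the true fraction.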