MrVI input and interpretation

cane11 · July 25, 2024, 5:26am

Hi, I’m slightly confused because you ask about abundance but copied the code for expression. Can you clarify? Abundance is meant for pairwise comparisons, similar to tools like Milo. What type of output would you expect for more than two groups? For expression, you get DE estimates for the learnt linear model coefficients in latent space, which are also always pairwise against the reference category, as in classical DE tools.
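
To illustrate what "pairwise against the reference category" means for a covariate with more than two levels, here is a minimal sketch using plain pandas. It is not MrVI code: the labels and the `Status` name are made up for illustration, and the only assumption is the usual treatment coding, where the first category is the reference and every other level gets its own coefficient.

```python
import pandas as pd

# Hypothetical sample-level covariate; the first category acts as the reference.
status = pd.Series(
    pd.Categorical(
        ["Healthy", "Covid", "Flu", "Healthy", "Covid", "Flu"],
        categories=["Healthy", "Covid", "Flu"],
    ),
    name="Status",
)
reference = status.cat.categories[0]

# Treatment coding: drop the reference level and keep one indicator column
# per remaining level. Each column corresponds to one model coefficient.
design = pd.get_dummies(status, drop_first=True).astype(int)

# Each coefficient is read as "this level versus the reference".
comparisons = [f"{level} vs {reference}" for level in design.columns]
print(design)
print(comparisons)
```

With three levels you get two coefficients, so a multi-group covariate still yields a set of pairwise contrasts rather than a single multi-group statistic.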

Nusob888 · July 25, 2024, 1:28pm

Ah thanks, sorry for the code error.

Yes, I agree that abundance is always going to be pairwise, but I was wondering if there is a way to perform this without having to subset the data and regenerate the u embeddings for that subset?

I guess I am thinking of examples where people might generate a multi-disease atlas. It would be a useful feature to have.

On another note, I have been having issues with the reproducibility of MrVI, described here: Reproducibility issue of MRVI · Issue #105 · YosefLab/mrvi · GitHub

Thanks a bunch in advance

Justin_Hong · July 31, 2024, 3:29pm

Hi @Nusob888, thanks for your question. What exactly do you mean by performing DA without having to subset the data and regenerate the u embeddings? Technically, you can get a log density score for every cell and sample, then cache these to compute the final DA comparisons for any given subset of cells. Did you want something like this to be exposed to the user?
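
As a rough sketch of that caching idea, assuming the per-cell, per-sample log densities are already stored in a NumPy array of shape (cells, samples): the function names, labels, and random values below are hypothetical and not part of the scvi-tools API. Averaging sample densities within a group before taking the log ratio is one plausible choice, not necessarily what MrVI does internally.

```python
import numpy as np

def group_log_density(log_probs, sample_labels, group):
    """Log of the mean density over all samples that carry the label `group`."""
    cols = np.flatnonzero(sample_labels == group)
    if cols.size == 0:
        raise ValueError(f"no samples with label {group!r}")
    sub = log_probs[:, cols]
    # Numerically stable log-mean-exp across the selected samples.
    peak = sub.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(sub - peak).mean(axis=1, keepdims=True))).ravel()

def da_log_ratio(log_probs, sample_labels, group_a, group_b, cell_mask=None):
    """Per-cell log density ratio of group_a over group_b from cached scores."""
    if cell_mask is not None:
        log_probs = log_probs[cell_mask]
    return (group_log_density(log_probs, sample_labels, group_a)
            - group_log_density(log_probs, sample_labels, group_b))

# Stand-in for cached scores: 5 cells x 6 samples (random, for illustration only).
rng = np.random.default_rng(0)
cached_log_probs = rng.normal(size=(5, 6))
sample_labels = np.array(["Healthy", "Healthy", "Covid", "Covid", "Flu", "Flu"])

# Any pair of labels can be compared without recomputing the cache.
covid_vs_healthy = da_log_ratio(cached_log_probs, sample_labels, "Covid", "Healthy")
flu_vs_healthy = da_log_ratio(cached_log_probs, sample_labels, "Flu", "Healthy")

# Restrict the comparison to a subset of cells with a boolean mask.
mask = np.array([True, False, True, False, True])
subset_scores = da_log_ratio(cached_log_probs, sample_labels, "Covid", "Healthy", mask)
```

The expensive step (scoring every cell against every sample) happens once; each additional pairwise comparison is then just a cheap reduction over cached columns.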

Also, I am aware of your issue. I moved it to the scvi-tools repo since we have deprecated the mrvi repo entirely. I will be investigating it this week.

Nusob888 · July 31, 2024, 7:24pm

That’s amazing, thanks.

I hadn’t realised this. Yes, I guess something in the API that lets the user select the conditions they want to compare for either DA or DE. Currently, if I fit an MrVI model on, let’s say, 16 datasets across 6 disease labels and 1 control label, and then run the following code from the quick start tutorial, I get an error.

model.sample_info["Status"] = model.sample_info["Status"].cat.reorder_categories(
    ["Healthy", "Covid"]
)

I will try to get the exact error message to you when I can; unfortunately our cluster’s GPU nodes are in demand. But essentially, it cannot reorder the categories down to two labels when there are 7. Since we can only do pairwise comparisons, I couldn’t quite see how to isolate my two labels of interest without having to re-run a less complex model.
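
For what it is worth, the pandas side of this can be reproduced without a GPU. `reorder_categories` requires the new list to contain exactly the existing categories, so passing 2 of 7 raises a `ValueError`, which may be the error in question. The sketch below uses a toy `sample_info` table with made-up labels. It only demonstrates the pandas behaviour; whether MrVI accepts a `sample_info` restricted in this way is an assumption I have not verified.

```python
import pandas as pd

# Toy stand-in for model.sample_info with 7 hypothetical status labels.
labels = ["Healthy", "Covid", "Flu", "Sepsis", "Lupus", "RA", "MS"]
sample_info = pd.DataFrame(
    {"sample": [f"s{i}" for i in range(len(labels))],
     "Status": pd.Categorical(labels)}
)

# Reordering to a strict subset of the categories is rejected by pandas.
try:
    sample_info["Status"].cat.reorder_categories(["Healthy", "Covid"])
    reorder_failed = False
except ValueError as err:
    reorder_failed = True
    print(f"ValueError: {err}")

# Workaround on the pandas side: keep only the two labels of interest,
# drop the now-unused categories, then put the reference first.
pair = ["Healthy", "Covid"]
subset = sample_info[sample_info["Status"].isin(pair)].copy()
subset["Status"] = (
    subset["Status"].cat.remove_unused_categories().cat.reorder_categories(pair)
)
print(subset["Status"].cat.categories)
```

An alternative that keeps all rows is `cat.set_categories(pair)`, which turns every other label into NaN instead of dropping the rows.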

I am probably just being a bit too impatient and the final official release will be much clearer, but it’s giving me too many ideas on how to implement it on a dataset I am working on!
