Inconsistency of scvi/SOLO in predicting doublets?

I’m using Colab to run my analysis. Because the session restarts every time I log out, or is terminated after a long pause, I have come to realize how much the inconsistency of the doublet predictions affects my analysis.

This is most evident when I perform clustering: each restart generates a different UMAP.

Should I set a random seed?
Or is this normal? (I’m a newbie in this field.)

I would highly appreciate any advice I could get in this forum.

You should set scvi.settings.seed = 0 at the beginning (see any of our tutorials), but it’s not enough on its own.

You will only get exactly the same UMAPs when comparing two runs that were each started from a freshly restarted session (in an interactive environment).
In other words, you might get different UMAPs when you rerun the exact same code within the same session, even after setting that seed.

Only setting the seed and restarting the session each time will guarantee reproducible results (assuming, of course, that you run the same code).
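
To illustrate why the same code can give different output within one session, here is a minimal sketch using only Python's standard library. The function noisy_embedding is a hypothetical stand-in for any stochastic step (model training, UMAP); it is not scvi code, and scvi seeds more than this one generator, but the principle is the same: a seed fixes the starting state of the random number generator, and every draw advances that state.

```python
import random


def noisy_embedding(n=3):
    # Hypothetical stand-in for a stochastic step such as training or UMAP:
    # it simply draws n numbers from the global random number generator.
    return [random.random() for _ in range(n)]


# Case 1: seed once, then run the "analysis" twice in the same session.
random.seed(0)
first = noisy_embedding()
second = noisy_embedding()  # the generator state has advanced since `first`

# Case 2: reset the seed before each run, which mimics a fresh session.
random.seed(0)
rerun_a = noisy_embedding()
random.seed(0)
rerun_b = noisy_embedding()

print(first == second)     # False
print(rerun_a == rerun_b)  # True
```

Restarting the session and setting the seed at the top corresponds to Case 2. Rerunning cells without a restart corresponds to Case 1.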

Does this mean I must return the session to a clean slate every time I run scvi?

Would running multiple samples in the same session affect the consistency?

When you need to reproduce your UMAPs and results exactly, yes. This rule of thumb holds for any statistical code, not just SCVI.

If not, you should still get similar results, just slightly different because of the randomness in the process.

I haven’t seen exactly what you did, but having multiple samples should not be the reason.

Thank you for your thorough explanation.

Unfortunately, it’s slightly worse on Colab: there is no guarantee that you will get exactly the same plots. Some additional variation comes from the GPU in use (CUDA results are not guaranteed to match across two different devices), and Colab sometimes switches the GPU or updates its cuDNN library. I don’t find differences in the downstream analysis of results, but I sometimes see them in details such as the number of Leiden clusters. The Reproducibility page of the PyTorch 2.5 documentation gives a short overview.
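
For reference, the settings described on that PyTorch page can be collected roughly as follows. This is a sketch under assumptions, not something scvi-tools ships: seed_everything is a hypothetical helper name, the NumPy and PyTorch parts are skipped if those packages are not installed, and it should run at the very top of a fresh session. Even with all of this, results may still differ between GPU models or cuDNN versions.

```python
import os
import random


def seed_everything(seed=0):
    """Hypothetical helper: seed the common RNGs and request deterministic kernels.

    Returns True if PyTorch was found and configured, False otherwise.
    """
    # Required by some cuBLAS operations when determinism is requested;
    # it has to be set before CUDA is initialized.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
    except ImportError:
        return False

    torch.manual_seed(seed)                    # seeds CPU and CUDA generators
    torch.backends.cudnn.deterministic = True  # choose deterministic cuDNN kernels
    torch.backends.cudnn.benchmark = False     # disable autotuned kernel selection
    # Warn instead of raising when an operation has no deterministic version.
    torch.use_deterministic_algorithms(True, warn_only=True)
    return True


seed_everything(0)
```

Deterministic kernels can be slower, so this is mainly useful when exact reproduction matters more than speed.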