Unexplained change in UMAP

A few months ago I ran an analysis. Today I reran the same bit of code:

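import scanpy as sc

# sco: AnnData object with the integrated representation in .obsm['integrated_scvi']
sc.pp.neighbors(sco, use_rep='integrated_scvi')
sc.tl.leiden(sco, resolution=0.70, key_added='cluster_int')
sc.tl.umap(sco)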

But the UMAP has changed: not the number of clusters or the number of cells per cluster, just its overall look, and I cannot find an explanation.

Thanks for your help!


Hi,

This is a common problem that is not really under the control of the scanpy developers. For example, the numpy version in your environment may have changed, which can lead to such differences.

Also see this GitHub issue: "leiden and umap not reproducible on different CPUs", Issue #2014, scverse/scanpy.
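If a changed package version is the suspect, it can help to record the environment next to the results so that two runs can be compared later. Here is a minimal sketch using only the standard library; the package list and the function name snapshot_versions are assumptions for illustration, not something scanpy provides:

```python
from importlib import metadata

# Packages whose versions plausibly affect neighbors/Leiden/UMAP output.
# This list is an assumption for illustration; adjust it to your environment.
PACKAGES = [
    "scanpy", "anndata", "numpy", "scipy",
    "umap-learn", "pynndescent", "numba", "leidenalg", "igraph",
]


def snapshot_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed are reported as 'not installed'
    instead of raising an error.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Print in a pip-like format that is easy to diff between two runs.
    for name, version in snapshot_versions().items():
        print(f"{name}=={version}")
```

Saving this output with each analysis makes it possible to diff the environment of the old run against the new one when the embedding changes.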

UMAP is an algorithm with a randomized initialization step. Thus, you’d expect some variation if you run it multiple times on the same data without a fixed seed. One way to keep it the same is to fix the seed; in scanpy this is the random_state argument, e.g. sc.tl.umap(sco, random_state=0). If you run the same data with the same seed value in the same software environment, you should get the same resulting UMAP.
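To illustrate why a fixed seed makes a randomized initialization repeatable, here is a toy sketch using only Python's random module. It is not UMAP, and random_layout is a hypothetical helper; it only mimics the "random starting coordinates" step:

```python
import random


def random_layout(n_points, seed=None):
    """Toy stand-in for a randomized 2-D initialization (this is NOT UMAP)."""
    # A private generator: seed=None seeds from system entropy or the clock,
    # an integer seed makes the sequence of draws repeatable.
    rng = random.Random(seed)
    return [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(n_points)]


a = random_layout(5, seed=0)
b = random_layout(5, seed=0)
c = random_layout(5, seed=1)

print(a == b)  # same seed: identical starting coordinates
print(a == c)  # different seed: different starting coordinates
```

The same principle is behind the random_state argument in scanpy. Note that a fixed seed only pins down the random draws; if the underlying libraries or the hardware change, the result can still differ.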

By the way, the two UMAPs you showed are pretty consistent with each other. You are probably worried about whether cluster 5 sits on the right or the left of this big group of clusters, but that’s a minor variation to be expected between UMAP runs.