Random UMAP and clustering result

Yan · October 21, 2022, 10:02am

Hello everyone, I have a problem: I get different UMAP and Louvain results on two machines with the same script, the same data, and the same software.

Also, very occasionally, repeatedly running PCA, neighbors, and Louvain on the same adata gives different results.

galicae · October 24, 2022, 10:37am

Hi Yan!

AFAIK all of these methods, except for neighbors, have a random element. PCA in particular may not return identical components if run twice. This means that the downstream calculations will be subtly different, although the coarse result should be the same (unless you have different preprocessing too?). Setting a random state could be a way to solve this, by forcing Python to make the same “random” choices (see here for a more in-depth explanation and here for more context).

Yan · October 24, 2022, 12:40pm

Thanks galicae, I have random_state set to 0 (the default) but still get different results on the two workstations.

I think I need to check whether their Python environments are identical.

I use embedded Python packages, and I wonder whether the same module versions are imported at runtime, because one of the machines has another Python environment installed.
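One way to compare the two workstations is to print the interpreter and package versions on each and diff the output. The sketch below uses only the standard library; the helper name `describe_environment` and the package list are assumptions chosen for illustration, so adjust the list to whatever the script actually imports.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Collect interpreter details and installed versions for the given packages."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,  # shows which interpreter is running
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


# Assumed list of distributions that are likely to matter here; edit as needed.
PACKAGES = [
    "scanpy", "anndata", "numpy", "scipy", "scikit-learn",
    "umap-learn", "pynndescent", "numba", "louvain",
]

for key, value in describe_environment(PACKAGES).items():
    print(f"{key}: {value}")
```

Run it with the same interpreter that runs the analysis on each machine. Since a second Python install may shadow the embedded one, printing a module's `__file__` attribute after importing it (for example `scanpy.__file__`) shows which copy was actually loaded.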

galicae · October 24, 2022, 1:18pm

Yeah, that could be another reason. Overall, if it’s so important that you have exactly the same object, it might be worth the effort of just copying a master version over.

Valentine_Svensson · October 25, 2022, 5:29pm

I think it is also possible multiple packages will require you to set the random seed for each of them independently. For example, the umap package has its own random_state parameter: UMAP Reproducibility — umap 0.5 documentation
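As a rough analogy for why one seed may not cover everything, here is a standard-library sketch (not scanpy or umap code). Seeding one generator does nothing for a second generator that keeps its own state, much as a package with its own `random_state` parameter needs its own seed. The helper `draw` is made up for this example.

```python
import random


def draw(rng, n=5):
    """Return n pseudo-random integers from the given generator."""
    return [rng.randrange(1000) for _ in range(n)]


# Seeding the module-level generator makes its own output repeatable.
random.seed(0)
global_first = draw(random)
random.seed(0)
global_second = draw(random)

# A separately constructed generator has independent state, so
# random.seed() above has no effect on it (it is seeded from the OS).
random.seed(0)
own_a = draw(random.Random())
random.seed(0)
own_b = draw(random.Random())

# Passing the seed to that generator explicitly restores repeatability.
fixed_a = draw(random.Random(0))
fixed_b = draw(random.Random(0))

print("global generator repeatable:", global_first == global_second)
print("unseeded separate generator repeatable:", own_a == own_b)
print("explicitly seeded separate generator repeatable:", fixed_a == fixed_b)
```

The practical takeaway is to pass a seed to every step that accepts one, rather than relying on a single global seed.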

/Valentine

redvoidling · April 7, 2023, 11:41am

It has something to do with the PCA function that scanpy uses under the hood, which comes from sklearn, I think.
If you set the solver to ‘full’, it is reproducible.
I don’t think the random state is the issue, because it is set to ‘0’ for almost all scanpy functions.
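To see the solver effect in isolation, here is a small sketch that calls scikit-learn's `PCA` directly on synthetic data (an assumption; it does not go through scanpy). The `full` solver ignores the seed, while the `randomized` solver depends on it. Whether your scanpy version forwards the same solver value is worth checking in its documentation.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for an expression matrix (200 cells x 50 features).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))


def embed(solver, seed):
    """Project X onto 5 principal components with the given solver and seed."""
    pca = PCA(n_components=5, svd_solver=solver, random_state=seed)
    return pca.fit_transform(X)


full_a = embed("full", 0)
full_b = embed("full", 1)        # different seed, exact solver
rand_a = embed("randomized", 0)
rand_b = embed("randomized", 1)  # different seed, randomized solver
rand_a_again = embed("randomized", 0)

print("full, different seeds, max abs diff:", np.abs(full_a - full_b).max())
print("randomized, different seeds, max abs diff:", np.abs(rand_a - rand_b).max())
print("randomized, same seed, max abs diff:", np.abs(rand_a - rand_a_again).max())
```

This only demonstrates run-to-run behaviour on one machine. Differences between machines can still come from different library versions or linear algebra backends, even with an exact solver.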
