Can I use the package scib-metrics on methods that don't output an embedding?

Hello,

I am currently testing several tools to integrate my data, and I would like to compare them using the metrics computed by scIB.

I want to compare outputs of Seurat RPCA, Scanorama and scVI.
As far as I understand, Scanorama and scVI output a low-dimensional embedding of the data, whereas Seurat RPCA doesn’t.

I want to use the scib-metrics package to benchmark the different integrations, but the documentation seems to suggest that this package currently works only with embedding-based methods:

The tutorial says:

Here we run a few embedding-based methods. By focusing on embedding-based methods, we can substantially reduce the runtime of the benchmarking metrics.

In principle, graph-based integration methods can also be benchmarked on some of the metrics that have graph inputs. Future work can explore using graph convolutional networks to embed the graph and then using the embedding-based metrics.

And in the documentation for scib_metrics.benchmark.Benchmarker:
**embedding_obsm_keys** – List of obsm keys that contain the embeddings to be benchmarked.

This would mean I can’t compare the output of RPCA with the others.
So my main question is: can I use the scib-metrics package on methods that don’t output an embedding?

The scIB benchmark was originally used to compare more integration tools than just the embedding-based ones. I tried installing the original scib package, but it conflicts with my version of pandas. I’ll try to solve that (I guess I’ll have to set up proper conda environments), but scib-metrics seemed more straightforward for what I am trying to do.

Any help would be appreciated!

From my understanding, it seems like Seurat RPCA is able to output a lower-dimensional representation of the data, right? If this is the case, you should be able to feed this into scib-metrics just like any embedding method. For reference, by default we compute PCA on the raw counts as a “benchmark” embedding in scib-metrics.

Hello,

Thank you for your answer!

So, I mistakenly thought that RPCA only output corrected gene expression, but I was wrong. Seurat RPCA does indeed compute a new cell embedding.

For anyone interested, in Seurat V5 (5.0.1):
After running:

#R
se <- IntegrateLayers(
  object = se, method = RPCAIntegration,
  orig.reduction = "pca", new.reduction = "integrated.rpca",
  verbose = FALSE
)

The integrated embedding is stored in the DimReduc “integrated.rpca” (or whatever name you gave it), and the cell embeddings can be extracted easily with:

#R
rpca_embed <- Embeddings(se, reduction = "integrated.rpca")
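
To get this embedding into scib-metrics, it has to end up in the `obsm` slot of an AnnData object on the Python side. Below is a minimal sketch of one way to do that, assuming the matrix was written to a CSV file with cell names as the first column (for example with `write.csv(rpca_embed, "rpca_embedding.csv")` in R) and that the AnnData object uses the same cell names. The file name, the `X_rpca`, `X_scanorama` and `X_scVI` keys, the `batch` and `cell_type` column names, and both helper functions are assumptions for illustration, not part of Seurat or scib-metrics.

```python
import numpy as np
import pandas as pd


def align_embedding(embedding: pd.DataFrame, obs_names) -> np.ndarray:
    """Reorder the rows of an exported embedding to match the AnnData cell order.

    `embedding` is indexed by cell name (one row per cell, one column per dimension).
    """
    missing = pd.Index(obs_names).difference(embedding.index)
    if len(missing) > 0:
        raise ValueError(
            f"{len(missing)} cells have no embedding, e.g. {missing[0]}"
        )
    # .loc with a list of labels returns the rows in exactly that order
    return embedding.loc[list(obs_names)].to_numpy(dtype=np.float32)


def benchmark_embeddings(adata, embedding_keys, batch_key="batch", label_key="cell_type"):
    """Run scib-metrics on embeddings that are already stored in adata.obsm.

    batch_key and label_key are hypothetical column names in adata.obs.
    """
    # Imported lazily so the alignment helper works without scib-metrics installed
    from scib_metrics.benchmark import Benchmarker

    bm = Benchmarker(
        adata,
        batch_key=batch_key,
        label_key=label_key,
        embedding_obsm_keys=list(embedding_keys),
    )
    bm.benchmark()
    return bm.get_results(min_max_scale=False)


# Hypothetical usage, assuming `adata` holds the same cells as the Seurat object:
# rpca = pd.read_csv("rpca_embedding.csv", index_col=0)
# adata.obsm["X_rpca"] = align_embedding(rpca, adata.obs_names)
# results = benchmark_embeddings(adata, ["X_rpca", "X_scanorama", "X_scVI"])
```

The alignment step matters because scib-metrics has no way to know that the rows of an external matrix are in a different order than `adata.obs_names`. If Seurat renamed or filtered cells, the check above fails loudly rather than silently scoring a mismatched embedding.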