Computing sc.pp.neighbors without method?

Hello! I’m new here, so I’m not sure I’m doing this correctly, but here we go. I’m building a pipeline to work with AnnData objects (analysis, visualization, etc.).

The input of my pipeline is the output of a VAE encoder, so the dimensionality of my dataset has already been reduced.

Once I’ve created an AnnData object from the encoder’s output, I want to use sc.pp.neighbors without applying any further dimensionality reduction step. To do so, I’m using adata.X as use_rep, but I’m wondering about the parameter method='umap'. Is this method going to transform my raw data?
The documentation says it’s an optional parameter, but I’m not sure what sc.pp.neighbors does when the method parameter is not given.

sc.pp.neighbors(adata,
                n_neighbors=15,
                n_pcs=None,
                use_rep='X',  # the string 'X' selects adata.X as the representation
                knn=True,
                random_state=0,
                method='umap',
                metric='euclidean',
                key_added=None,
                copy=False)

Also, to be able to run metrics such as the silhouette score, I needed an embedding to be stored in adata.obsm. With the same goal of not computing any other dimensionality reduction step, I wrote this: adata.obsm['embeddings'] = adata.X
Does that make sense? This way I wanted to make sure that everything is computed on the input data, with no PCA / UMAP etc. in between.

Thank you!

I’m actually interested in something similar, so that I can provide custom distances and connectivities to AnnData objects. Did you get any insight on this?

The method parameter doesn’t have anything to do with UMAP dimensionality reduction. Rather, it uses a very efficient nearest-neighbor algorithm that comes from the UMAP package. So no worries, the values in adata.X are not transformed. I would not copy the raw expression into obsm, though. This is rather bad practice, as obsm is not meant for sparse data and you are wasting disk space.