Batch key and categorical variables for get_normalized_expression()

Hi.

Many thanks for the nice tool.
I have combined multiple public scRNA-seq datasets and used scvi-tools for integration:

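import scvi

# Register the raw counts layer, the batch key, and the extra categorical covariates
scvi.model.SCVI.setup_anndata(adata, layer="counts", batch_key="Project",
                              categorical_covariate_keys=["Patient", "Sex"])
# Negative-binomial likelihood, 2 layers, 30 latent dimensions
vae = scvi.model.SCVI(adata, n_layers=2, n_latent=30, gene_likelihood="nb")
vae.train(accelerator="gpu")
vae.save("../data/scvi_models/scvi_integration_model_project_sample_sex")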

Then I clustered the data based on the scVI latent space:

import scanpy as sc

# Build the neighbor graph on the scVI latent space, then embed and cluster
adata.obsm["X_scVI"] = vae.get_latent_representation()
sc.pp.neighbors(adata, use_rep="X_scVI")
sc.tl.umap(adata)
sc.tl.leiden(adata, resolution=0.3)

Now my aim is to proceed with downstream analyses such as defining markers, finding DEGs (using different techniques such as MAST in R), infercnvpy, cell-cell communication analysis (liana+), pertpy (MiloR, Augur, scCODA), etc.
Which normalized counts should I use for that? And how should I handle the categorical covariate keys if I use get_normalized_expression()?

Best,

Hi. You would not want to use normalized expression for any of the tools you mention.

Thank you for your answer. However, MAST, CellChat, and NicheNet all need normalized counts …
Anyway, my question is this: to get batch-corrected counts, I should set the transform_batch parameter in vae.get_normalized_expression. Is it OK to use the largest batch in my data? Otherwise, I am not sure how to choose the “preferable” batch. Thanks in advance!
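
For reference, here is a minimal sketch of the two options being discussed: conditioning on the single largest batch, or passing a list of all batches, which scvi-tools averages over. The helper names `choose_transform_batch` and `batch_corrected_expression` are hypothetical, and the `library_size=1e4` default is only an assumption for illustration; `transform_batch` and `library_size` are existing arguments of `get_normalized_expression`. This does not settle whether the output is appropriate for any particular downstream tool.

```python
from collections import Counter


def choose_transform_batch(batch_labels, strategy="largest"):
    """Pick a value to pass as `transform_batch` (hypothetical helper).

    batch_labels: iterable of per-cell batch labels, e.g. adata.obs["Project"].
    strategy: "largest" returns the most frequent batch label;
              "all" returns a sorted list of every batch label.
    """
    counts = Counter(batch_labels)
    if not counts:
        raise ValueError("batch_labels is empty")
    if strategy == "largest":
        # Ties are broken by first occurrence in batch_labels
        return counts.most_common(1)[0][0]
    if strategy == "all":
        return sorted(counts)
    raise ValueError(f"unknown strategy: {strategy!r}")


def batch_corrected_expression(model, batch_labels, strategy="largest",
                               library_size=1e4):
    """Call model.get_normalized_expression with the chosen batch(es).

    `model` is assumed to be a trained scvi.model.SCVI instance;
    library_size=1e4 is an illustrative choice, not a recommendation.
    """
    transform_batch = choose_transform_batch(batch_labels, strategy)
    return model.get_normalized_expression(
        transform_batch=transform_batch,
        library_size=library_size,
    )
```

With the model from the first post, this would be called as `batch_corrected_expression(vae, adata.obs["Project"])` for the largest batch, or with `strategy="all"` to average over every batch instead of privileging one.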