Hi,
Many thanks for the nice tool.
I have combined multiple public scRNA-seq datasets and used scvi-tools for integration:
import scvi

# Register raw counts, with Project as the batch and Patient/Sex as extra categorical covariates
scvi.model.SCVI.setup_anndata(adata, layer="counts", batch_key="Project",
                              categorical_covariate_keys=["Patient", "Sex"])
vae = scvi.model.SCVI(adata, n_layers=2, n_latent=30, gene_likelihood="nb")
vae.train(accelerator="gpu")
vae.save("../data/scvi_models/scvi_integration_model_project_sample_sex")
Then I clustered the data on the scVI latent space:
import scanpy as sc

# Build the neighbor graph on the scVI embedding, then embed and cluster
adata.obsm["X_scVI"] = vae.get_latent_representation()
sc.pp.neighbors(adata, use_rep="X_scVI")
sc.tl.umap(adata)
sc.tl.leiden(adata, resolution=0.3)
Now I would like to move on to downstream analyses such as marker detection, differential expression (with different methods, e.g. MAST in R), infercnvpy, cell-cell communication analysis (LIANA+), and pertpy (MiloR, Augur, scCODA).
Which normalized counts should I use for these analyses? And how should I handle the categorical covariate keys if I use get_normalized_expression()?
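For reference, this is roughly how I would call it. It is only a sketch: the helper name store_scvi_normalized and the layer name "scvi_normalized" are made up for illustration, and library_size=1e4 and transform_batch=None are my assumptions, not settings I know to be correct. As far as I can tell, transform_batch only refers to the batch_key (Project), and there is no separate argument for the categorical covariates, which is the part I am unsure about.

```python
def store_scvi_normalized(model, adata, layer_name="scvi_normalized",
                          library_size=1e4, transform_batch=None):
    """Store scVI-denoised expression in adata.layers[layer_name].

    Hypothetical helper. `model` is expected to be a trained scvi.model.SCVI
    and `adata` the AnnData it was set up with.
    """
    expr = model.get_normalized_expression(
        adata,
        library_size=library_size,        # assumption: scale each cell to 10,000 counts
        transform_batch=transform_batch,  # None keeps each cell's own batch (Project)
        return_numpy=True,                # dense array; can be large for many cells
    )
    adata.layers[layer_name] = expr
    return expr

# Intended usage with the objects from above:
# store_scvi_normalized(vae, adata)
```

The raw counts stay untouched in adata.layers["counts"], so count-based tools could still read them from there.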
Best,