n_samples (number of Monte Carlo samples per cell) setting in totalVI

Hi :slight_smile:
I have a question about the n_samples setting in totalVI (the number of Monte Carlo samples drawn for each cell).
In the tutorial (CITE-seq analysis with totalVI — scvi-tools), n_samples is set to 25 in the get_normalized_expression function:
rna_denoised, protein_denoised = model.get_normalized_expression(
    n_samples=25, return_mean=True, transform_batch=["PBMC10k", "PBMC5k"]
)

However, n_samples is not a parameter of the differential_expression function:
de_df = model.differential_expression(
    groupby="rna_subset:leiden_totalVI", delta=0.5, batch_correction=True
)
As I understand it, the differential expression calculation is based on the denoised data, so I checked the source code on GitHub and found that n_samples is set to 1 there:
def _expression_for_de():
    rna, protein = self.get_normalized_expression(
        adata=adata,
        indices=indices,
        n_samples_overall=n_samples_overall,
        transform_batch=transform_batch,
        return_numpy=True,
        n_samples=1,
        batch_size=batch_size,
        scale_protein=scale_protein,
        sample_protein_mixing=sample_protein_mixing,
        include_protein_background=include_protein_background,
    )
The source code is at (scvi-tools/scvi/model/_totalvi.py at 95f2e1d2fa921c6433a04e257d844280ba0c25e5 · scverse/scvi-tools · GitHub).

So my question is: what n_samples setting do you recommend? If n_samples=25 works well, can this parameter be passed to the function below?
TOTALVI.differential_expression(adata=None, groupby=None, group1=None, group2=None, idx1=None, idx2=None, mode='change', delta=0.25, batch_size=None, all_stats=True, batch_correction=False, batchid1=None, batchid2=None, fdr_target=0.05, silent=False, protein_prior_count=0.1, scale_protein=False, sample_protein_mixing=False, include_protein_background=False, **kwargs)

Many thanks in advance!

Hi. For differential expression, we draw the Monte Carlo samples from all cells within group1 and group2. This usually produces enough values for the group comparison to give a good estimate, whereas for normalized expression you want one value per cell, so you need more samples per cell. Were you experiencing issues?
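
The difference between the two cases can be illustrated with a toy simulation. This is a minimal sketch in plain Python, not scvi-tools code. It assumes each Monte Carlo draw for a cell is Gaussian around that cell's true value, and the names N_CELLS, NOISE_SD, draw and per_cell_estimates are hypothetical, chosen only for this example. It compares the error of a per-cell estimate with 1 versus 25 draws against the error of a group-level mean built from a single draw per cell.

```python
import random
import statistics

random.seed(0)

N_CELLS = 500    # hypothetical number of cells in one group
NOISE_SD = 1.0   # hypothetical spread of a single draw around the cell's true value

# Hypothetical "true" denoised expression of one gene in each cell.
true_means = [random.gauss(5.0, 0.5) for _ in range(N_CELLS)]


def draw(mean):
    """One Monte Carlo draw for a cell (toy stand-in for a posterior sample)."""
    return random.gauss(mean, NOISE_SD)


def per_cell_estimates(n_samples):
    """Average n_samples draws per cell, giving one value per cell."""
    return [
        statistics.fmean([draw(m) for _ in range(n_samples)])
        for m in true_means
    ]


def rmse(estimates, truth):
    """Root mean squared error between estimates and true values."""
    squared = [(e - t) ** 2 for e, t in zip(estimates, truth)]
    return statistics.fmean(squared) ** 0.5


est_1 = per_cell_estimates(1)
est_25 = per_cell_estimates(25)

# Per-cell error: how far each cell's estimate is from that cell's true value.
per_cell_rmse_1 = rmse(est_1, true_means)
per_cell_rmse_25 = rmse(est_25, true_means)

# Group-level error: pool the single draws from all cells in the group.
group_error_1 = abs(statistics.fmean(est_1) - statistics.fmean(true_means))

print(f"per-cell RMSE, 1 draw per cell:    {per_cell_rmse_1:.3f}")
print(f"per-cell RMSE, 25 draws per cell:  {per_cell_rmse_25:.3f}")
print(f"group-mean error, 1 draw per cell: {group_error_1:.3f}")
```

Under these assumptions, a single draw gives a noisy value for any individual cell, and averaging 25 draws shrinks that noise by roughly a factor of five. The group mean is already well estimated from one draw per cell, because the noise averages out across the cells in the group.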