Hello everybody,
I am new to single-cell analysis and differential expression, so I apologize if this question sounds basic. I used scvi-tools to run a differential expression analysis within a single cluster, comparing cells from two different conditions. After setting the seed with scvi.settings.seed = 42, my code looks like this:
idx1 = (adata.obs['leiden'] == 0) & (adata.obs['Response'] == 'No')
idx2 = (adata.obs['leiden'] == 0) & (adata.obs['Response'] == 'Yes')
scvi_de_noresp_vs_resp = model.differential_expression(idx1=idx1, idx2=idx2, batch_correction=True)
Then, I focused only on genes for which is_de_fdr_0.05 is True (referred to as “significant genes” from now on). Now, the thing is: if I redo the same analysis on exactly the same data and with the same seed, just swapping the two index masks:
idx1 = (adata.obs['leiden'] == 0) & (adata.obs['Response'] == 'Yes')
idx2 = (adata.obs['leiden'] == 0) & (adata.obs['Response'] == 'No')
scvi_de_resp_vs_noresp = model.differential_expression(idx1=idx1, idx2=idx2, batch_correction=True)
I obtain a different list of significant genes. This is a bit surprising to me because, to my understanding, the log fold change should be symmetrical: its absolute value should be the same in both runs, just with the opposite sign. So I would expect to find the same list of significant genes, but this is not the case (although most of the significant genes overlap between the two analyses).
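For reference, this is roughly how the two results can be compared. It is only a minimal sketch: the two small tables are made-up stand-ins for scvi_de_noresp_vs_resp and scvi_de_resp_vs_noresp, the helper name compare_de_runs is hypothetical, and the log fold change column is assumed to be called lfc_mean (only is_de_fdr_0.05 comes from the output described above).

```python
import pandas as pd


def compare_de_runs(de_a, de_b, flag="is_de_fdr_0.05", lfc="lfc_mean"):
    """Compare two DE result tables that are indexed by gene name.

    `flag` is the significance column used in this post; `lfc` is an
    ASSUMED name for the log fold change column.
    """
    sig_a = set(de_a.index[de_a[flag].astype(bool)])
    sig_b = set(de_b.index[de_b[flag].astype(bool)])

    # If the two runs were perfectly symmetric, lfc_a + lfc_b would be 0
    # for every gene, so the absolute sum measures the asymmetry.
    shared_genes = de_a.index.intersection(de_b.index)
    asymmetry = (de_a.loc[shared_genes, lfc] + de_b.loc[shared_genes, lfc]).abs()

    return {
        "shared": sorted(sig_a & sig_b),
        "only_a": sorted(sig_a - sig_b),
        "only_b": sorted(sig_b - sig_a),
        "asymmetry": asymmetry,
    }


# Toy stand-ins for the two result tables (values are invented for illustration).
toy_noresp_vs_resp = pd.DataFrame(
    {"lfc_mean": [1.2, -0.8, 0.3], "is_de_fdr_0.05": [True, True, False]},
    index=["geneA", "geneB", "geneC"],
)
toy_resp_vs_noresp = pd.DataFrame(
    {"lfc_mean": [-1.1, 0.7, -0.4], "is_de_fdr_0.05": [True, False, True]},
    index=["geneA", "geneB", "geneC"],
)

result = compare_de_runs(toy_noresp_vs_resp, toy_resp_vs_noresp)
print("shared:", result["shared"])
print("only in first run:", result["only_a"])
print("only in second run:", result["only_b"])
print(result["asymmetry"])
```

With the real tables, the same call would be compare_de_runs(scvi_de_noresp_vs_resp, scvi_de_resp_vs_noresp), provided both are indexed by gene name.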
Is this behaviour expected? And if yes, why?