March 7, 2023, 9:02am
I have questions about the scanpy foldchange computations.
I came across these two GitHub issues, which report two serious problems with the fold change computation and tl.rank_genes_groups:
opened 09:37PM - 07 Oct 19 UTC
I've noticed that the function `rank_genes_groups` calculates logFC differently than Seurat.
Thus the equation is
while in Seurat is
I was thus wondering if this was intended, since it leads to different logFC values.
opened 05:29PM - 18 Apr 22 UTC
KeyError Traceback (most recent call last)
Input In [18], in <cell line: 1>()
----> 1, "origin", method="wilcoxon")
2, n_genes=25, sharey=False)
File ~/app/miniconda3/envs/bio/lib/python3.9/site-packages/scanpy/tools/, in rank_genes_groups(adata, groupby, use_raw, groups, reference, n_genes, rankby_abs, pts, key_added, copy, method, corr_method, tie_correct, layer, **kwds)
580 adata.uns[key_added] = {}
581 adata.uns[key_added]['params'] = dict(
582 groupby=groupby,
583 reference=reference,
587 corr_method=corr_method,
588 )
--> 590 test_obj = _RankGenes(adata, groups_order, groupby, reference, use_raw, layer, pts)
592 if check_nonnegative_integers(test_obj.X) and method != 'logreg':
593 logg.warning(
594 "It seems you use rank_genes_groups on the raw count data. "
595 "Please logarithmize your data before calling rank_genes_groups."
596 )
File ~/app/miniconda3/envs/bio/lib/python3.9/site-packages/scanpy/tools/, in _RankGenes.__init__(self, adata, groups, groupby, reference, use_raw, layer, comp_pts)
82 def __init__(
83 self,
84 adata,
90 comp_pts=False,
91 ):
---> 93 if 'log1p' in adata.uns_keys() and adata.uns['log1p']['base'] is not None:
94 self.expm1_func = lambda x: np.expm1(x * np.log(adata.uns['log1p']['base']))
95 else:
KeyError: 'base'
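A workaround that is often suggested for this KeyError is to set the missing entry explicitly before calling rank_genes_groups. The following is only a minimal sketch: `ensure_log1p_base` is a hypothetical helper, not part of scanpy, and a plain dict stands in for `adata.uns`. It assumes the `log1p` entry exists but has lost its `base` key (as far as I understand, this can happen after writing an AnnData object to disk and reading it back).

```python
def ensure_log1p_base(uns):
    """Add a missing 'base' key to uns['log1p'] in place (hypothetical helper)."""
    log1p_params = uns.get("log1p")
    if isinstance(log1p_params, dict) and "base" not in log1p_params:
        # None makes the check shown at line 93 of the traceback fall through
        # to the else branch instead of raising KeyError.
        log1p_params["base"] = None
    return uns


# Plain dict standing in for adata.uns (assumption for illustration only)
uns = {"log1p": {}}
ensure_log1p_base(uns)
print(uns)
```

With a real AnnData object the call would be `ensure_log1p_base(adata.uns)`, which amounts to `adata.uns["log1p"]["base"] = None`.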
I have also found that the fold changes differ drastically from those calculated by Seurat or MAST.
Will these issues be addressed in the future?
March 20, 2023, 9:12am
As far as I understand, the main difference between scanpy and Seurat or MAST lies in how the mean is computed.
Seurat and MAST use the arithmetic mean.
Scanpy uses the geometric mean, since it averages the log-transformed values before back-transforming them.
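To make the difference concrete, here is a small NumPy sketch of the two ways of averaging. It is not the actual code of either package: the function names are made up for illustration, the `1e-9` offset reflects my reading of the scanpy source, and the pseudocount of 1 with log base 2 is assumed from Seurat's defaults. The input is assumed to be log1p-transformed with the natural log.

```python
import numpy as np


def logfc_scanpy_style(group, rest, eps=1e-9):
    """Average in log1p space first, then back-transform (geometric-mean-like)."""
    mean_group = np.expm1(np.mean(group))
    mean_rest = np.expm1(np.mean(rest))
    return np.log2((mean_group + eps) / (mean_rest + eps))


def logfc_seurat_style(group, rest, pseudocount=1.0):
    """Back-transform first, then take the arithmetic mean."""
    mean_group = np.mean(np.expm1(group))
    mean_rest = np.mean(np.expm1(rest))
    return np.log2(mean_group + pseudocount) - np.log2(mean_rest + pseudocount)


# Toy log1p-transformed expression of a single gene in two groups of cells.
# The first group has one highly expressing cell and three zeros.
group = np.log1p(np.array([0.0, 0.0, 0.0, 20.0]))
rest = np.log1p(np.array([1.0, 1.0, 1.0, 1.0]))

print("scanpy-style logFC:", logfc_scanpy_style(group, rest))
print("Seurat-style logFC:", logfc_seurat_style(group, rest))
```

On this toy gene the scanpy-style value is much smaller than the Seurat-style one, because averaging in log space strongly damps the single highly expressing cell.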
The scanpy documentation already points out in this tutorial that tools such as MAST are more reliable.
It would be great to point out somewhere in the documentation that it is not possible to directly compare the logFCs computed with scanpy and those from most R packages.
March 22, 2023, 3:20pm
Please also take a look at the best practice chapter on differential expression analysis.