Hi everyone,
Just a simple question.
I have run `scanpy.tl.rank_genes_groups` to find DEGs between two clusters.
- I have set `method="wilcoxon"`.
- I am using shifted log-normalized values.
- I have set `tie_correct=False` (I have also tried it with `True`).
- I have set `rankby_abs=True`.
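
For reference, the call looks roughly like the sketch below. The settings are the ones listed above; everything else is a placeholder and not my real setup: the `adata` object, the `groupby` key `"leiden"`, the cluster labels `"0"` and `"1"`, and the helper name `rank_two_clusters`. `pts=True` is an assumption, inferred only from the `pct_nz_group` column in the output.

```python
# Settings described in the list above. pts=True is an assumption inferred
# from the pct_nz_group column that appears in the result table.
RANK_KWARGS = {
    "method": "wilcoxon",
    "tie_correct": False,  # also tried with True
    "rankby_abs": True,
    "pts": True,
}


def rank_two_clusters(adata, groupby="leiden", group="0", reference="1"):
    """Compare one cluster against another and return the result table.

    groupby, group and reference are placeholder names for illustration.
    """
    # Imported inside the function so the settings can be inspected
    # even where scanpy is not installed.
    import scanpy as sc

    sc.tl.rank_genes_groups(
        adata,
        groupby=groupby,
        groups=[group],
        reference=reference,
        **RANK_KWARGS,
    )
    # One row per gene: names, scores, logfoldchanges, pvals, pvals_adj, ...
    return sc.get.rank_genes_groups_df(adata, group=group)
```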
When I checked the resulting dataframe using `scanpy.get.rank_genes_groups_df`, I got many genes with duplicated scores.
I want to run GSEA with `gseapy` using the scores, but of course I get a warning about genes with the same score. Do you know how to deal with genes that have identical scores? Also, is it a bit weird to get the same score? Could it be a decimal-precision thing?
Example dataframe for some duplicated scores:
names | scores | logfoldchanges | pvals | pvals_adj | pct_nz_group |
---|---|---|---|---|---|
A530040E14Rik | 4.767420 | 21.592094 | 0.000002 | 0.000006 | 0.005073 |
1700048O20Rik | 4.767420 | 21.506248 | 0.000002 | 0.000006 | 0.005073 |
Jakmip1 | 4.697415 | -0.022416 | 0.000003 | 0.000008 | 0.181096 |
Nrcam | 4.697415 | 0.745594 | 0.000003 | 0.000008 | 0.018262 |
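
To make the duplication concrete, here is a minimal sketch in plain Python that groups genes by score, using only the four rows shown above. The helper name `group_ties` and the rounding to 6 decimals are my own choices for illustration. The table only displays 6 decimals, so this cannot tell whether the full-precision scores are exactly equal.

```python
from collections import defaultdict

# (gene name, score) pairs copied from the example table above.
rows = [
    ("A530040E14Rik", 4.767420),
    ("1700048O20Rik", 4.767420),
    ("Jakmip1", 4.697415),
    ("Nrcam", 4.697415),
]


def group_ties(rows, ndigits=6):
    """Return {score: [gene names]} for every score shared by 2+ genes."""
    by_score = defaultdict(list)
    for name, score in rows:
        # Round to the displayed precision before comparing.
        by_score[round(score, ndigits)].append(name)
    return {s: names for s, names in by_score.items() if len(names) > 1}


ties = group_ties(rows)
for score, names in ties.items():
    print(score, names)
```

On the full dataframe, the same check could be repeated without rounding to see whether the scores still collide at full precision.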
I appreciate any help you can provide.
Best