Hi everyone,
Just a simple question.
I have run scanpy.tl.rank_genes_groups to find DEGs between two clusters.
- I have set method="wilcoxon"
- Using shifted log normalized values.
- I have set tie_correct = False (I have also tried it with True)
- I have set rankby_abs = True
When I checked the resulting dataframe using scanpy.get.rank_genes_groups_df, I got many genes with duplicated scores.
I want to run GSEA with gseapy using these scores, but I get a warning about genes with the same score. Do you know how to deal with genes that have identical scores? Also, is it unusual to get identical scores at all? Could it be a rounding (decimal precision) issue?
Example dataframe for some duplicated scores:
| names | scores | logfoldchanges | pvals | pvals_adj | pct_nz_group |
|---|---|---|---|---|---|
| A530040E14Rik | 4.767420 | 21.592094 | 0.000002 | 0.000006 | 0.005073 |
| 1700048O20Rik | 4.767420 | 21.506248 | 0.000002 | 0.000006 | 0.005073 |
| Jakmip1 | 4.697415 | -0.022416 | 0.000003 | 0.000008 | 0.181096 |
| Nrcam | 4.697415 | 0.745594 | 0.000003 | 0.000008 | 0.018262 |
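
For reference, here is a minimal sketch of how the tied scores can be flagged with pandas. The DataFrame is rebuilt by hand from the four example rows above (three columns only); in the real analysis it would be the output of scanpy.get.rank_genes_groups_df. The variable names `df`, `tied` and `tie_sizes` are only illustrative. Note that the values in the table are shown rounded to six decimals, so this sketch cannot tell whether the underlying scores are also identical at full precision.

```python
import pandas as pd

# Hand-copied subset of the example rows above (illustrative only); in practice
# this frame would come from scanpy.get.rank_genes_groups_df.
df = pd.DataFrame(
    {
        "names": ["A530040E14Rik", "1700048O20Rik", "Jakmip1", "Nrcam"],
        "scores": [4.767420, 4.767420, 4.697415, 4.697415],
        "logfoldchanges": [21.592094, 21.506248, -0.022416, 0.745594],
    }
)

# keep=False marks every member of a tied group, not only the later repeats.
tied = df[df["scores"].duplicated(keep=False)]

# Number of genes sharing each score value.
tie_sizes = tied.groupby("scores")["names"].size()

print(tied)
print(tie_sizes)
```

On the real dataframe, the same two lines give a quick count of how many genes are affected and how large each tied group is.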
I appreciate any help you can provide.
Best