Scanpy.tl.rank_genes_groups, layer= does not appear to be working

CaveCrack · December 30, 2022, 7:54pm

Hi, I’m trying to use a layer in scanpy.tl.rank_genes_groups but scanpy seems to just use adata.X.

I have previously run scVI and obtained the log1p of its normalized counts and have stored them in layers:

AnnData object with n_obs × n_vars = 80642 × 641

layers: 'counts', 'scVI_normalized', 'scVI_normalized_log1p'

Here are the ranges of my layers:

Layers:                                                       min         max
   X                                                         0.00    33241.00
   counts                                                    0.00    33241.00
   scVI_normalized                                           0.00    20319.66
   scVI_normalized_log                                       0.00        9.92

I do the following:

sc.tl.dendrogram(adata, groupby='consensus_clusters', use_rep="X_scVI")

sc.tl.rank_genes_groups(adata, 'consensus_clusters', method='wilcoxon', use_raw=False, layer='scVI_normalized_log1p',
                        key_added = "wilcoxon_consensus_clusters")

When I try to display the data with sc.pl.rank_genes_groups_heatmap:

sc.pl.rank_genes_groups_heatmap(adata, n_genes=8, 
                                key="wilcoxon_consensus_clusters", groupby="consensus_clusters", 
                                show_gene_labels=True)

the range of the gene values on the heatmap is over 30000, so it appears to be using adata.X rather than adata.layers['scVI_normalized_log1p']

Am I missing something?

I am on scanpy 1.8.2 which is the latest installed with conda from conda-forge

thanks for any help

CaveCrack · December 31, 2022, 1:06am

If I force adata.X to contain the data in the layer I want:

adata.X = adata.layers['scVI_normalized_log1p'].copy()

and repeat the calls:

sc.tl.rank_genes_groups(adata, 'consensus_clusters', method='wilcoxon', use_raw=False, layer='scVI_normalized_log1p',
                        key_added = "wilcoxon_consensus_clusters")

sc.pl.rank_genes_groups_heatmap(adata, n_genes=8, 
                                key="wilcoxon_consensus_clusters", groupby="consensus_clusters", 
                                show_gene_labels=True)

I get the correct range, i.e. things behave as I would have expected if the layer parameter was being respected.

Topic		Replies	Views
Scanpy.rank_genes_groups after pp.regress_out Help	0	37	February 27, 2025
Difference between tl.rank_genes_groups and pl.rank_genes_groups_heatmap scanpy	0	119	November 20, 2024
Scanpy.pl.rank_genes_groups_dotplot doesn't plot gene symbols in adata.var_names scanpy	0	118	September 30, 2024
Guidelines to use scanpy layers scanpy	3	4003	February 2, 2023
Can’t change anndata dimensions anndata	6	2040	March 9, 2023

Scanpy.tl.rank_genes_groups, layer= does not appear to be working

Related topics