Plotting scanpy.tl.score_genes() value using scanpy.pl.stacked_violin()

Hi,

I’m interested in using scanpy.pl.stacked_violin() to plot multiple gene scores sc.tl.score_genes(). As far as I can tell scanpy.pl.stacked_violin() only takes var inputs and groups by an obs. Is there a way to do obs x obs with stacked_violin or is there some technical reason I would not want to do this?

Thanks!

Hi @seanb80

As you mentioned, the way sc.pl.stacked_violin works is to check for var_names so you cannot use values stored in obs. One way to overcome this would be to create a new AnnData object where in .X you would store the obtained scores (extracted from .obs of the original adata object), with dimensions n_cells x n_genesets. This way you can plug this directly into scanpy plotting functions.

Just as an alternative example, this is how its done in decoupler:

import scanpy as sc
import pandas as pd
import decoupler as dc

# Load data
adata = sc.datasets.pbmc3k_processed()

# Define genesets
net = pd.DataFrame([
    ['gset B cells', 'CD74'],
    ['gset B cells', 'HLA-DRA'],
    ['gset B cells', 'HLA-DPB1'],
    ['gset T cells', 'CD3D'],
    ['gset T cells', 'CD3E'],
    ['gset T cells', 'IL32'],
    ], columns=['source', 'target']
)

# Run enrichment (equivalent of sc.tl.score_genes)
dc.run_ulm(adata, net, weight=None, min_n=0)

# Create new AnnData with geneset scores in .X
acts = dc.get_acts(adata, 'ulm_estimate')

# Plot UMAP scores
sc.pl.umap(acts, color=acts.var_names)

# Plot violin
sc.pl.stacked_violin(acts, var_names=acts.var_names, groupby='louvain', standard_scale='var')

image

More information here if interested: Functional enrichment of biological terms — decoupler 1.2.1 documentation

2 Likes