How do I get a list (NOT plot/figure/pdf) of highest expressed genes in scanpy?

Scanpy has a great function for plotting the highest expressed genes.

sc.pl.highest_expr_genes()

How do I get a LIST of these highest expressed genes please?

Hi Felix,

Are you looking for something similar to this?:

Cheers,
Jesko


Yes!!

This worked for me with a few modifications shown below.

Thank you! :blush:

    import numpy as np
    import pandas as pd
    import scanpy as sc

    n = 20  # number of top genes to return

    # normalize counts so that each 'cell' (barcode) sums to 1; adata.X itself is left unchanged
    X_norm = sc.pp.normalize_total(adata, target_sum=1, inplace=False)['X']

    # create new adata.var column containing the mean of each column of X_norm above,
    # i.e. the mean fraction of counts per gene across all cells
    adata.var['mean_expression'] = np.ravel(X_norm.mean(0))

    # return pd.DataFrame of the n top-ranked genes by mean expression
    x = pd.DataFrame(adata.var.nlargest(n, 'mean_expression')['mean_expression'])
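
For anyone who wants to check the arithmetic without loading a dataset, here is a minimal sketch of the same steps on a toy matrix using only NumPy and pandas. The counts and gene names are made up for illustration, and the per-cell scaling is written out by hand on the assumption that it matches what `target_sum=1` does; it is not scanpy's own code.

```python
import numpy as np
import pandas as pd

# Toy counts matrix: 3 cells (rows) x 4 genes (columns); values are made up.
counts = np.array([
    [10, 0, 30, 60],
    [5, 5, 20, 70],
    [0, 10, 40, 50],
], dtype=float)
genes = ["geneA", "geneB", "geneC", "geneD"]  # hypothetical gene names

# Scale each cell so its counts sum to 1 (assumed equivalent of target_sum=1).
fractions = counts / counts.sum(axis=1, keepdims=True)

# Mean fraction of counts per gene across all cells.
mean_expression = pd.Series(fractions.mean(axis=0), index=genes, name="mean_expression")

# Top n genes as a plain Python list, highest first.
n = 2
top_genes = mean_expression.nlargest(n).index.tolist()
print(top_genes)  # ['geneD', 'geneC']
```

With a real AnnData object, calling `x.index.tolist()` on the DataFrame from the snippet above should give the same kind of plain list of gene names.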


Nice! Glad that worked and thanks for sharing your code, hopefully it’ll help others looking to do the same thing in the future.
