Gene shows up in dotplot, but it's not present in var_names

Hi, I am still relatively new to scanpy and am having trouble understanding the sc.pl.dotplot output.

When I read in my h5ad file and check whether a gene is present, it returns False:

adata = sc.read('file.h5ad')
'CEACAM8' in adata.var_names.to_list()
False

But later, when I try to plot the gene, the result is a dotplot showing expression:

sc.pl.dotplot(adata, 'CEACAM8', groupby='organ', dendrogram=False, title='CEACAM8 expression', swap_axes = True, use_raw=True)

I am a bit confused as to what is going on here. If the gene of interest isn’t in var_names, why am I getting a dotplot?

Probably because you are using use_raw=True, which makes sc.pl.dotplot look up genes in adata.raw instead of adata. At some point during processing, this was done:

adata.raw = adata.copy()

and after that the genes were subset, perhaps keeping only highly variable genes. So the gene is missing from the reduced dataset but is still present in the raw object. You can recover the raw AnnData with:

adata_raw = adata.raw.to_adata()

to check that your gene is indeed there. See also: anndata.AnnData.raw (anndata 0.9.2 documentation)
