Hi All,
Thanks for such quick responses! Apologies, I don’t think I was clear in my original post. I want to go in the “opposite” direction, to get gene info when the gene is no longer in the var/var_names.
Use case: After standard processing, I want to explore expression using multiple visualizations (e.g. sc.pl.umap, sc.pl.dotplot, etc.) for genes that may have been removed during typical processing steps (e.g. sc.pl.umap(adata, color=['FAM138A']) in the example below). I want to keep the reduced-dimensional cell embeddings and other metadata from the original adata workflow, while also viewing a gene that was originally in the count matrix but is no longer in adata.var_names.
Since the original cell x gene count matrix can be stored in adata.raw, I expected there to be functionality to pull one or more genes from that matrix back into a new layer for future use.
Using the pbmc3k dataset as an example:
import scanpy as sc
adata = sc.datasets.pbmc3k()
adata.X.shape
# (2700, 32738)
'FAM138A' in adata.var_names
# True
adata.layers["raw_counts"] = adata.X.copy()
sc.pp.normalize_total(adata)
adata.layers["normalized"] = adata.X.copy()
sc.pp.log1p(adata)
adata.layers["log_norm"] = adata.X.copy()
sc.pp.highly_variable_genes(adata, subset=True)
sc.pp.scale(adata)
sc.tl.pca(adata)
sc.pp.neighbors(adata)
sc.tl.umap(adata)
sc.tl.louvain(adata, flavor = 'igraph')
adata.X.shape
# (2700, 1870)
'FAM138A' in adata.var_names
# False
sc.pl.umap(adata, color=['NKG7'], layer="raw_counts") # Works fine
sc.pl.umap(adata, color=['NKG7'], layer="log_norm") # Works fine
sc.pl.umap(adata, color=['NKG7'], layer=None) # Works fine
sc.pl.umap(adata, color=['FAM138A'], layer="raw_counts") # KeyError: 'Could not find key FAM138A in .var_names or .obs.columns.'
sc.pl.umap(adata, color=['FAM138A'], layer="log_norm") # KeyError: 'Could not find key FAM138A in .var_names or .obs.columns.'
sc.pl.umap(adata, color=['FAM138A'], layer=None) # KeyError: 'Could not find key FAM138A in .var_names or .obs.columns.'
So, I guess the question is something like: “given the adata above, where adata.X has shape (2700, 1870) and no longer contains ‘FAM138A’ but the gene is still present in adata.raw, what is the easiest/most efficient way to view ‘FAM138A’ in sc.pl.umap?”