Subsetting Anndata based on Multiple Marker Gene Expression Thresholds

kparakul · July 12, 2022, 1:35pm

I am trying to subset my AnnData object based on the expression of any one of several marker genes, but I am running into two separate issues.

First, a prior post mentioned that to subset the AnnData object based on a particular gene, one should use:

adata = adata[adata[:, 'A'].X > .5, :]

However, how would I keep all cells in which the expression of any gene within a set of genes is greater than .5?

Second, even for a single gene I get a KeyError for certain genes that are present in the data. For example, adata = adata[adata[:, 'Syn1'].X > .5, :] raises a KeyError for Syn1.

Yet when I make a dot plot of the same AnnData object with Syn1, it shows an expression profile for this gene, so the gene does exist and I do not understand why the error appears.

The same error occurs for genes such as 'Syn1' and 'Slc17a6' but not for others such as 'Olig2' and 'P2ry12'. As mentioned above, the dot plots show expression profiles for all of these genes, so their keys should exist regardless.

I also made sure I was not somehow filtering out these genes by accident in an earlier step.

So my two questions are: can I subset the AnnData object to cells where the expression of any gene within a set of genes is above a certain threshold (e.g. .5), and is there a reason why this KeyError occurs for certain genes even though the dot plot shows expression profiles for them?
