What is the preferred way to load only a subset of cells into an anndata in memory? The workflow I’m thinking of is:
- Load anndata object as backed
- Based on the .obs table, select a subset of cells
- Load only that subset into an in-memory anndata
If I subset the backed anndata and then call .copy, because it is backed, the .copy function needs a file to save the results to. However, this is unnecessary in my case - I just want to load the subset into memory.
I could do something like:
ad_subset_view = ad[indices]
ad_subset = anndata.AnnData(ad_subset_view.X, obs=ad_subset_view.obs, ....)
But I’m wondering if there is a more canonical way to do this? And I have the same question for mudata objects.