Converting view of a backed anndata object into non-view/non-backed

What is the preferred way to load only a subset of cells into an anndata in memory? The workflow I’m thinking of is:

  1. Load anndata object as backed
  2. Based on the .obs table, select a subset of cells
  3. Load only that subset into an in-memory anndata

If I subset the backed anndata and then call .copy(), the result is still backed, so .copy() needs a file to save the results to. However, this is unnecessary in my case - I just want to load the subset into memory.

I could do something like:

ad_subset_view = ad[indices]
ad_subset = anndata.AnnData(ad_subset_view.X, obs=ad_subset_view.obs, ....)

But I’m wondering if there is a more canonical way to do this? And I have the same question for mudata objects.

Update:

I found the AnnData.to_memory() function, which appears to do this. It could be good to include this in the tutorials when talking about working with backed anndata.

I don’t see any equivalent in mudata, but leveraging to_memory() from anndata made it easy to write a custom function for this.

Thanks for the update. I’m trying to make a “lazy” AnnData loader from backed files. I think this will be useful.