How can I operate on a View of an AnnData object when the object itself is too large to load into memory?

Hi All,

I am new to using AnnData and seem to be struggling with some basic usage. I am working with large (>10 GB) snRNA-seq files from Allen Brain in .h5ad format. I can only load these as backed AnnData objects (backed='r') because my RAM is not sufficient to load them fully into memory. If I index the backed object to get a View based on metadata, I cannot call functions like .to_memory() or .to_df() on this View without crashing my Python kernel. I have monitored memory usage, and during these operations it seems to be loading the entire AnnData object into memory, rather than just the indexed View I am calling .to_memory() or .to_df() on. I have confirmed this behavior even on absurdly small indexed Views (4 x 10, i.e., just a few cells and genes).
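
For concreteness, here is roughly what I am doing (the file path, obs column, and gene names below are just placeholders for my actual data):

```python
import anndata as ad

# Open the large .h5ad file in backed mode so X stays on disk
adata = ad.read_h5ad("allen_brain.h5ad", backed="r")  # placeholder path

# Index by metadata to get a small View (a handful of cells and genes)
view = adata[adata.obs["cell_type"] == "L5 ET", ["Gad1", "Gad2"]]  # placeholder names

# Either of these crashes the kernel / appears to load the full object:
small = view.to_memory()
df = view.to_df()
```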

Am I missing something basic here? I want to be able to extract small subsets of data for specific cells from these AnnData objects based on metadata, and then conduct routine analysis with pandas, etc. Is this not possible if I can't load the entire AnnData object into RAM at any point?

Many thanks in advance

Hello,

I am happy to look into this behavior a bit if you can open an issue; in theory, that should not be happening.

However, we have new APIs to handle this: please have a look at anndata.experimental.read_elem_lazy and anndata.experimental.read_lazy in the anndata documentation. These rely on dask/xarray and can handle obs and var lazily as well.

Also have a look at our tutorial notebook, "Lazily Accessing Remotely Stored Data", in the anndata documentation.
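
To give you a rough idea, something like the untested sketch below should cover your use case. It assumes a recent anndata with the experimental lazy API plus its optional dask/xarray dependencies, and the path, obs column, and gene names are placeholders:

```python
import anndata as ad

# Open the whole file lazily: X and layers become dask arrays, and obs/var
# are read lazily too, so nothing large is pulled into RAM up front.
adata = ad.experimental.read_lazy("allen_brain.h5ad")  # placeholder path

# Build a boolean mask from a single obs column (only that column is read).
mask = (adata.obs["cell_type"] == "L5 ET").values  # placeholder names

# Subsetting stays lazy; materialize only the small slice you need.
subset = adata[mask, ["Gad1", "Gad2"]].to_memory()
df = subset.to_df()  # small pandas DataFrame for downstream analysis
```

The idea is that reading is deferred to dask, so only the chunks touched by your subset should actually be pulled from disk, rather than the whole X matrix.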

However, this behavior sounds buggy, so it would be great if you could open an issue so I can look into it when I'm back in the office 🙂 Also feel free to submit a fix!