Memory Usage in Multiple New Formats

Hi,
there are now multiple ways of loading an AnnData object. I will list some: read_h5ad, read_h5ad(backed='r'), read_zarr, and read_lazy (in anndata.experimental). I am having a hard time understanding the difference between backed mode and read_lazy.
Can you give me a comparison of how these methods use memory when doing the following operations:
1) adata = ad.read_xx(file)
2) subset = adata[:100].copy()

I am looking for a description like: for read_h5ad(), during 1) the entire object is loaded into memory; during 2) a copy is created in which memory is allocated according to the number of observations. At the end of these instructions, memory will hold an entire copy of the AnnData object plus the data for 100 cells.
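Concretely, this is the pattern I mean (the file name is just a placeholder):

```python
import anndata as ad

# 1) open/load the file -- this is the step where the readers differ
adata = ad.read_h5ad("data.h5ad")  # or read_zarr, backed='r', read_lazy, ...

# 2) materialize the first 100 observations as an independent copy
subset = adata[:100].copy()
```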

Best!

In short, backed mode is a bit of a hodge-podge where you hold a reference to a file and lazily construct data structures on the fly. It only works with h5ad, and internally it relies on anndata.io.sparse_dataset as a data structure.
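For example, here is roughly what backed mode looks like in practice (a minimal sketch; the file path is a placeholder, and I'm assuming X is stored as a sparse matrix):

```python
import anndata as ad

# Backed mode: obs, var, etc. are read into memory, but X stays on disk
adata = ad.read_h5ad("data.h5ad", backed="r")

print(adata.isbacked)  # True
print(type(adata.X))   # a file-backed sparse dataset, not an in-memory matrix
```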

So you can use anndata.io.sparse_dataset without backed mode, and it may be simpler, especially for zarr. Here, the read into memory happens immediately upon subsetting the dataset itself; i.e., adata_backed_or_with_sparse_dataset_in_X[subset] will not load anything (in theory), but accessing X on that view will load it into memory because of the [subset]. See the sketch below.
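Something like this (the file name is a placeholder, and I'm assuming X is stored as an encoded sparse matrix):

```python
import h5py
from anndata.io import sparse_dataset

f = h5py.File("data.h5ad", "r")
X = sparse_dataset(f["X"])  # a handle to the on-disk matrix; nothing read yet

sub = X[:100]  # the read happens here: an in-memory scipy sparse matrix
```

The zarr version should look the same, just with a zarr group in place of the h5py one.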

read_elem_lazy and read_lazy use dask and xarray to read the on-disk data. Memory should never be allocated unless you explicitly request it via to_memory (anndata API) or compute (dask API). So adata_read_with_read_lazy[subset][other_subset][other_other_subset] will never allocate memory, nor will accessing obs or X. However, once you do …X.compute() or adata_read_with_read_lazy[subset][other_subset][other_other_subset].to_memory(), the data will be brought back into memory.
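In code, that behaviour looks roughly like this (a sketch based on the description above; read_lazy is experimental, so the import path and semantics may shift):

```python
from anndata.experimental import read_lazy

adata = read_lazy("data.h5ad")  # dask/xarray-backed; (almost) nothing loaded

view = adata[:100][:10]         # chained subsetting is pure bookkeeping
obs = view.obs                  # still lazy
X = view.X                      # still lazy (a dask array)

dense = view.X.compute()        # dask API: materializes X for this view
small = view.to_memory()        # anndata API: materializes the whole view
```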

I’m open to a guide by the way! We’re working on something now that would tie in nicely, so I think we will make one as a byproduct of that work anyway :wink: