Hi All,
I am new to using AnnData and seem to be struggling with some basic usage. I am working with large (>10 GB) snRNA-seq files from the Allen Brain Atlas in .h5ad format. I can only load these as backed AnnData objects (backed='r'), since my RAM is not sufficient to load them into memory. If I index the backed object to get a View based on metadata, calling functions like .to_memory() or .to_df() on that View crashes my Python kernel. Monitoring memory usage, these operations appear to load the entire AnnData object into memory, rather than just the indexed View I am calling them on. I have confirmed this behavior even on absurdly small indexed Views (e.g., 4 x 10, just a few cells and genes). A minimal sketch of what I'm doing is below.
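(The file path and the obs column/value here are placeholders for my actual data, but the structure is exactly what I'm running:)

```python
import anndata as ad

# Open the large file in backed mode -- this part works fine
adata = ad.read_h5ad("allen_brain.h5ad", backed="r")

# Index down to a View based on metadata (hypothetical column/value),
# then slice further to just a few cells and genes
mask = adata.obs["cell_type"] == "Sst"
view = adata[mask][:4, :10]  # a tiny 4 x 10 View

# Either of these spikes memory as if loading the whole file,
# and then the kernel dies
small = view.to_memory()
df = view.to_df()
```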
Am I missing something basic here? I want to be able to pull small subsets of data for specific cells out of these AnnData objects based on metadata, and then run routine analysis with pandas, etc. Is this simply not possible without being able to load the entire AnnData object into RAM at some point?
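For concreteness, this is the kind of workflow I'm hoping for (again with made-up column names):

```python
# Pull a small, metadata-defined subset fully into memory...
sub = adata[adata.obs["region"] == "V1"].to_memory()

# ...then do routine analysis on an ordinary pandas DataFrame
df = sub.to_df()
print(df.mean(axis=0))  # e.g., per-gene mean expression
```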
Many thanks in advance