Store data as sparse matrix

For some reason, the raw data and scaled data in my anndata object are stored as numpy arrays. How can I convert them to sparse matrices to save disk space?


For the raw data you can do this:

from scipy import sparse

# Convert the dense array to compressed sparse row (CSR) format
sparse_X = sparse.csr_matrix(adata.X)
adata.X = sparse_X

If your transformed data is itself sparse, the same pattern will work. However, transformed data is generally not sparse: scaling that centers each gene, for example, turns most zeros into nonzero values.
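
To decide whether converting a given matrix is worth it, you could check its density first. This is only a rough sketch: the helper name to_sparse_if_worthwhile and the 0.5 cutoff are my own illustrative choices (CSR stores an index next to every nonzero value, so it stops saving space once the matrix is around half full for 32-bit values), and the toy matrices stand in for real count and scaled data.

```python
import numpy as np
from scipy import sparse


def to_sparse_if_worthwhile(X, max_density=0.5):
    """Return X as a CSR matrix if its share of nonzeros is at most
    max_density; otherwise return it unchanged.

    Hypothetical helper; the threshold is an assumption, not a library default.
    """
    if sparse.issparse(X):
        return X  # already sparse, nothing to do
    X = np.asarray(X)
    if X.size == 0:
        return X  # avoid dividing by zero on empty input
    density = np.count_nonzero(X) / X.size
    return sparse.csr_matrix(X) if density <= max_density else X


# Toy stand-ins: a mostly-zero count matrix and a centered copy of it
rng = np.random.default_rng(0)
counts = rng.poisson(0.1, size=(200, 50)).astype(np.float32)
scaled = counts - counts.mean(axis=0)  # centering makes most zeros nonzero

counts_out = to_sparse_if_worthwhile(counts)  # mostly zeros, so converted to CSR
scaled_out = to_sparse_if_worthwhile(scaled)  # almost fully dense, so left as an array
```

With an AnnData object this would be used as adata.X = to_sparse_if_worthwhile(adata.X).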


@wangjiawen2013 An aside, but I would also make sure you are compressing your data while saving it. E.g. adata.write_h5ad(..., compression="gzip").

I have never compressed adata, but I’ll try.