Avoiding AnnData.file._adata self-reference?


I’m trying to use AnnData objects in my Prefect workflows. Prefect uses FastAPI and its jsonable_encoder to pass Python objects to workflows and subworkflows. Unfortunately, I believe my AnnData objects are not JSON-serializable because of the AnnData.file._adata field. The _adata field of the AnnDataFileManager references the AnnData object itself, which causes the JSON encoder to recurse through the cycle until the stack overflows. When I set adata.file._adata = None, the adata is encoded successfully and passed (surprisingly) quickly to the subworkflow.
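To illustrate the cycle, here is a minimal sketch in plain Python. The classes `Data` and `FileManager` and the functions `naive_encode` and `try_encode` are hypothetical stand-ins, not the real anndata or FastAPI code; the encoder simply walks object attributes with no cycle detection, which is my assumption about the failure mode rather than a description of jsonable_encoder's actual implementation.

```python
# Hypothetical stand-ins for illustration only; not the real anndata classes.
class FileManager:
    def __init__(self, adata):
        self._adata = adata  # back-reference to the owning object


class Data:
    def __init__(self):
        self.X = [[1, 2], [3, 4]]
        self.file = FileManager(self)  # Data -> FileManager -> Data


def naive_encode(obj):
    """Recursively convert an object to JSON-friendly types (no cycle detection)."""
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    if isinstance(obj, (list, tuple)):
        return [naive_encode(v) for v in obj]
    if isinstance(obj, dict):
        return {k: naive_encode(v) for k, v in obj.items()}
    # Fall back to walking the instance attributes.
    return {k: naive_encode(v) for k, v in vars(obj).items()}


def try_encode(obj):
    """Return the encoded object, or None if the encoder recursed too deeply."""
    try:
        return naive_encode(obj)
    except RecursionError:
        return None


adata = Data()
cyclic_result = try_encode(adata)  # the self-reference makes the walk never terminate

adata.file._adata = None           # the workaround: break the cycle
fixed_result = try_encode(adata)   # now the walk terminates
```

With the back-reference in place the walk never bottoms out; once it is cleared, the same encoder finishes normally.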

I am simply reading the AnnData with either sc.read_h5ad or anndata.read_h5ad and no arguments aside from the file path. It appears that the adata.file and adata.file._adata fields are populated after the read function completes.

The adata.file._adata = None workaround feels hacky and is probably flawed. Is there an option when loading the AnnData so that it doesn’t carry a field with a reference to itself? Alternatively, could the AnnDataFileManager be rewritten so that it doesn’t carry this _adata field? As far as I can tell, its only usage is in AnnDataFileManager._to_memory_mode, where adata or adata.X could instead be passed as a parameter.
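A rough sketch of the second idea, again with hypothetical stand-in classes (`FileManagerSketch`, `DataSketch`) rather than the real anndata code: the file manager stores no reference to its owner, and the owner passes itself in when switching to memory mode. The list copy is only a placeholder for whatever the real method does to load backed data.

```python
# Hypothetical sketch; names and behavior are assumptions, not anndata's API.
class FileManagerSketch:
    """File manager that keeps no reference to the object that owns it."""

    def __init__(self, filename=None):
        self.filename = filename

    def to_memory_mode(self, adata):
        # The owner is supplied by the caller instead of being stored on self.
        adata.X = list(adata.X)  # placeholder for loading backed data into memory
        self.filename = None     # forget the backing file


class DataSketch:
    def __init__(self, X, filename=None):
        self.X = X
        self.file = FileManagerSketch(filename)

    def to_memory(self):
        # Pass self explicitly, so no cycle is ever created.
        self.file.to_memory_mode(self)


data = DataSketch(X=(1, 2, 3), filename="example.h5ad")
data.to_memory()
```

The trade-off is that every caller of the memory-mode switch has to have the owning object at hand, which seems to be the case if the method is only ever invoked from the AnnData object itself.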

James Gatter

Additional info:
Ubuntu 22 on EC2
Python 3.8.16
scanpy 1.9.1
anndata 0.8.0
fastapi 0.89.1
prefect 2.7.10

Followed up on the GitHub repo here: "Remove AnnDataFileManager._adata" by jggatter, Pull Request #899 in scverse/anndata.