Hello,
I’m trying to use AnnData objects in my Prefect workflows. Prefect uses FastAPI and its jsonable_encoder to pass Python objects to workflows and subworkflows. Unfortunately, I believe my AnnData objects are not JSON-serializable due to the AnnData.file._adata field. The _adata field of this AnnDataFileManager references the adata itself, which causes the JSON encoder to traverse cyclically until the stack overflows. When I set adata.file._adata = None, the adata is successfully encoded and passed (surprisingly) quickly to the subworkflow.
I am simply reading the AnnData with either sc.read_h5ad or anndata.read_h5ad and no arguments aside from the file path. It appears that the adata.file and adata.file._adata fields are populated after the read function completes.
The adata.file._adata = None workaround feels dirty and is probably flawed. Are there options when loading the AnnData so that it doesn’t carry a field with a reference to itself? Alternatively, could the AnnDataFileManager be rewritten so that it does not carry this _adata field? It seems like its only usage is in AnnDataFileManager._to_memory_mode, where adata or adata.X could be passed as a parameter instead.
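One way to make the workaround a little less fragile is to clear the back-reference only for the duration of the call and restore it afterwards. The sketch below is an assumption-laden illustration, not an anndata or Prefect feature: detached_backref is a hypothetical helper, it relies only on the adata.file._adata attribute mentioned above, and it is exercised here on stand-in objects rather than a real AnnData.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def detached_backref(adata):
    """Temporarily set adata.file._adata to None, restoring it on exit.

    Hypothetical helper: it assumes only that adata.file._adata exists.
    """
    manager = adata.file
    saved = manager._adata
    manager._adata = None
    try:
        yield adata
    finally:
        # Restore the reference even if the body raises.
        manager._adata = saved


# Stand-in objects that mimic the self-reference (not real AnnData).
fake = SimpleNamespace(file=SimpleNamespace(_adata=None))
fake.file._adata = fake

with detached_backref(fake) as detached:
    inside = detached.file._adata  # None while the block is active

restored = fake.file._adata is fake  # the reference is back afterwards
print(inside, restored)
```

This only helps if the encoding actually happens inside the with block, and anything that relies on the back-reference (such as _to_memory_mode, per the above) should not be called while it is detached.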
Thanks,
James Gatter
Additional info:
Ubuntu 22 on EC2
Python 3.8.16
scanpy 1.9.1
anndata 0.8.0
fastapi 0.89.1
prefect 2.7.10