With the newest anndata/scanpy releases, what is the recommended scverse workflow for efficiently pseudobulking a large AnnData object? That is, summing the counts for each var across groups of cells, with the groups defined by some column in obs. In the past I used decoupler, but it can be pretty slow when you have hundreds of groups (e.g. celltype x perturbation). Are there better solutions?
scanpy.get.aggregate
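To make the suggestion concrete, here is a minimal sketch of the operation being asked about: summing counts per group by multiplying a sparse group-by-cell indicator matrix with the count matrix. It uses only numpy, scipy, and pandas so it runs without scanpy. The helper `pseudobulk_sum`, the column names `cell_type` and `perturbation`, and the toy data are all made up for illustration, and this is not scanpy's actual implementation. The scanpy call is shown only as a comment, based on my understanding of the API.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def pseudobulk_sum(counts, obs, by):
    """Sum rows of `counts` (cells x genes) within each observed combination of `by`.

    Hypothetical helper for illustration only, not part of scanpy or anndata.
    """
    by = list(by)
    # One integer code per cell; only combinations that actually occur get a code.
    codes = obs.groupby(by, observed=True, sort=True).ngroup().to_numpy()
    n_groups = int(codes.max()) + 1
    n_cells = counts.shape[0]
    # Sparse (groups x cells) indicator: entry (g, c) is 1 if cell c belongs to group g.
    indicator = sparse.csr_matrix(
        (np.ones(n_cells, dtype=counts.dtype), (codes, np.arange(n_cells))),
        shape=(n_groups, n_cells),
    )
    summed = indicator @ counts  # (groups x genes), stays sparse for sparse input
    # One row of labels per group, in the same order as the rows of `summed`.
    labels = (
        obs[by]
        .assign(_code=codes)
        .drop_duplicates("_code")
        .sort_values("_code")
        .drop(columns="_code")
        .reset_index(drop=True)
    )
    return summed, labels


# Toy data: 200 cells, 5 genes, two grouping columns (names are assumptions).
rng = np.random.default_rng(0)
obs = pd.DataFrame(
    {
        "cell_type": pd.Categorical(rng.choice(["B", "T"], size=200)),
        "perturbation": pd.Categorical(rng.choice(["ctrl", "ko1", "ko2"], size=200)),
    }
)
counts = sparse.csr_matrix(rng.poisson(1.0, size=(200, 5)))

summed, labels = pseudobulk_sum(counts, obs, ["cell_type", "perturbation"])

# With scanpy installed (1.10 or later, as far as I know), the equivalent should be roughly:
#   pb = sc.get.aggregate(adata, by=["cell_type", "perturbation"], func="sum")
#   pb.layers["sum"]  # summed counts, one row per group
```

Because the whole aggregation is a single sparse matrix product, the cost should not grow with a Python-level loop over groups, which is usually what hurts when there are hundreds of them.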
I am looking into making this dask-compatible as well. The issue is that the worst case is pretty bad… I'm looking into it, though!