With the newest anndata/scanpy releases, what is the recommended scverse workflow to efficiently pseudobulk a large AnnData object? That is, summing the counts for each var across groups of cells, where the groups are defined by some column in obs. In the past I used decoupler, but this can be pretty slow when you have hundreds of groups (e.g. celltype x perturbation). Are there better solutions?
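Use scanpy.get.aggregate. It aggregates over groups defined by one or more obs columns; pass func="sum" to get pseudobulk counts.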
I am looking into making this Dask-compatible as well. The issue is that the worst case is pretty bad… looking into it though!
@ilan-gold any progress on Dask-compatible aggregate?
I will add it to our next sprint. It should be doable. Sorry for the delay here.