Hi @cookiemonster,
As a general rule, you should always perform quality control (QC) on each sample individually first. Samples can differ considerably from one another and may need different filtering thresholds, for example for the doublet score. Once you have performed QC at the sample level, you can merge the samples into a single object using the concatenate method. I would also recommend adding join='outer' to the concatenation, because the default is join='inner', which keeps only the genes present in every sample, so you might lose quite a few genes.
Regarding the n_cells problem, is this related to your previous question? If so, it is odd to store this information in the var attribute of your adata object, since var should only hold metadata for your features (in this case, genes). If what you want is the number of cells per sample and cell type, you can compute it after concatenating your samples rather than before:
adata.obs[["Sample", "louvain"]].value_counts().reset_index()
Since the resulting dataframe has neither one row per cell (as obs requires) nor one row per gene (as var requires), you could store it in the uns attribute if you want to keep it:
adata.uns['n_cells'] = adata.obs[["Sample", "louvain"]].value_counts().reset_index()
Hope this is helpful!