Hi,
I have a UMAP that is generally nicely clustered (with Harmony), but there are areas where I get long “tendrils” sticking out of the clusters, or just spots of smaller clusters. What do you suggest doing with these? I believe I have been quite stringent in filtering, and I have removed both doublets and ambient RNA. The UMAPs show the cells colored by sample in the dataset. Would you remove these, or try to be even stricter with the filtering? Some of these cells have somewhat higher mito and doublet scores than the rest, but stricter filtering would only remove a few of them.
Thank you!
I don’t think there is a general answer to this question. It depends heavily on your data and what you are trying to do. One example: if you are looking at brain tissue samples and you are interested in various neuronal and glial cell types, small clusters might reflect immune cells, and you may choose to discard them because you are interested in a phenotype that you know is unrelated to the immune system.
If you are not sure, I would certainly keep them for the initial analysis and annotate them along with the other clusters. If you find that a small cluster contains markers for two cell types, has no markers unique to it, and is enriched for cells with higher doublet scores, you can still decide that you believe those cells to be doublets and remove the cluster before you perform any downstream analyses.
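As a rough sketch of the "enriched for higher doublet score" part of that check, something like the following could summarize QC metrics per cluster. It assumes you have exported one (cluster, doublet score, percent mito) record per cell from your metadata. The function name, the tuple layout, and the 0.3 cutoff are all made up for illustration and are not a recommendation, and the toy values are not real data.

```python
from collections import defaultdict
from statistics import median


def summarize_clusters(cells, doublet_cutoff=0.3):
    """Summarize QC metrics per cluster.

    `cells` is an iterable of (cluster, doublet_score, pct_mito) tuples.
    This layout and the default cutoff are hypothetical; adapt them to
    however your per-cell metadata is stored.
    """
    by_cluster = defaultdict(list)
    for cluster, doublet_score, pct_mito in cells:
        by_cluster[cluster].append((doublet_score, pct_mito))

    summary = {}
    for cluster, values in by_cluster.items():
        med_doublet = median(v[0] for v in values)
        summary[cluster] = {
            "n_cells": len(values),
            "median_doublet_score": med_doublet,
            "median_pct_mito": median(v[1] for v in values),
            # Flag only; whether to remove the cluster is still a judgment call.
            "high_doublet_score": med_doublet > doublet_cutoff,
        }
    return summary


# Toy data for illustration only.
cells = (
    [("A", 0.05, 2.0)] * 10
    + [("B", 0.10, 3.0)] * 8
    + [("C", 0.60, 9.0), ("C", 0.70, 11.0)]
)

for cluster, stats in summarize_clusters(cells).items():
    print(cluster, stats)
```

A flag from a summary like this is only one piece of evidence. I would still look at the marker genes for a flagged cluster before deciding anything.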
Personally, if I cannot easily come up with a one-to-three sentence statement of why I believe a cell/cluster should be removed, I prefer to keep it.
I hope this helps you.
Thank you, that is very helpful! I will use your suggestions as a guide.