How should I process the data after removing low-quality cell clusters?

XJY · November 9, 2025, 7:00pm

I have already completed the standard single-cell analysis workflow, but I found that one cluster consisted of low-quality cells. I want to remove this cluster, and an AI suggested that I should rerun the preprocessing steps because removing a cluster would affect the identification of highly variable genes and the computation of distances.
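
To check my understanding of why the gene selection would change, here is a toy sketch in plain Python. The values and the names `cells`, `genes`, and `rank_genes_by_variance` are made up for illustration, and ranking by raw variance is only a stand-in for what `sc.pp.highly_variable_genes` actually computes. It suggests that a gene can look highly variable only because the low-quality cells are present:

```python
from statistics import pvariance

# Hypothetical toy log-expression values: one row per cell, one column per gene.
genes = ["geneA", "geneB", "geneC"]
cells = {
    "good_1": [1.0, 2.0, 0.0],
    "good_2": [1.2, 0.0, 0.1],
    "good_3": [0.8, 2.2, 0.0],
    "good_4": [1.0, 0.1, 0.1],
    "lowq_1": [5.0, 1.0, 0.0],
    "lowq_2": [5.5, 1.1, 0.1],
}


def rank_genes_by_variance(cell_names):
    """Return gene names sorted from most to least variable across the given cells."""
    variances = {}
    for j, gene in enumerate(genes):
        variances[gene] = pvariance([cells[c][j] for c in cell_names])
    return sorted(genes, key=lambda g: variances[g], reverse=True)


all_cells = list(cells)
kept_cells = [c for c in all_cells if not c.startswith("lowq")]

# geneA is driven by the low-quality cells; geneB varies among the good cells.
ranking_all = rank_genes_by_variance(all_cells)
ranking_kept = rank_genes_by_variance(kept_cells)
print(ranking_all)
print(ranking_kept)
```

If this intuition is right, the PCA and the neighbor graph built on the selected genes would change as well, which is why rerunning from gene selection seems necessary.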

I also want to connect this with subclustering. Based on several tutorials, I plan to do the following:
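import scanpy as sc
import scanpy.external as sce

# Recover the full gene set stored in .raw (log-normalized values in my case)
raw_adata = adata.raw.to_adata()
# Drop the low-quality cluster (labelled 'CD4 T' in obs['leiden'] here)
adata_subset = raw_adata[raw_adata.obs['leiden'] != 'CD4 T'].copy()

# sc.pp computes the highly variable genes; sc.pl would only plot them
sc.pp.highly_variable_genes(adata_subset)
sc.pp.scale(adata_subset, max_value=10)
sc.tl.pca(adata_subset)
sce.pp.harmony_integrate(adata_subset, 'project')
sc.pp.neighbors(adata_subset, n_neighbors=20, n_pcs=15, use_rep='X_pca_harmony')
sc.tl.umap(adata_subset)
# key_added stores the labels under the name that rank_genes_groups looks up
sc.tl.leiden(adata_subset, resolution=0.2, key_added='leiden_0.2')
sc.tl.rank_genes_groups(adata_subset,
                        groupby='leiden_0.2',
                        method='wilcoxon')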
The AI also mentioned that when running rank_genes_groups, I should add the use_raw parameter.

Is this workflow a correct way to reanalyze the data after removing the low-quality cluster? Should I start directly from highly variable gene identification, or should I also redo the normalization and log-transformation steps?
