Hi,
I am trying to integrate two datasets and have tested several methods, including scVI, BBKNN, Scanorama, and ComBat, as well as some others in R (RCCA and Harmony). In some cases I see odd integration output, such as the one below. Here is the code I used for this specific integration with ComBat:
import scanpy as sc
import matplotlib.pyplot as plt

# create a new object from the log-normalized counts stored in .raw
adata_combat = sc.AnnData(X=concatenated_anndata.raw.X, var=concatenated_anndata.raw.var, obs = concatenated_anndata.obs)
# first store the raw data
adata_combat.raw = adata_combat
# run combat
sc.pp.combat(adata_combat, key='dataset')
sc.pp.highly_variable_genes(adata_combat)
print(f"Highly variable genes: {adata_combat.var.highly_variable.sum()}")
sc.pl.highly_variable_genes(adata_combat)
sc.pp.pca(adata_combat, n_comps=30, use_highly_variable=True, svd_solver='arpack')
sc.pp.neighbors(adata_combat)
sc.tl.umap(adata_combat)
fig, axs = plt.subplots(1, 1, figsize=(6, 4), constrained_layout=True)
sc.pl.umap(adata_combat, color="dataset", title="Combat umap", ax=axs, show=False)
Does anyone know what these distortions are called and what causes them?