Label transfer with the SCVI/scANVI pipeline changes (wrongly predicts) labels in the reference data

Hi, I am following the tutorial "Integration and label transfer with Tabula Muris" on the scvi-tools docs page, and everything works fine. I save the predicted labels in a new metadata column:

adata.obs["C_scANVI"] = lvae.predict(adata)  # save predicted labels in a new column

However, when I check the reference labels, some of them are predicted differently from what they were before I trained the SCVI model (when I concatenated with my query data).
I don't understand why this is.
My understanding is that I train the SCVI model on the labelled reference data and then use scANVI to transfer labels to the 'Unknown'-labelled cells in the query dataset.
Why is it predicting some of the reference data labels wrongly?
Any advice please. Am I doing something wrong here?

As you can see in the screenshot, the reference cell 'Cell cycle_TCGGTCTGTGAGAGGG-1_32_1-1' has its label changed from Trm-c to CTL-c.

Can you describe how many of the training labels are wrong?

The predict function makes a prediction for each cell, including the reference data, for which by default 90% is used for training and 10% as a validation set. Either scANVI is getting it wrong because there's something systematically off, and/or there is noise in the training labels.
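To quantify how often the predictions disagree with the original reference labels, you can compare the two obs columns directly. A minimal sketch with pandas, using toy values in place of the real `adata.obs` (the column names `celltype_scanvi` and `C_scANVI` are from this thread; the label values are made up):

```python
import pandas as pd

# Toy stand-in for adata.obs restricted to reference cells.
obs = pd.DataFrame({
    "celltype_scanvi": ["Trm-c", "CTL-c", "Trm-c", "Tgd", "CTL-c"],  # original labels
    "C_scANVI":        ["CTL-c", "CTL-c", "Trm-c", "Tgd", "CTL-c"],  # scANVI predictions
})

# Fraction of reference cells whose label changed.
mismatch = obs["celltype_scanvi"] != obs["C_scANVI"]
mismatch_rate = mismatch.mean()
print(f"{mismatch_rate:.1%} of reference labels changed")  # 20.0% here

# Which labels flipped, and into what (useful for spotting systematic confusions):
flips = obs.loc[mismatch].groupby(["celltype_scanvi", "C_scANVI"]).size()
print(flips)
```

Looking at the flip table (rather than just the overall rate) helps distinguish random noise from a systematic confusion between two related cell types.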


Hi Adam,
scANVI predicted 35.2% of the labels wrongly in the reference dataset.

Here is some of my code. It's probably not enough for you to figure out what's wrong, but I'm providing it anyway, in case there is any obvious mistake.

import scanpy as sc
import scvi

adata.layers["counts"] = adata.X.copy()  # preserve raw counts in a layer
sc.pp.normalize_total(adata, target_sum=1e4)
adata.raw = adata

sc.pp.highly_variable_genes(adata,
    flavor='seurat_v3',
    n_top_genes=3000,  # 3000 HVGs selected
    layer="counts",
    subset=True)

scvi.model.SCVI.setup_anndata(adata, layer="counts")
vae = scvi.model.SCVI(adata, n_layers=4, n_latent=30)

SCVI Model with the following params:
n_hidden: 128, n_latent: 30, n_layers: 4, dropout_rate: 0.1, dispersion: gene,
gene_likelihood: zinb, latent_distribution: normal
Training status: Trained

Transfer of annotations with scANVI

adata.obs["celltype_scanvi"] = 'Unknown'
ss2_idx = adata.obs['batch'] == "1"
adata.obs.loc[ss2_idx, "celltype_scanvi"] = adata.obs.Ident2[ss2_idx]


lvae = scvi.model.SCANVI.from_scvi_model(vae, "Unknown",
                                         labels_key="celltype_scanvi")

lvae.train(max_epochs=20, n_samples_per_label=100)

ScanVI Model with the following params:
unlabeled_category: Unknown, n_hidden: 128, n_latent: 30, n_layers: 4, dropout_rate: 0.1,
dispersion: gene, gene_likelihood: zinb
Training status: Trained


Am I missing something here?
Please help!
Thank you.

It’s hard for me to diagnose without understanding what kinds of mistakes it’s making. Is it predicting random or related cell types?

Can you try training the scanvi part for longer?


Hi Adam, it's predicting related cell types. I can try training scANVI for longer and see if it improves the predictions.
Ideally, what percentage of correct predictions should I get?

The accuracy on the labeled data should be near 100%. We should be able to expose the training accuracy in the model history to make this easier to check.


Hi Adam, that would be great! Thanks.
I have tried increasing max_epochs for scANVI and it decreased the wrong predictions from 35% to 26%. I can increase it further, but something still seems to be missing: I am training scANVI with 36 label types and it now predicts only about 18 of them. I don't understand why it is omitting half the cell types.
I am specifying n_samples_per_label=100, but all reference labels have over 140 cells.
Is there a parameter I can set so that it predicts all cell types?
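A quick way to see which of the 36 types never appear in the output is to diff the label sets and look at the prediction counts. A sketch with a handful of toy labels standing in for the real 36:

```python
import pandas as pd

# Toy stand-ins: reference label column vs. predicted label column.
ref_labels  = pd.Series(["Trm-c", "CTL-c", "Tgd", "DC", "Mono"])        # 5 of the 36 types
pred_labels = pd.Series(["Trm-c", "Trm-c", "CTL-c", "CTL-c", "Trm-c"])  # predictions

# Cell types present in the reference but never predicted.
missing = set(ref_labels.unique()) - set(pred_labels.unique())
print(sorted(missing))

# How the predictions distribute over the types that do appear.
print(pred_labels.value_counts())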

I have 145,000 cells in the reference dataset with 36 immune cell types (skin CD45+ cells).
My query dataset is also skin CD45+ with 102,000 cells, and I think all 36 cell types should be present in it.

Is there a sort of p-value or any other kind of significance level for the predicted cell types?
Thanks !

Hi, passing soft=True to SCANVI's predict method returns prediction probabilities from the cell type classifier. However, these shouldn't be interpreted as p-values or significance levels, nor are they typically well calibrated (i.e. the classifier can be confident even when its prediction is wrong).
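With soft=True the result is a DataFrame of per-cell-type probabilities (one column per label, rows summing to 1). A sketch of how one might post-process it, using a hypothetical toy DataFrame in place of the real `lvae.predict(adata, soft=True)` output; the 0.8 threshold is an arbitrary illustration, not a recommendation:

```python
import pandas as pd

# Hypothetical soft-prediction output: one column per cell type, rows sum to 1.
probs = pd.DataFrame(
    {"Trm-c": [0.7, 0.1], "CTL-c": [0.2, 0.85], "Tgd": [0.1, 0.05]},
    index=["cell_1", "cell_2"],
)

hard_call  = probs.idxmax(axis=1)  # same label as the default (hard) predict
confidence = probs.max(axis=1)     # NOT a p-value, and often over-confident

# One possible heuristic: flag low-confidence calls as "Unknown".
calls = hard_call.where(confidence >= 0.8, "Unknown")
print(calls)
```

Because the probabilities are not calibrated, such a threshold should be treated as a rough filter rather than a statistical cutoff.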